Skip to content

Add support for skipping specific timescaleDB schemas during cleanup#553

Merged
rmpestano merged 4 commits intodatabase-rider:masterfrom
dfitisoff-k:fix/timescaledb-internal-tables-being-cleaned
Apr 30, 2023
Merged

Add support for skipping specific timescaleDB schemas during cleanup#553
rmpestano merged 4 commits intodatabase-rider:masterfrom
dfitisoff-k:fix/timescaledb-internal-tables-being-cleaned

Conversation

@dfitisoff-k
Copy link
Copy Markdown
Contributor

Why

Currently, when using timescaledb (which is built as an extension for PostgreSQL), DbRider cleans its internal tables, which breaks subsequent tests. (see repo with reproduction https://github.com/dfitisoff-k/database-rider-timescaledb-issues)

Description

  • Added one property to dbunit config - skipSchemas which would allow to skip reading tables from user-specified schemas

@dfitisoff-k
Copy link
Copy Markdown
Contributor Author

dfitisoff-k commented Apr 25, 2023

Maybe it's not the best solution, but the current flow doesn't allow to skip tables in timescaledb's internal schemas.

As an alternative, I guess, it's possible to introduce new DBType TIMESCALEDB and resolve it by checking first the driver, and if the driver is POSTGRES, check schemas in DB also (_timescaledb_catalog for example). This way it will be possible to just add set of timescaledb internal schemas as system schemas (just like for H2 or MySQL).

@rmpestano
Copy link
Copy Markdown
Member

Hey @dfitisoff-k, thanks a lot for the PR!

As an alternative, I guess, it's possible to introduce new DBType TIMESCALEDB and resolve it by checking first the driver, and if the driver is POSTGRES, check schemas in DB also (_timescaledb_catalog for example). This way it will be possible to just add set of timescaledb internal schemas as system schemas (just like for H2 or MySQL).

If the list of tables and schemas is constant I think we should add TIMESCALEDB db type. Let's keep DBUnitConfig for dynamic things.

Also if we opt for adding a new type I think we could extract the list if reserved words and schemas from DBUnitConfig#Constants to specific classes like MYSqlConfig, TimesSaleConfig and so on but I can do the refactoring later, let's focus on fixing Timescale for now, what do you think?

Cheers!

@dfitisoff-k
Copy link
Copy Markdown
Contributor Author

dfitisoff-k commented Apr 28, 2023

@rmpestano I changed the flow a little bit to fix timescaledb (don't like the approach, it needs refactoring). However, I can't really add any test or test it locally (I guess, some problems with my setup).
I believe, the whole database type resolving approach could be refactored so the new databases can be added easily. There is a list of databases (https://wiki.postgresql.org/wiki/PostgreSQL_derived_databases) which are made on-top of postgres, so I think the same problem can occur when using database-rider with them.
We can extract one resolver which will check database driver first, and we can have registry of resolvers that will try to narrow down dbtype
Resolvers hierarchy
database-resolvers
Resolving process
database-resolving-process

@rmpestano
Copy link
Copy Markdown
Member

Thanks for the diagram @dfitisoff-k, maybe I'm overlooking so let's make sure we're in the same page, this is the piece of code that later I want to move to each, lets call it "DBConstants" for now:

 SYSTEM_SCHEMAS.put(RiderDataSource.DBType.MSSQL, Collections.singleton("SYS"));
            SYSTEM_SCHEMAS.put(RiderDataSource.DBType.H2, Collections.singleton("INFORMATION_SCHEMA"));
            final Set<String> timescaledbSchemas = new HashSet<>(Arrays.asList(
                "_timescaledb_cache",
                "_timescaledb_catalog",
                "_timescaledb_internal",
                "_timescaledb_config",
                "timescaledb_information",
                "timescaledb_experimental"
            ));
            SYSTEM_SCHEMAS.put(RiderDataSource.DBType.TIMESCALEDB, Collections.unmodifiableSet(timescaledbSchemas));

I see only a list of system schemas being DBType specific so let's say "DBConstants" would be an interface which for now would have just a getSystemSchemas method.

In DBUnitConfig we would have only a instance variable of DBConstants and maybe we could just have a factory to resolve DBConstants impl in DBUnitConfig init method, WDYT? am I missing something?

@rmpestano
Copy link
Copy Markdown
Member

By the way, thanks for the contribution, merging it now and I'll release it once we have #555

@rmpestano rmpestano merged commit 433e065 into database-rider:master Apr 30, 2023
@dfitisoff-k
Copy link
Copy Markdown
Contributor Author

In DBUnitConfig we would have only a instance variable of DBConstants and maybe we could just have a factory to resolve DBConstants impl in DBUnitConfig init method, WDYT? am I missing something?

I dunno, maybe I'm not really that deep into database-rider design, but I guess DBUnitConfig main responsibility is to hold user defined configuration. I'd rather made this DBConstants instance a part of context and resolved it during context creation.

By the way, @rmpestano is there any chat or something like that to join? I'd be glad to contribute to database-rider and would like to know is there any roadmap or so

@rmpestano rmpestano changed the title Add property allowing to skip reading tables from specified schemas Add support for skipping specific timescaleDB schemas during cleanup May 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants