[Fix][API] Add missing OptionRule validation for CatalogFactory creation path #11127
nzw921rx wants to merge 2 commits into
DanielLeens
left a comment
Thanks for the contribution. I reviewed the full current diff from the actual catalog creation path instead of treating this as a test-only cleanup.
What this PR fixes
- User pain: `CatalogFactory.optionRule()` was declared by some catalogs, but `FactoryUtil.createOptionalCatalog(...)` did not validate it, so missing required catalog options could fail later and less consistently than the existing source/sink factory paths.
- Fix approach: validate the catalog factory `OptionRule` at the `createOptionalCatalog(...)` entry point, add regression coverage for valid / missing / partial / unknown-factory cases, and make `LanceCatalogFactory.optionRule()` return an empty rule instead of `null`.
- One-line summary: this aligns catalog creation with the existing factory validation contract, and I did not find a source-level blocker in the latest head.
Runtime path I checked
CatalogTableUtil.getCatalogTables(...)
-> no explicit schema branch
-> FactoryUtil.createOptionalCatalog(factoryId, readonlyConfig, classLoader, factoryId)
-> discoverOptionalFactory(...)
-> ConfigValidator.of(readonlyConfig).validate(catalogFactory.optionRule())
-> catalogFactory.createCatalog(...)
-> catalog.open()
-> catalog.getTables(readonlyConfig)
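To make the ordering in that path concrete, here is a minimal sketch of "discover, validate, then create". Everything in it is a hypothetical stand-in written for illustration (`CatalogCreationSketch`, `SketchFactory`, the string used as a "catalog"); it does not use SeaTunnel's real `FactoryUtil`, `ConfigValidator`, or `OptionRule` classes, and it assumes an unknown factory identifier yields an empty result rather than an exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative stand-ins only; none of these types are SeaTunnel classes. */
final class CatalogCreationSketch {

    /** Hypothetical factory: an identifier plus the option keys it requires. */
    static final class SketchFactory {
        final String identifier;
        final List<String> requiredOptions;

        SketchFactory(String identifier, List<String> requiredOptions) {
            this.identifier = identifier;
            this.requiredOptions = requiredOptions;
        }

        /** Stands in for catalog construction; returns a label instead of a real catalog. */
        String createCatalog(Map<String, String> config) {
            return "catalog:" + identifier;
        }
    }

    /** Fails fast when a required key is absent; keys the rule does not mention are ignored. */
    static void validate(Map<String, String> config, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String key : required) {
            if (!config.containsKey(key)) {
                missing.add(key);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing required options: " + missing);
        }
    }

    /** Mirrors the order in the call path: discover the factory, validate, then create. */
    static Optional<String> createOptionalCatalog(
            List<SketchFactory> factories, String factoryId, Map<String, String> config) {
        Optional<SketchFactory> factory = factories.stream()
                .filter(f -> f.identifier.equals(factoryId))
                .findFirst();
        if (!factory.isPresent()) {
            // Assumption: an unknown factory means "no catalog", so nothing is validated.
            return Optional.empty();
        }
        validate(config, factory.get().requiredOptions);
        return Optional.of(factory.get().createCatalog(config));
    }
}
```

The point of the sketch is only the position of the `validate` call: it runs after discovery succeeds and before any catalog object exists, so a missing option is reported before `createCatalog` is reached.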
Key findings
- The normal path really does hit this change. `CatalogTableUtil.getCatalogTables(...)` delegates into `FactoryUtil.createOptionalCatalog(...)` whenever the schema is not explicitly provided.
- This is a precise fix, not a new validation model. It brings `CatalogFactory` into the same `ConfigValidator + OptionRule` contract that source and sink factories already use.
- I rechecked `ConfigValidator`: this call uses `validate(rule)`, not `validateUnknownKeys(...)`, so it enforces required/value constraints without suddenly rejecting the broader `ReadonlyConfig` object for unrelated outer keys.
- Updating `LanceCatalogFactory.optionRule()` to `OptionRule.builder().build()` is the necessary companion change so the new validation path does not trip over a null rule.
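The last two findings can be illustrated with a small sketch of why an empty rule is safe where a `null` rule is not, and why a required-keys check leaves unrelated keys alone. The `Rule` type, its builder, and `validate` below are hypothetical stand-ins invented for this example; they only imitate the shape of `OptionRule.builder().build()` and are not SeaTunnel's implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Illustrative stand-ins only; not SeaTunnel's OptionRule or ConfigValidator. */
final class EmptyRuleSketch {

    /** Hypothetical option rule: nothing more than a list of required keys. */
    static final class Rule {
        final List<String> required;

        private Rule(List<String> required) {
            this.required = required;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            private final List<String> required = new ArrayList<>();

            Builder required(String key) {
                required.add(key);
                return this;
            }

            Rule build() {
                // Copy so later builder calls cannot change an already-built rule.
                return new Rule(Collections.unmodifiableList(new ArrayList<>(required)));
            }
        }
    }

    /** Checks required keys only; keys the rule does not mention are never rejected. */
    static void validate(Map<String, String> config, Rule rule) {
        // A null rule fails on the next line with NullPointerException,
        // which is what an empty rule avoids.
        for (String key : rule.required) {
            if (!config.containsKey(key)) {
                throw new IllegalArgumentException("Missing required option: " + key);
            }
        }
    }
}
```

Under these assumptions, `validate(config, Rule.builder().build())` accepts any config, including one with unrelated keys, while `validate(config, null)` throws before any option is checked.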
Other reviewer / maintainer input
- There were no prior non-Daniel reviews or comment threads on this PR when I reviewed it, so there was nothing to de-duplicate here.
Testing / stability
- The added tests in `CatalogTableUtilTest` cover the key entry cases: valid config, missing required options, partially missing required options, and unknown factory.
- The new tests are structurally stable: pure unit tests, no timing, ports, threads, or external services.
- I did not run local Maven in this batch; this is a source-level PR review only.
- GitHub `Build` was still queued when I reviewed.
Merge conclusion: can merge
- Blocking items
  - None from my side at the source level.
- Suggested follow-up
  - No additional source changes needed. Please just let the normal `Build` check finish green before merge.
Overall, this is a small and clean API-layer fix. The runtime path is real, the scope is tight, and the tests are aimed at the exact contract this PR changes.
DanielLeens
left a comment
Thanks for fixing this gap. I reviewed the current diff from CatalogTableUtil / FactoryUtil.createOptionalCatalog() down into the actual CatalogFactory implementations that are affected, rather than only looking at the new tests.
What this PR solves:
- User pain: catalog configs were the odd one out in SeaTunnel. Source, sink, and transform factories validate `optionRule()` before creation, but catalog factories did not, so missing required options surfaced later as deeper runtime errors instead of a clear upfront validation failure.
- Fix approach: validate the catalog factory option rule before `createCatalog()`, and patch the factories that would otherwise break that new path (`LanceCatalogFactory` returning `null`, plus DuckDB using the wrong JDBC catalog rule for its real config contract).
- In plain words: bad catalog config now fails early and clearly, instead of blowing up later in a less understandable place.
Call path I checked:
CatalogTableUtil / catalog creation helpers
-> FactoryUtil.createOptionalCatalog(...)
-> discoverOptionalFactory(..., CatalogFactory.class, factoryIdentifier)
-> ConfigValidator.validate(catalogFactory.optionRule())
-> catalogFactory.createCatalog(...)
I did not find a blocking issue in the current revision. The behavior change is user-visible, but it is the right direction and it is still backward-compatible for valid configs: only previously-invalid catalog configs now fail earlier. The Lance and DuckDB follow-up fixes are also necessary for this new validation path to be safe. From Daniel's side this can move forward.
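The compatibility claim above (valid configs behave the same, invalid ones fail earlier and more clearly) can be sketched as follows. The class, method names, and the `username` option are hypothetical and chosen only for illustration; `openCatalog` stands in for whatever a real catalog does deep inside its implementation.

```java
import java.util.Map;

/** Illustrative contrast only; not taken from any SeaTunnel catalog. */
final class FailEarlySketch {

    /** Stand-in for code deep in a catalog that assumes "username" is present. */
    static String openCatalog(Map<String, String> config) {
        // Throws NullPointerException when "username" is missing.
        return "connected as " + config.get("username").trim();
    }

    /** Old behavior in this sketch: no upfront check, so bad config fails deep inside. */
    static String createWithoutValidation(Map<String, String> config) {
        return openCatalog(config);
    }

    /** New behavior in this sketch: required options are checked before creation. */
    static String createWithValidation(Map<String, String> config) {
        if (!config.containsKey("username")) {
            throw new IllegalArgumentException("Missing required option: username");
        }
        return openCatalog(config);
    }
}
```

For a config that contains `username`, both methods return the same value. For one that does not, the unvalidated path ends in a `NullPointerException` from inside `openCatalog`, while the validated path reports the missing option by name.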
Purpose
All factory creation paths in SeaTunnel validate user config against `optionRule()` before instantiation, except Catalog:
- `FactoryUtil.createSource()`: validates `optionRule()`
- `FactoryUtil.createSink()`: validates `optionRule()`
- `FactoryUtil.createTransform()`: validates `optionRule()`
- `FactoryUtil.createOptionalCatalog()`: does not validate `optionRule()`
This is a framework-level bug: all 37 `CatalogFactory` implementations silently skip config validation. When users provide invalid configs (e.g. missing required `username`/`password` for JDBC catalogs, or missing `hosts` for Elasticsearch), the error is not caught upfront; it surfaces later as a cryptic `NullPointerException` or connection failure deep in the catalog implementation.
This PR fixes the gap by adding `ConfigValidator.validate(optionRule())` to `createOptionalCatalog()`.
Changes
- Add `ConfigValidator.of(readonlyConfig).validate(catalogFactory.optionRule())` in `FactoryUtil.createOptionalCatalog()` before calling `createCatalog()`.
- Fix `LanceCatalogFactory.optionRule()` returning `null`: return an empty `OptionRule.builder().build()` instead to prevent an NPE.
- Add tests in `CatalogTableUtilTest` covering valid config, missing/partial required options, and unknown factory identifier.
How was this patch tested?
- `./mvnw spotless:apply`
- `./mvnw -pl seatunnel-api -Dtest=CatalogTableUtilTest test`
- `./mvnw -q -DskipTests verify`