Skip to content

[Fix][API] Add missing OptionRule validation for CatalogFactory creation path#11127

Open
nzw921rx wants to merge 2 commits into
apache:devfrom
nzw921rx:fix/catalog-factory-option-validation
Open

[Fix][API] Add missing OptionRule validation for CatalogFactory creation path#11127
nzw921rx wants to merge 2 commits into
apache:devfrom
nzw921rx:fix/catalog-factory-option-validation

Conversation

@nzw921rx

Copy link
Copy Markdown
Collaborator

Purpose

All factory creation paths in SeaTunnel validate user config against optionRule() before instantiation — except Catalog:

Factory Type Validates optionRule() Entry Point
Source FactoryUtil.createSource()
Sink FactoryUtil.createSink()
Transform FactoryUtil.createTransform()
Catalog FactoryUtil.createOptionalCatalog()

This is a framework-level bug: all 37 CatalogFactory implementations silently skip config validation. When users provide invalid configs (e.g. missing required username/password for JDBC catalogs, or missing hosts for Elasticsearch), the error is not caught upfront — it surfaces later as a cryptic NullPointerException or connection failure deep in the catalog implementation.

This PR fixes the gap by adding ConfigValidator.validate(optionRule()) to createOptionalCatalog().

Changes

  • Add ConfigValidator.of(readonlyConfig).validate(catalogFactory.optionRule()) in FactoryUtil.createOptionalCatalog() before calling createCatalog().
  • Fix LanceCatalogFactory.optionRule() returning null → return empty OptionRule.builder().build() to prevent NPE.
  • Add 4 unit tests in CatalogTableUtilTest covering valid config, missing/partial required options, and unknown factory identifier.

How was this patch tested?

  • ./mvnw spotless:apply
  • ./mvnw -pl seatunnel-api -Dtest=CatalogTableUtilTest test
  • ./mvnw -q -DskipTests verify

@DanielLeens DanielLeens left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I reviewed the full current diff from the actual catalog creation path instead of treating this as a test-only cleanup.

What this PR fixes

  • User pain: CatalogFactory.optionRule() was declared by some catalogs, but FactoryUtil.createOptionalCatalog(...) did not validate it, so missing required catalog options could fail later and less consistently than the existing source/sink factory paths.
  • Fix approach: validate the catalog factory OptionRule at the createOptionalCatalog(...) entry point, add regression coverage for valid / missing / partial / unknown-factory cases, and make LanceCatalogFactory.optionRule() return an empty rule instead of null.
  • One-line summary: this aligns catalog creation with the existing factory validation contract, and I did not find a source-level blocker in the latest head.

Runtime path I checked

CatalogTableUtil.getCatalogTables(...)
  -> no explicit schema branch
  -> FactoryUtil.createOptionalCatalog(factoryId, readonlyConfig, classLoader, factoryId)
      -> discoverOptionalFactory(...)
      -> ConfigValidator.of(readonlyConfig).validate(catalogFactory.optionRule())
      -> catalogFactory.createCatalog(...)
  -> catalog.open()
  -> catalog.getTables(readonlyConfig)

Key findings

  1. The normal path really does hit this change. CatalogTableUtil.getCatalogTables(...) delegates into FactoryUtil.createOptionalCatalog(...) whenever the schema is not explicitly provided.
  2. This is a precise fix, not a new validation model. It brings CatalogFactory into the same ConfigValidator + OptionRule contract that source and sink factories already use.
  3. I rechecked ConfigValidator: this call uses validate(rule), not validateUnknownKeys(...), so it enforces required/value constraints without suddenly rejecting the broader ReadonlyConfig object for unrelated outer keys.
  4. Updating LanceCatalogFactory.optionRule() to OptionRule.builder().build() is the necessary companion change so the new validation path does not trip over a null rule.

Other reviewer / maintainer input

  • There were no prior non-Daniel reviews or comment threads on this PR when I reviewed it, so there was nothing to de-duplicate here.

Testing / stability

  • The added tests in CatalogTableUtilTest cover the key entry cases: valid config, missing required options, partially missing required options, and unknown factory.
  • The new tests are structurally stable: pure unit tests, no timing, ports, threads, or external services.
  • I did not run local Maven in this batch; this is a source-level PR review only.
  • GitHub Build was still queued when I reviewed.

Merge conclusion: can merge

  1. Blocking items
  • None from my side at the source level.
  1. Suggested follow-up
  • No additional source changes needed. Please just let the normal Build check finish green before merge.

Overall, this is a small and clean API-layer fix. The runtime path is real, the scope is tight, and the tests are aimed at the exact contract this PR changes.

@DanielLeens DanielLeens left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this gap. I reviewed the current diff from CatalogTableUtil / FactoryUtil.createOptionalCatalog() down into the actual CatalogFactory implementations that are affected, rather than only looking at the new tests.

What this PR solves:

  • User pain: catalog configs were the odd one out in SeaTunnel. Source, sink, and transform factories validate optionRule() before creation, but catalog factories did not, so missing required options surfaced later as deeper runtime errors instead of a clear upfront validation failure.
  • Fix approach: validate the catalog factory option rule before createCatalog(), and patch the factories that would otherwise break that new path (LanceCatalogFactory returning null, plus DuckDB using the wrong JDBC catalog rule for its real config contract).
  • In plain words: bad catalog config now fails early and clearly, instead of blowing up later in a less understandable place.

Call path I checked:

CatalogTableUtil / catalog creation helpers
  -> FactoryUtil.createOptionalCatalog(...)
      -> discoverOptionalFactory(..., CatalogFactory.class, factoryIdentifier)
      -> ConfigValidator.validate(catalogFactory.optionRule())
      -> catalogFactory.createCatalog(...)

I did not find a blocking issue in the current revision. The behavior change is user-visible, but it is the right direction and it is still backward-compatible for valid configs: only previously-invalid catalog configs now fail earlier. The Lance and DuckDB follow-up fixes are also necessary for this new validation path to be safe. From Daniel's side this can move forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants