[Improvement](mysql) support create/drop database#64043
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
Found one blocking correctness issue. The remote MySQL DDL is executed, but the Doris external catalog cache is not invalidated or replay-logged, unlike existing Hive/Iceberg/Paimon create/drop database paths. This can leave SHOW DATABASES / getDbNullable stale on the leader until a manual refresh and can leave follower FEs stale because no edit-log replay hook runs.
Critical checkpoints:
- Goal/test: The PR aims to support CREATE/DROP DATABASE for MySQL JDBC catalogs and adds an external regression test, but the test manually refreshes the catalog after DDL and therefore misses the immediate-visibility/cache behavior expected by existing external DDL paths.
- Scope/focus: The implementation is small and focused, but it bypasses ExternalCatalog metadataOps post-DDL cache/replay handling.
- Concurrency/lifecycle: No new concurrency primitives; the issue is metadata cache lifecycle/invalidating initialized catalog state.
- Configuration/compatibility/protocol: No new config or FE-BE protocol/storage compatibility concerns.
- Parallel paths: Existing external metadata paths call afterCreateDb()/afterDropDb() and log create/drop DB info for replay; the new PluginDrivenExternalCatalog override does neither.
- Testing: Regression covers remote DDL after manual refresh, but not immediate visibility or stale-cache/follower replay behavior.
- Observability/performance: No additional observability issue found; performance impact is negligible.
User focus: No additional user-provided review focus was present.
|
/review |
There was a problem hiding this comment.
I found one blocking correctness issue in the latest revision. The PR goal is clear and the MySQL JDBC CREATE/DROP SQL path is small, and the regression test covers the happy path plus basic IF EXISTS/IF NOT EXISTS behavior. However, the new edit-log replay implementation for plugin-driven catalogs now performs external catalog initialization during replay, which is not safe for FE journal replay or restart.
Critical checkpoint conclusions:
- Goal/test: Adds CREATE/DROP DATABASE support for plugin-driven JDBC/MySQL catalogs; covered by a JDBC regression test, but the multi-FE replay path is not covered.
- Scope: Mostly focused, but replay behavior diverges from the existing ExternalCatalog replay contract.
- Concurrency/lifecycle: No new explicit shared mutable state, but replay lifecycle is problematic because it may initialize connector resources and rebuild remote metadata while applying journal entries.
- Configuration/compatibility: No new config or storage format; uses existing CreateDbInfo/DropDbInfo records.
- Parallel paths: Existing ExternalCatalog replay avoids remote initialization; this new plugin-driven override should preserve that behavior.
- Tests: Functional test exists for single-FE create/drop, but no failover/replay test covers the new persistence path.
- Observability/performance: Logging is adequate; replay currently does unnecessary remote IO.
- Transaction/persistence: Blocking issue below: journal replay must be local and equivalent, not dependent on external service availability.
User focus: no additional user-provided review focus was specified.
|
/review |
There was a problem hiding this comment.
I found a blocking persistence issue in the new plugin-driven external catalog DDL path. The remote CREATE/DROP succeeds before the FE edit log is written, but the edit-log failure is caught and only logged, so the client can receive success for a metadata mutation that followers and restarted FEs will never replay. This violates the EditLog fatal-error rule and can leave FEs with divergent catalog caches after failover/restart.
Critical checkpoints:
- Goal/test: the PR adds MySQL JDBC create/drop database support and a basic regression test, but the test does not cover edit-log failure/replay/failover behavior.
- Scope: the main functional change is small, but the custom edit-log handling diverges from the existing ExternalCatalog path.
- Concurrency: no new shared mutable state or locking issue found beyond cache invalidation/replay behavior already raised in existing threads.
- Lifecycle: replay no longer forces remote initialization, addressing the existing replay context; no additional lifecycle issue found.
- Configuration/compatibility: no new config or incompatible storage/protocol format found.
- Parallel paths: the existing ExternalCatalog create/drop path does not swallow log failures; plugin-driven path should follow that invariant.
- Tests: added regression covers happy path and simple errors only; no replay/failover/log-failure coverage.
- Observability: logs exist, but logging is not sufficient for durable-log failure.
- Persistence: blocking issue found; edit-log failures are swallowed.
User focus: no additional user-provided review focus.
|
/review |
There was a problem hiding this comment.
Review summary: no additional blocking issues found in this pass beyond the existing inline threads, which cover cache convergence, replay avoiding remote IO, and edit-log failure handling for plugin-driven catalog CREATE/DROP DATABASE.
Critical checkpoint conclusions:
- Goal and tests: The PR adds MySQL JDBC CREATE/DROP DATABASE support through the plugin-driven external catalog and includes an external JDBC regression case for create, duplicate create, drop, missing drop, and IF EXISTS/IF NOT EXISTS behavior. The current code accomplishes the intended path for MySQL.
- Scope and clarity: The change is focused on the connector metadata, MySQL SQL generation, catalog DDL delegation/replay, and one regression test.
- Concurrency: No new shared mutable state or lock ordering was introduced. The cache operations use existing ExternalCatalog cache invalidation paths.
- Lifecycle/replay: Current replay methods avoid initializing the connector or doing remote IO when the catalog is not initialized, matching the external catalog replay constraint.
- Configuration: No new configuration items were added.
- Compatibility: No storage format, thrift, or function symbol compatibility concern was found. Unsupported JDBC database types still fail through the existing connector exception path.
- Parallel paths: The implementation targets plugin-driven JDBC/MySQL. Other JDBC clients remain unsupported unless they override the new SQL generation hooks.
- Error handling and persistence: The current head no longer swallows edit-log failures in this file; durable-log failure is allowed to propagate.
- Test coverage: The added regression test covers the main MySQL behavior. I did not run tests in this review environment.
- Observability: Existing INFO logs include catalog and database names for the new DDL operations and are sufficient for this scope.
- Transactions/persistence/data correctness: The only persisted metadata state is the external catalog cache invalidation replay record; write and replay paths are present and local replay is side-effect-free against the remote source.
- Performance: No hot-path performance issue was identified; DDL paths perform one remote DDL call and a cache invalidation.
User focus: .opencode-review.tVlAf6/review_focus.txt says there was no additional user-provided review focus, so no extra focus-specific issue was found.
|
run buildall |
|
run external |
FE Regression Coverage ReportIncrement line coverage |
1 similar comment
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
|
run nonConcurrent |
|
run cloud_p0 |
|
run external |
TPC-H: Total hot run time: 29266 ms |
TPC-DS: Total hot run time: 169288 ms |
FE Regression Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
1 similar comment
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
Review result: no additional blocking issues found in the current head diff.
Critical checkpoint conclusions:
- Goal and tests: The PR adds JDBC/MySQL database CREATE/DROP support through the plugin-driven external catalog path, with a JDBC regression case covering create, drop, IF EXISTS, IF NOT EXISTS, and expected duplicate/missing errors. The current code appears to implement that goal.
- Scope: The change is small and focused on connector metadata, MySQL SQL generation, plugin-driven catalog DDL forwarding/cache invalidation, and one regression test.
- Concurrency/locking: I did not find new catalog/database/table lock usage or shared mutable state access beyond existing ExternalCatalog cache methods. The remote JDBC calls are not introduced under FE metadata locks in this path.
- Lifecycle/replay: Current replay methods avoid remote initialization/IO and only adjust initialized local caches, matching the no-remote-call replay requirement. This addresses the already-known replay concern and I am not re-raising it.
- Configuration/compatibility: No new config items or storage/protocol format changes are introduced. Existing CreateDbInfo/DropDbInfo journal types are reused.
- Parallel paths: Existing external metadata-ops paths already handle their catalog implementations; this PR adds the plugin-driven/JDBC path. Unsupported plugin connectors still fail via connector defaults.
- Special conditions: IF EXISTS / IF NOT EXISTS are forwarded into generated MySQL SQL. DROP force is not specially implemented, but MySQL DROP DATABASE already removes the schema contents, so I did not identify a correctness gap there.
- Test coverage: The new regression test covers the core MySQL JDBC behavior. I did not run the regression suite in this review runner.
- Persistence and failover: The current head no longer swallows edit-log failures and has write-side edit logs plus matching replay invalidation/removal. The previously raised persistence concerns are already tracked in existing threads and appear resolved in the current diff.
- Observability/performance: Added INFO logs are limited to explicit DDL operations, and no hot-path performance issue was identified.
User focus points: No additional user-provided review focus was specified.
There was a problem hiding this comment.
Summary: I found one blocking issue in the new plugin-driven JDBC DROP DATABASE path. Existing review context was considered; this is distinct from the earlier cache invalidation, replay IO, and edit-log failure comments.
Critical checkpoint conclusions:
- Goal/test: The PR adds JDBC create/drop database support and a MySQL regression test, but the implementation does not correctly handle external-catalog local-vs-remote database name resolution in non-default naming modes, and the test only covers the default unmapped path.
- Scope/focus: The code is small and focused.
- Concurrency/lifecycle: No new concurrent shared state or special lifecycle issue found beyond existing catalog initialization/replay paths already reviewed.
- Persistence/replay: Current head no longer swallows edit-log failures and replay avoids remote initialization; no new distinct persistence issue found.
- Compatibility/configuration: No new config or storage/protocol compatibility issue found.
- Parallel paths: Existing Hive/Iceberg/Paimon external drop paths resolve the local Doris database to the remote name before dropping; the new plugin-driven path does not.
- Tests: Missing coverage for
lower_case_database_names/meta_names_mappingbehavior, which is where the issue reproduces. The user provided no additional focus points, so there were no extra focus-specific issues to address.
|
/review |
There was a problem hiding this comment.
I found one remaining blocking issue in the added regression test. Existing review threads already cover the plugin-driven catalog cache/persistence/replay/remote-name concerns, so I did not duplicate those.
Critical checkpoint conclusions:
- Goal/test proof: the PR aims to support JDBC MySQL CREATE/DROP DATABASE through plugin-driven external catalogs and adds a regression test. The implementation path has targeted coverage, but the new test currently contains a cleanup statement that will fail after the catalog was already dropped.
- Scope/focus: the code changes are small and focused on JDBC schema DDL and plugin-driven external catalog bridging.
- Concurrency/lifecycle: no new concurrent state mutation beyond existing external catalog cache invalidation paths; replay avoids remote initialization in the current head.
- Config/compatibility: no new config or storage/protocol format compatibility concerns identified.
- Parallel paths: existing external catalog DDL paths were checked for analogous cache/log/drop behavior; previously raised concerns are already represented in existing threads.
- Tests: the added regression test has a deterministic functional bug in its final cleanup order and should be fixed before merge.
- Observability/performance: no additional blocking observability or performance issues found.
- User focus: no additional user-provided review focus was supplied.
|
/review |
There was a problem hiding this comment.
I found one blocking correctness issue in the new DROP DATABASE path.
Critical checkpoint conclusions:
- Goal/test: The PR aims to add JDBC MySQL CREATE/DROP DATABASE support with regression coverage. Basic create/drop is covered, but the lower_case_meta_names test currently encodes the buggy no-op behavior instead of catching it.
- Scope: The change is focused, but the PluginDrivenExternalCatalog override affects all plugin-driven catalogs and must preserve external identifier mapping semantics.
- Concurrency/lifecycle: No new shared mutable state or thread lifecycle was added; replay now avoids remote initialization, which is appropriate.
- Configuration/compatibility: No new config or storage format. Edit-log entries reuse existing CreateDbInfo/DropDbInfo and replay hooks.
- Parallel paths: Existing external DDL patterns use local-name cache invalidation plus remote-name resolution for DROP; the new path still misses local lookup normalization for lower_case_meta_names.
- Tests: The new external regression covers MySQL, but the mapped-case DROP assertion is inverted and does not prove the user-visible DROP behavior.
- Observability/performance: Added INFO logs are acceptable for DDL; no hot-path performance concern found.
User focus: No additional user-provided review focus was specified.
|
/review |
There was a problem hiding this comment.
I completed a full review of the PR files and the existing review context. I am not adding duplicate inline comments because the currently blocking findings are already covered by existing threads and still apply to this head.
Critical checkpoint conclusions:
- Goal/test: The PR aims to support CREATE/DROP DATABASE for MySQL JDBC catalogs and adds a regression test, but the current implementation/test are not yet sufficient because the existing threads identify remaining correctness and test-cleanup failures.
- Scope/focus: The change is small and focused on JDBC metadata plus PluginDrivenExternalCatalog, but it bypasses several established ExternalCatalog DDL semantics that existing comments already call out.
- Concurrency/lifecycle: No new explicit concurrent data structure was introduced, but replay/lifecycle behavior is critical here; existing replay comments correctly require journal replay to remain local and avoid remote initialization.
- Configuration: No new configuration items were added.
- Compatibility/storage format: No storage-format compatibility issue found. Edit-log compatibility is relevant and the existing persistence/replay comments cover the necessary corrections.
- Parallel paths: Existing external catalog DDL paths use ExternalMetadataOps hooks/edit-log replay; the plugin-driven path must preserve equivalent cache invalidation, remote-name resolution, and persistence semantics.
- Conditional checks: The lower-case/mapped-name drop path still needs to resolve caller input to the actual local name as already raised.
- Tests/results: A regression test is added, but an existing thread identifies a cleanup statement that can fail after dropping the catalog; another existing thread covers the lower_case_meta_names assertion behavior.
- Observability: Logging is present and no additional observability requirement was found beyond making failures propagate correctly.
- Transaction/persistence: Metadata persistence is a blocking area; existing comments about not swallowing edit-log failures and keeping replay local remain applicable.
- Data writes/modifications: Remote database create/drop has non-atomic interaction with FE metadata, so cache invalidation, local/remote name resolution, and edit-log behavior must be fixed before merge.
- FE/BE variable passing: Not applicable.
- Performance: No additional performance issue found in the small changed path.
- Other issues: No additional distinct issue found beyond the already-known inline threads.
User focus: No additional user-provided review focus was specified.
Please address the existing inline review threads before merge.
|
run buildall |
ebf3084 to
1282a8a
Compare
|
/review |
There was a problem hiding this comment.
Automated review summary for PR 64043 at 1282a8a.
I reviewed the full current diff and the existing inline review context first, and did not find additional distinct blocking issues beyond the already-known threads. The current head appears to address the previously raised concerns around plugin-driven external catalog cache invalidation, edit-log failure propagation, replay avoiding remote initialization, and local/remote database name resolution for lower-cased metadata names.
Critical checkpoint conclusions:
- Goal and tests: the PR adds MySQL JDBC external catalog CREATE/DROP DATABASE support and a JDBC regression covering create, duplicate create, drop, IF EXISTS/IF NOT EXISTS, mixed-case names, and lower_case_meta_names behavior.
- Scope: the implementation is focused on JDBC/plugin-driven catalog DDL plus the minimum connector SQL hooks needed for MySQL.
- Concurrency/lifecycle: no new explicit concurrency paths or long-held metadata locks were identified in the changed code; replay now stays local when the catalog is already initialized.
- Configuration/compatibility: no new configuration item or incompatible storage/protocol format was added.
- Parallel paths: existing external catalog metadataOps paths were compared; no additional required duplicate change was found for this MySQL JDBC path.
- Error handling/persistence: current head no longer swallows edit-log failures in the reviewed create/drop path.
- Test coverage: coverage is present for the main MySQL JDBC behavior. I did not run the regression in this Actions review environment.
- Observability: INFO logs include catalog/database context for the new DDL operations and no additional metric need was identified.
- User focus: no additional user-provided review focus was supplied.
Residual risk: this review was static only; the external MySQL regression should still be run in the JDBC-enabled regression environment before merge.
|
run buildall |
TPC-H: Total hot run time: 29439 ms |
TPC-DS: Total hot run time: 169626 ms |
|
run buildall |
TPC-H: Total hot run time: 29226 ms |
TPC-DS: Total hot run time: 169484 ms |
FE Regression Coverage ReportIncrement line coverage |
What problem does this PR solve?
mysql support create/drop database
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)