[fix](TIMESTAMPTZ)Prevents timezone-unsafe reparsing using getStringValue()#64026
[fix](TIMESTAMPTZ)Prevents timezone-unsafe reparsing using getStringValue()#64026starocean999 wants to merge 7 commits into
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
There was a problem hiding this comment.
I found one blocking data-correctness issue in the partition literal refactor.
Critical checkpoint conclusions:
- Goal/test: The PR aims to make partition literals typed and timezone-safe, with unit and regression coverage for TIMESTAMPTZ boundaries. The TIMESTAMPTZ-focused paths are covered, but decimal/float partition ordering is not covered.
- Scope/focus: The change is mostly focused, but removing
PartitionValue.getValue(type)changes a broad partition-key invariant. - Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, config, storage-format, or FE-BE protocol issues found.
- Parallel paths: Both range/list key creation paths were changed consistently, but they now share the same typing bug.
- Tests: Missing coverage for DECIMAL/DOUBLE partition values with different inferred literal scales/types.
- Data correctness: Blocking issue below can make partition comparisons non-antisymmetric, which can corrupt range validation/pruning decisions.
- Observability/performance/other: No additional blocking issue found.
User focus: No additional user-provided review focus was specified.
|
/review |
There was a problem hiding this comment.
Automated review summary:
I found two blocking correctness issues in the partition conversion changes. The TIMESTAMPTZ multi/step partition path now derives startString/endString from the typed legacy literal, whose string form includes fractional seconds and +00:00; the existing multi-partition date-format probing only accepts compact/date/datetime strings up to length 19, so valid TIMESTAMPTZ step partition DDL can fail before partition generation. Separately, Nereids LESS THAN translation now truncates extra boundary values with Math.min(...), which can turn an invalid range partition clause into a different valid catalog descriptor instead of preserving the values for catalog cardinality validation.
Critical checkpoint conclusions:
- Goal/test coverage: the PR generally addresses typed partition literal handling and TIMESTAMPTZ timezone-safe boundaries with added unit/regression tests, but the coverage does not include TIMESTAMPTZ step/multi partitions or invalid over-wide LESS THAN bounds.
- Scope/focus: the change is mostly focused, but removing late string parsing exposed several call paths that still depend on original partition literal text/arity.
- Concurrency/lifecycle/config/persistence: no new concurrency, lifecycle, dynamic config, or EditLog persistence concerns were identified in the reviewed paths.
- Compatibility/parallel paths: both legacy/catalog and Nereids partition construction paths are affected; the two comments identify parallel paths where the new typed conversion changes behavior.
- Error handling/data correctness: the TIMESTAMPTZ step partition path rejects valid DDL, and the LESS THAN path can silently accept a malformed boundary with changed semantics.
- Performance/observability: no material performance or observability issues found.
User focus: no additional user-provided review focus was supplied.
|
/review |
|
run buildall |
There was a problem hiding this comment.
I found additional correctness issues in separate partition translation paths. Existing threads already cover the PartitionKey literal coercion problem, TIMESTAMPTZ multi-partition formatting, and LESS THAN extra-boundary truncation/validation, so I did not duplicate those.
Critical checkpoint conclusions:
- Goal/test: the PR aims to preserve typed partition literals, especially TIMESTAMPTZ boundaries, and adds regression/unit coverage, but malformed partition cardinality cases are not consistently covered.
- Scope/focus: the approach is mostly focused, but the new typed casting was applied before validation in several entry points.
- Concurrency/lifecycle/config/compatibility: no new concurrency, special lifecycle, config, persistence format, or FE-BE protocol compatibility concern was found in the reviewed paths.
- Parallel paths: range LESS THAN received an explicit guard, but LIST, STEP, and ALTER multi-partition paths still have the same pre-validation cast hazard.
- Error handling: malformed DDL can now throw index/cast failures before the existing partition descriptor validation produces a controlled user-facing error.
- Tests: good TIMESTAMPTZ coverage was added, but negative cardinality tests for LIST/STEP/ALTER multi partitions are missing.
- Observability/performance: no additional observability need or material performance concern was found.
User focus: no additional user-provided review focus.
FE UT Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29121 ms |
TPC-DS: Total hot run time: 169494 ms |
FE Regression Coverage ReportIncrement line coverage |
|
/review |
|
run buildall |
TPC-H: Total hot run time: 29626 ms |
TPC-DS: Total hot run time: 169029 ms |
There was a problem hiding this comment.
I found one build/style issue in the added FE test. The functional changes otherwise look consistent with the stated goal, and the previously known review threads were not duplicated.
Critical checkpoint conclusions:
- Goal/test proof: The PR canonicalizes/typed-parses partition literals, especially TIMESTAMPTZ partition boundaries, and adds targeted unit/regression coverage. One added test has a Checkstyle indentation issue.
- Scope/focus: The changes are mostly focused on partition literal typing and related tests.
- Concurrency/lifecycle: No new concurrent state, locking, or special lifecycle management found in the reviewed paths.
- Configuration/compatibility: No new configs. I did not find a new FE-BE protocol or storage-format compatibility concern beyond the already-known review context.
- Parallel paths: Nereids CREATE/ALTER partition paths, MTMV partition rollup, and external partition cache conversions were checked. No additional distinct issue found.
- Validation/error handling: Earlier cardinality/cast hazards are already covered by existing threads and appear addressed in the current checkout; no duplicate comments submitted.
- Tests: Regression and unit coverage was added for TIMESTAMPTZ partition boundary behavior and malformed partition inputs, but the added Java test needs indentation fixed so FE style checks can pass.
- Observability/performance/transactions: No new observability requirement, hot-path performance issue, or transaction/persistence issue found.
User focus: No additional user-provided review focus was specified.
| alterMultiPartitionOp::getPartitionKeyDesc); | ||
| Assertions.assertTrue(exception.getMessage().contains("Invalid number format: 20.1")); | ||
| } | ||
|
|
There was a problem hiding this comment.
This method declaration is indented one level deeper than the surrounding test methods, which should trip FE Checkstyle's indentation rule. Please align it with the @Test annotation.
| public void testStepPartitionUsesCanonicalTimestampTzTextForMultiPartitionTranslation() throws Exception { |
|
run buildall |
TPC-H: Total hot run time: 29332 ms |
TPC-DS: Total hot run time: 168961 ms |
FE Regression Coverage ReportIncrement line coverage |
|
run buildall |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29183 ms |
TPC-DS: Total hot run time: 169983 ms |
|
run buildall |
TPC-H: Total hot run time: 28865 ms |
TPC-DS: Total hot run time: 168974 ms |
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
Summary: Refactors partition literal handling to store typed LiteralExpr (removing string/long PartitionValue ctors), tightens literal comparison semantics, ensures Nereids partition expressions are cast to partition-column types before legacy conversion, fixes timezone-aware TIMESTAMPTZ microsecond rounding overflow, and resolves related build/checkstyle/test failures introduced during the refactor.
Partition API changes: PartitionValue now only accepts typed LiteralExpr (string/long constructors removed). Call sites updated to use LiteralExprUtils.createLiteral(...) or NullLiteral.create(type) for null/default partition values. This reduces late-parsing bugs and enforces consistent types.
Nereids → catalog conversion: Nereids partition nodes (e.g., InPartition, StepPartition, LessThanPartition, AlterMultiPartitionOp) now cast expressions to the table partition column types before converting to PartitionValue/legacy formats. This fixes partition-boundary/regression issues.
TIMESTAMPTZ fix: Prevents timezone-unsafe reparsing during microsecond rounding. DateTimeLiteral overflow/rounding now advances using internal date/time fields (not getStringValue() reparse), and TimestampTzLiteral behavior is kept timezone-correct. Added unit coverage (focused test) for the boundary case '9999-12-31 23:59:59.999999+08:00'.
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)