Antalya 26.3: Fix condition for using parquet metadata cache#1751

Open
zvonand wants to merge 5 commits into antalya-26.3 from feature/antalya-26.3/pr-1631

@zvonand
Collaborator

@zvonand zvonand commented May 6, 2026

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix Apache Iceberg queries not hitting the parquet metadata cache (#1631 by @arthurpassos).

CI/CD Options

Exclude tests:

  • Fast test
  • Integration Tests
  • Stateless tests
  • Stateful tests
  • Performance tests
  • All with ASAN
  • All with TSAN
  • All with MSAN
  • All with UBSAN
  • All with Coverage
  • All with Aarch64
  • All Regression
  • Disable CI Cache

Regression jobs to run:

  • Fast suites (mostly <1h)
  • Aggregate Functions (2h)
  • Alter (1.5h)
  • Benchmark (30m)
  • ClickHouse Keeper (1h)
  • Iceberg (2h)
  • LDAP (1h)
  • Parquet (1.5h)
  • RBAC (1.5h)
  • SSL Server (1h)
  • S3 (2h)
  • S3 Export (2h)
  • Swarms (30m)
  • Tiered Storage (2h)

Cherry-picked from #1631.


Apache Iceberg queries were not hitting the parquet metadata cache because object_info->getFileFormat() resolves to IcebergDataObjectInfo::getFileFormat, which gets its return value from IcebergObjectSerializableInfo. This field is filled with the value from the Apache Iceberg manifest file, which is upper case by default, so it fails ClickHouse's check for parquet metadata cache usage.
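
The case-insensitive check described above can be sketched as follows. This is a minimal, standalone illustration and not the actual ClickHouse code: toLowerCopy is a hypothetical stand-in for Poco::toLower, and isParquetFormat is a hypothetical helper name introduced only for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for Poco::toLower: returns an ASCII-lowercased copy.
std::string toLowerCopy(std::string value)
{
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value;
}

// Hypothetical helper: true when the format name denotes Parquet in any letter case,
// so both the "PARQUET" value from an Iceberg manifest and "Parquet" are accepted.
bool isParquetFormat(const std::string & format)
{
    return toLowerCopy(format) == "parquet";
}
```

With this shape, the upper-case value from the manifest and the mixed-case format name from the storage configuration take the same branch, because both are normalized before the comparison.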

Documentation entry for user-facing changes

...

@zvonand zvonand added the releasy (Created/managed by RelEasy) and ai-resolved (Port conflict auto-resolved by Claude) labels May 6, 2026
@github-actions

github-actions Bot commented May 6, 2026

Workflow [PR], commit [ce406bd]

zvonand and others added 2 commits May 7, 2026 13:49
…next commit)

---
Original cherry-pick message follows:

Merge pull request #1631 from Altinity/arthurpassos-patch-11

Fix condition for using parquet metadata cache
# Conflicts:
#	src/Storages/ObjectStorage/StorageObjectStorageSource.cpp
#	tests/integration/test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py
@zvonand zvonand force-pushed the feature/antalya-26.3/pr-1631 branch from 98cefd0 to 304b298 Compare May 7, 2026 11:52
arthurpassos previously approved these changes May 7, 2026
Collaborator

@arthurpassos arthurpassos left a comment

lgtm

@Selfeer
Collaborator

Selfeer commented May 12, 2026

AI audit note: This review comment was generated by AI (Claude Opus 4.5).

Audit update for PR #1751

Antalya 26.3: Fix condition for using parquet metadata cache

Confirmed defects

High: Case comparison mismatch in parquet metadata cache guard

| Field | Description |
| --- | --- |
| Impact | Parquet metadata cache is never engaged for ANY file format (including Iceberg), negating the entire fix. Query performance for Iceberg/Parquet files remains degraded with unnecessary object storage requests. |
| Anchor | src/Storages/ObjectStorage/StorageObjectStorageSource.cpp / createReader (line 887 in PR branch) |
| Trigger | Any query reading Parquet files with use_parquet_metadata_cache=1 (default enabled) |
| Why defect | Poco::toLower(...) returns lowercase string "parquet", but the comparison is against "Parquet" (capital P), so the condition always evaluates to false. |
| Fix direction | Change comparison from == "Parquet" to == "parquet" (lowercase) to match the original PR #1631 |
| Regression test direction | The test changes S3 multiplier from 3 to 2 (expecting cache hit); with this bug the cache is never used, so tests will fail on S3 with multiplier still being 3. |

Coverage summary

| Category | Status |
| --- | --- |
| Scope reviewed | Cherry-pick of PR #1631 - single-line fix in StorageObjectStorageSource.cpp + test adjustments |
| Categories failed | String case comparison (direct code defect in cherry-pick) |
| Categories passed | Thread safety (n/a - no shared state changes), Memory safety (n/a - no lifetime changes), Exception safety (n/a - no new exception paths), Format dispatch (FormatFactory::getCreators already uses to_lower internally, so format name passed to getInput* is case-insensitive) |
| Assumptions/limits | Audit limited to the diff between antalya-26.3 base and the PR branch; did not audit unrelated changes merged into the branch (e.g., ObjectStorageListObjectsCache additions). |

@Selfeer
Collaborator

Selfeer commented May 12, 2026

PR #1751 CI Triage

PR: #1751 - Antalya 26.3: Fix condition for using parquet metadata cache
CI Report: ci_run_report.html
Date: 2026-05-12

Summary

| Category | Count | Tests |
| --- | --- | --- |
| PR-caused regression | 24 | Iceberg Azure tests, 03707_parquet_metadata_cache |
| Pre-existing flaky | ~200 | Parquet MySQL/PostgreSQL datetime, Swarms, s3_export_partition, s3_export_part |
| Infrastructure | 0 | - |

PR Changes

The PR modifies StorageObjectStorageSource.cpp to fix a case-sensitivity issue with parquet metadata cache for Iceberg queries:

// Before:
object_info->getFileFormat().value_or(configuration->getFormat()) == "Parquet"

// After:
Poco::toLower(object_info->getFileFormat().value_or(...)) == "Parquet"

The fix addresses the issue where Apache Iceberg manifest files return file format in uppercase (PARQUET), which failed the cache check.

New Fails in PR (Potentially Caused by PR Changes)

1. 03707_parquet_metadata_cache (Stateless tests)

Jobs affected:

  • Stateless tests (amd_debug, distributed plan, s3 storage, sequential)
  • Stateless tests (amd_debug, sequential)

Analysis: This test specifically validates parquet metadata cache behavior, directly related to the PR change. There may be a bug in the fix - comparing lowercase output to mixed-case "Parquet" instead of "parquet":

Poco::toLower(...) == "Parquet"  // Potentially wrong: lowercase vs mixed-case
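
To see why the suspected comparison can never succeed, the two conditions can be put side by side in a standalone sketch. The names toLowerCopy, portedCondition, and originalCondition are hypothetical and exist only for this illustration; toLowerCopy stands in for Poco::toLower, and the real expression lives in StorageObjectStorageSource.cpp.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for Poco::toLower: returns an ASCII-lowercased copy.
std::string toLowerCopy(std::string value)
{
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value;
}

// Shape of the comparison reported for the port: the left side is always
// lowercase, the right side contains a capital 'P', so this is false for
// every possible input.
bool portedCondition(const std::string & format)
{
    return toLowerCopy(format) == "Parquet";
}

// Shape of the comparison reported for the original PR #1631: both sides
// are lowercase, so any spelling of "parquet" matches.
bool originalCondition(const std::string & format)
{
    return toLowerCopy(format) == "parquet";
}
```

Under these assumptions, portedCondition rejects "PARQUET", "Parquet", and "parquet" alike, which is consistent with a cache test that expects a hit and never sees one.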

2. Iceberg Azure Integration Tests (22 failures)

Test: test_storage_iceberg_with_spark/test_read_constant_columns_optimization.py

All failing variants:

  • test_read_constant_columns_optimization[False-azure-*] (10 variants)
  • test_read_constant_columns_optimization[True-azure-*] (10 variants)
  • Plus 2 variants in amd_binary job

Analysis: The PR includes a test update acknowledging that Azure doesn't populate etag, causing cache guard to always fail:

AzureObjectStorage::getObjectMetadata does NOT populate etag, so the cache guard !etag.empty() in StorageObjectStorageSource::createReader always fails for Azure

The test assertion was updated but all Azure variants still fail, suggesting the assertion change may be incorrect or there's an underlying issue.
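
The interaction between the format check and the etag guard can be sketched as below. This is a simplified model, not the real createReader logic: ObjectMetadataSketch, canUseParquetMetadataCache, and the way the three conditions are combined are assumptions made for illustration, based only on the setting name and the !etag.empty() guard quoted in this thread.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <string>

// Hypothetical, reduced view of object metadata: only the field the guard reads.
struct ObjectMetadataSketch
{
    std::string etag;
};

// Hypothetical stand-in for Poco::toLower.
std::string toLowerCopy(std::string value)
{
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return value;
}

// Assumed shape of the guard: the setting must be on, the format must be
// Parquet (case-insensitively), and the object must carry a non-empty etag.
bool canUseParquetMetadataCache(
    bool use_parquet_metadata_cache,
    const std::string & format,
    const std::optional<ObjectMetadataSketch> & metadata)
{
    if (!use_parquet_metadata_cache)
        return false;
    if (toLowerCopy(format) != "parquet")
        return false;
    // A backend that leaves etag empty (as reported for Azure) never passes this check.
    return metadata.has_value() && !metadata->etag.empty();
}
```

In this model a backend that returns an empty etag is excluded from the cache regardless of the format comparison, so test expectations for such a backend would need to assume no cache hit.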

Pre-existing Flaky Tests (Unrelated to PR)

Parquet Regression Suite (10 failed features)

Root cause: DateTime → DateTime64(0) and Date → Date32 type inference changes in version 26.x

Example failure:

Expected: {"datetime":"2106-02-07 06:28:15","toTypeName(datetime)":"DateTime"}
Actual:   {"datetime":"2106-02-07 06:28:15","toTypeName(datetime)":"DateTime64(0)"}

Affected tests:

  • mysql_function_to_parquet_file_to_mysql_function
  • postgresql_function_to_parquet_file_to_postgresql_function

This is a version-specific behavioral change in ClickHouse 26.x, not caused by this PR.

S3 Export Partition (5 failed scenarios)

Root cause: allow_experimental_export_merge_tree_partition setting not enabled in test config

Error:

Code: 344. DB::Exception: Exporting merge tree partition is experimental. 
Set the server setting `allow_experimental_export_merge_tree_partition` to enable it (on all replicas).

Pre-existing test configuration issue.

S3 Export Part (2 failed scenarios)

Test: json_columns, json_columns_with_hints

Root cause: S3 storage doesn't support JSON column type

Error:

Code: 44. DB::Exception: Cannot create table with column of type Dynamic or JSON, 
because storage S3 doesn't support columns with dynamic structure. (ILLEGAL_COLUMN)

Pre-existing test/feature compatibility issue.

Swarms (111 failed scenarios)

Failure types:

  • ExpectTimeoutError: Timeout 600.000s - Node failure tests timing out
  • Database does not exist errors
  • Join result assertion failures

These are flaky node failure and stress tests unrelated to parquet cache changes.

Recommendations

  1. Fix potential case comparison bug in StorageObjectStorageSource.cpp:

    // Change line 887 from:
    Poco::toLower(...) == "Parquet"
    // To:
    Poco::toLower(...) == "parquet"
  2. Verify 03707_parquet_metadata_cache passes locally with the fix before re-running CI

  3. Investigate Iceberg Azure test failures - verify if the updated assertion logic is correct given Azure's lack of etag population

  4. All other failures are pre-existing and should not block merge after the cache fix is corrected

@zvonand
Collaborator Author

zvonand commented May 12, 2026

this is a forward-port of an existing PR, thus not paying attention to the defects found. the defects shall be addressed by the original PR author in a separate PR

@zvonand zvonand added the forwardport (This is a frontport of code that existed in previous Antalya versions) label May 12, 2026
@arthurpassos
Collaborator

this is a forward-port of an existing PR, thus not paying attention to the defects found. the defects shall be addressed by the original PR author in a separate PR

The original version is correct; the issue is in the port.

Original changes have getFormat())) == "parquet", while the port has getFormat())) == "Parquet". Notice the capital P. It should be all lower case.

@zvonand
Copy link
Copy Markdown
Collaborator Author

zvonand commented May 12, 2026

The original version is correct; the issue is in the port.

Can you please fix the PR then?

@arthurpassos
Collaborator

The original version is correct; the issue is in the port.

Can you please fix the PR then?

done
