HIVE-29413: Generalise column related APIs in Table.java#6413

Open
ramitg254 wants to merge 26 commits into apache:master from ramitg254:HIVE-29413

@ramitg254
Contributor

@ramitg254 ramitg254 commented Apr 7, 2026

What changes were proposed in this pull request?

Added getEffectivePartCols() and used it wherever possible to avoid code duplication.

Why are the changes needed?

getPartCols() does not support Iceberg tables.
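The difference between the two accessors can be pictured with a minimal sketch. It assumes that an Iceberg table keeps its partition columns in table-format metadata, so the metastore partition keys are empty. `SketchTable`, its fields, and the sample column `ds` are hypothetical stand-ins for illustration, not Hive's `Table` class or its real implementation.

```java
import java.util.List;

public class EffectivePartColsSketch {

    /** Hypothetical, simplified table model; this is NOT Hive's Table class. */
    static class SketchTable {
        private final List<String> metastorePartKeys;     // partition keys stored in the metastore
        private final List<String> formatPartCols;        // partition columns known to the table format
        private final boolean partitioningInTableFormat;  // assumed true for Iceberg-like tables

        SketchTable(List<String> metastorePartKeys, List<String> formatPartCols,
                    boolean partitioningInTableFormat) {
            this.metastorePartKeys = List.copyOf(metastorePartKeys);
            this.formatPartCols = List.copyOf(formatPartCols);
            this.partitioningInTableFormat = partitioningInTableFormat;
        }

        /** Original accessor, left unchanged: only sees metastore partition keys. */
        List<String> getPartCols() {
            return metastorePartKeys;
        }

        /** Separate accessor: reads from whichever source holds the partitioning. */
        List<String> getEffectivePartCols() {
            return partitioningInTableFormat ? formatPartCols : metastorePartKeys;
        }
    }

    /** A native-style table: partition keys live in the metastore. */
    static SketchTable nativeTable() {
        return new SketchTable(List.of("ds"), List.of(), false);
    }

    /** An Iceberg-like table: partition columns live in format metadata (assumption). */
    static SketchTable icebergLikeTable() {
        return new SketchTable(List.of(), List.of("ds"), true);
    }

    public static void main(String[] args) {
        for (SketchTable t : List.of(nativeTable(), icebergLikeTable())) {
            System.out.println("getPartCols()=" + t.getPartCols()
                + " getEffectivePartCols()=" + t.getEffectivePartCols());
        }
    }
}
```

In this sketch, callers that switch to `getEffectivePartCols()` get the same answer for both kinds of table, while existing callers of `getPartCols()` see no behavior change.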

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI tests and a local build.

@deniskuzZ
Member

@ramitg254 please take a look: 9e7535c. I would suggest following a similar approach.

@ramitg254
Contributor Author

ramitg254 commented Apr 10, 2026

9e7535c

But here we are creating a separate method, getEffectivePartCols(), and leaving getPartCols() as it is. As per our discussion on that closed PR, we shouldn't do that and should only go ahead with updating getPartCols().

@deniskuzZ
Member

deniskuzZ commented Apr 10, 2026

> 9e7535c
>
> but here we are creating separate method getEffectivePartCols() and leaving getPartCols() as it is, which as per our discussion on that closed pr we shouldn't do that, and only go ahead with updating getPartCols()

Where did I say that? The ask was to keep the original method unchanged. Same here.

@ramitg254
Contributor Author

ramitg254 commented Apr 10, 2026

Oh, I got confused by this comment: #6337 (comment), in which getSupportedPartCols() was just a separate method similar to getEffectivePartCols().

@ramitg254
Contributor Author

ramitg254 commented Apr 10, 2026

I am fine with that earlier approach as well, but I recently saw https://issues.apache.org/jira/browse/HIVE-29525, so I thought that, as a first step towards solving it, we should have a unified getPartCols() and getCols() that give results similar to those for native Hive tables. The plan logic can then be taken care of later, when that ticket is addressed.
So I was first focusing on making getPartCols() unified for Iceberg tables as well.

Please share your thoughts on this idea.

@ramitg254
Contributor Author

ramitg254 commented Apr 25, 2026

I was planning to, but updating getCols() alone will cause test failures in every q file that has a describe command for Iceberg tables. Query plans will also be affected, since the stats logic currently takes getCols() into account, and there are 90+ occurrences of it in the code, so it will lead to breakage as well. So I thought it would be better to take care of it as a separate change.

@deniskuzZ
Member

> I was planning to but updating getCols() will alone cause test failures for all q files whichever has describe command for iceberg tables and also query plans will itself get affected as stats logic current take this getCols() into account and there are around 90+ occurences of it in code so it will lead to breakage as well so I thought it will be better if we take care of it as a separate change

I guess that was the main intent: to integrate Iceberg partition handling into the existing code with minimal workarounds/code duplication.

Maybe I’m missing something, but, unfortunately, I don’t see much value in the current state of the PR, sorry.
It doesn’t seem to enable any missing partition optimizations (there are no q-test changes), including the one mentioned above in HIVE-29525, and instead appears to be more of a partial refactor.

Let’s see what Krisztian thinks about it.

  // TODO: make it configurable whether we want to include the table columns in the select query.
  // It might make delete writes faster if we don't have to write out the row object
- ListUtils.union(ACID_VIRTUAL_COLS_AS_FIELD_SCHEMA, table.getCols());
+ ListUtils.union(ACID_VIRTUAL_COLS_AS_FIELD_SCHEMA, table.getStorageSchemaCols());
Member

Drop this getStorageSchemaCols. Union getCols and getPartitionCols.
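A rough sketch of what that suggestion could look like, under stated assumptions: the select list is built by concatenating the virtual columns, the regular columns, and the partition columns, in that order. Commons Collections' `ListUtils.union(list1, list2)` returns a new list with the second list appended to the first; to stay dependency-free, the sketch uses a hypothetical `unionAll` helper with the same append semantics. The column names are placeholders, not Hive's actual virtual columns.

```java
import java.util.ArrayList;
import java.util.List;

public class SelectColumnsSketch {

    /**
     * Hypothetical helper: appends the given lists in order into a new list,
     * which is what chaining two-argument union calls would produce.
     */
    @SafeVarargs
    static List<String> unionAll(List<String>... parts) {
        List<String> result = new ArrayList<>();
        for (List<String> part : parts) {
            result.addAll(part);
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder names, assumed for illustration only.
        List<String> virtualCols = List.of("virtual_col_1", "virtual_col_2");
        List<String> cols = List.of("id", "name");   // stands in for getCols()
        List<String> partCols = List.of("ds");       // stands in for the partition columns

        // Virtual columns first, then regular columns, then partition columns.
        List<String> selectCols = unionAll(virtualCols, cols, partCols);
        System.out.println(selectCols);
    }
}
```

Keeping the union at the call site means no extra accessor is needed on the table class, which seems to be the point of the review comment.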

@sonarqubecloud
sonarqubecloud Bot commented Jun 1, 2026
