Spark: Support aggregate pushdown for identity partition column GROUP BY #16176

Open
hemanthboyina wants to merge 3 commits into apache:main from hemanthboyina:groupby_aggregate

@hemanthboyina
Contributor

This PR enables aggregate pushdown for queries with GROUP BY on identity partition columns. Currently, Iceberg supports pushing down aggregates (COUNT, MIN, MAX) for queries without GROUP BY, computing results from file metadata instead of reading data files. However, when a query includes GROUP BY, the pushdown is disabled even when the GROUP BY columns are identity partition fields.
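
To illustrate the idea, here is a minimal sketch of grouping per-file metadata by an identity partition value and folding COUNT, MIN, and MAX per group. It is not the PR's code: `FileStats`, `Agg`, and `aggregateByPartition` are hypothetical names, the partition key is simplified to a single string, and the per-file bounds are assumed to be exact. Because an identity-partitioned file holds rows for exactly one partition value, each file contributes to exactly one group.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are Iceberg classes.
final class PartitionAggregateSketch {

  // Stand-in for the metadata recorded per data file.
  static final class FileStats {
    final String partitionValue; // identity partition value shared by all rows in the file
    final long recordCount;      // number of rows in the file
    final long minValue;         // lower bound of the aggregated column (assumed exact)
    final long maxValue;         // upper bound of the aggregated column (assumed exact)

    FileStats(String partitionValue, long recordCount, long minValue, long maxValue) {
      this.partitionValue = partitionValue;
      this.recordCount = recordCount;
      this.minValue = minValue;
      this.maxValue = maxValue;
    }
  }

  // Running COUNT(*), MIN, and MAX for one GROUP BY key.
  static final class Agg {
    long count = 0L;
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;

    void update(FileStats file) {
      count += file.recordCount;
      min = Math.min(min, file.minValue);
      max = Math.max(max, file.maxValue);
    }
  }

  // Folds file-level stats into one aggregate per partition value,
  // without reading any data rows.
  static Map<String, Agg> aggregateByPartition(List<FileStats> files) {
    Map<String, Agg> result = new LinkedHashMap<>();
    for (FileStats file : files) {
      result.computeIfAbsent(file.partitionValue, key -> new Agg()).update(file);
    }
    return result;
  }

  public static void main(String[] args) {
    List<FileStats> files = Arrays.asList(
        new FileStats("2026-04-29", 100L, 3L, 40L),
        new FileStats("2026-04-29", 50L, 1L, 25L),
        new FileStats("2026-04-30", 70L, 7L, 90L));

    for (Map.Entry<String, Agg> entry : aggregateByPartition(files).entrySet()) {
      Agg agg = entry.getValue();
      System.out.println(
          entry.getKey() + " count=" + agg.count + " min=" + agg.min + " max=" + agg.max);
    }
  }
}
```

The sketch keeps one accumulator per key in insertion order; a real implementation would also need to handle multi-column keys, null bounds, and files whose metrics are missing or truncated.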

github-actions bot added the spark label on Apr 30, 2026
singhpk234 requested a review from huaxingao on May 2, 2026 02:24
Comment on lines +216 to +217
Map<List<Object>, AggregateEvaluator> evaluatorsByPartition =
groupFilesByPartition(spec, groupByPositions, boundAggregates);
Contributor

I am not confident this is correct. Also, we are only checking the most recent partition spec, but a table can contain files written under many different partition specs that evolved across snapshots.

Contributor Author

Thanks for the review, @singhpk234. You raised a valid point: the current implementation only considers the current partition spec and bails out for files written under different specs. I will look into handling spec evolution properly and update the PR.
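
The bail-out described here can be sketched as a guard that compares every file's spec ID with the current one before attempting the pushdown. This is an illustration under stated assumptions, not the PR's implementation: `canPushDown` and the plain integer list are hypothetical stand-ins for whatever scan planning actually exposes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not Iceberg code.
final class SpecEvolutionGuardSketch {

  // Returns true only if every file was written under the current partition spec.
  // Any file from an older spec makes the caller fall back to a regular scan.
  static boolean canPushDown(int currentSpecId, List<Integer> fileSpecIds) {
    for (int specId : fileSpecIds) {
      if (specId != currentSpecId) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // All files share the current spec: pushdown is allowed.
    System.out.println(canPushDown(1, Arrays.asList(1, 1, 1)));
    // One file predates the spec change: pushdown is skipped.
    System.out.println(canPushDown(1, Arrays.asList(1, 0, 1)));
  }
}
```

A guard like this is conservative: it gives up on the whole query rather than mapping older specs onto the GROUP BY columns, which is the gap the review comment points at.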

Contributor Author

I have handled partition spec evolution. Can you please review?

2 participants