Implement metrics evaluators that work directly with ContentStats #16218

@anoopj

Description

Feature Request / Improvement

Problem

Follow-up from PR feedback.

The content file adapters convert ContentStats to v3-style bounds. Each call allocates a new map and, for bounds, serializes every field value into a fresh ByteBuffer.

This is acceptable when a single evaluator calls each method once per file. But when multiple evaluators run against the same file (e.g. InclusiveMetricsEvaluator + StrictMetricsEvaluator), each one triggers its own conversion, so the maps are rebuilt and discarded repeatedly.
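To make the cost concrete, here is a minimal sketch of the pattern described above. It does not use the real Iceberg classes: `FieldStats`, `ContentStats`, and `lowerBounds` are hypothetical stand-ins, the bounds are assumed to be `long` values, and records require Java 16+. The counters only make the repeated work visible.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AdapterAllocationSketch {
  // Hypothetical stand-ins, not the real Iceberg types.
  record FieldStats(int fieldId, long lowerBound, long upperBound) {}

  record ContentStats(List<FieldStats> fields) {}

  // Counters that expose how much work each call repeats.
  static int mapsBuilt = 0;
  static int buffersAllocated = 0;

  // Mimics an adapter method that converts stats to a v3-style bounds map on every call.
  static Map<Integer, ByteBuffer> lowerBounds(ContentStats stats) {
    mapsBuilt++;
    Map<Integer, ByteBuffer> bounds = new HashMap<>();
    for (FieldStats f : stats.fields()) {
      // Every field value is serialized into a fresh buffer.
      ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).putLong(0, f.lowerBound());
      buffersAllocated++;
      bounds.put(f.fieldId(), buf);
    }
    return bounds;
  }

  public static void main(String[] args) {
    ContentStats stats =
        new ContentStats(List.of(new FieldStats(1, 10L, 20L), new FieldStats(2, 5L, 7L)));
    // Two evaluators each ask for the lower bounds of the same file.
    lowerBounds(stats);
    lowerBounds(stats);
    System.out.println("maps built: " + mapsBuilt + ", buffers allocated: " + buffersAllocated);
  }
}
```

With two fields and two callers, the sketch builds two maps and four buffers for a single file; the real adapters would repeat this for each bounds/counts method an evaluator touches.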

Fix

Add metrics evaluators that work directly with ContentStats / FieldStats rather than going through the ContentFile bounds/counts maps. This avoids the intermediate map allocation and per-field serialization.
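As a rough illustration of the proposed direction, the sketch below has two checks read typed bounds straight from a stats object, with no intermediate map and no ByteBuffer round trip. Everything here is an assumption for illustration: the record shapes, the `field` lookup, and the method names `mightContainEqual` and `allRowsLessThan` are hypothetical and do not reflect the actual ContentStats / FieldStats API. Bounds are assumed to be `long` values, and records require Java 16+.

```java
import java.util.List;

final class DirectStatsEvaluatorSketch {
  // Hypothetical typed per-field stats; the real FieldStats API may differ.
  record FieldStats(int fieldId, long lowerBound, long upperBound, long nullCount) {}

  // Hypothetical container; lookup is a linear scan to keep the sketch small.
  record ContentStats(List<FieldStats> fields) {
    FieldStats field(int fieldId) {
      for (FieldStats f : fields) {
        if (f.fieldId() == fieldId) {
          return f;
        }
      }
      return null;
    }
  }

  // Inclusive check for `field == value`: true if the file might contain a matching row.
  static boolean mightContainEqual(ContentStats stats, int fieldId, long value) {
    FieldStats f = stats.field(fieldId);
    if (f == null) {
      return true; // no stats, so the file cannot be ruled out
    }
    return f.lowerBound() <= value && value <= f.upperBound();
  }

  // Strict check for `field < value`: true only if every row in the file must match.
  static boolean allRowsLessThan(ContentStats stats, int fieldId, long value) {
    FieldStats f = stats.field(fieldId);
    if (f == null) {
      return false; // no stats, so a full match cannot be proven
    }
    return f.nullCount() == 0 && f.upperBound() < value;
  }

  public static void main(String[] args) {
    ContentStats stats = new ContentStats(List.of(new FieldStats(1, 10L, 20L, 0L)));
    // Both checks share the same stats object; neither allocates or serializes anything.
    System.out.println(mightContainEqual(stats, 1, 15L));
    System.out.println(allRowsLessThan(stats, 1, 21L));
  }
}
```

Both checks share one stats object per file, so adding a second evaluator costs only the comparisons themselves.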

Once ContentStats-aware evaluators are available, ensure they are used in all v4 code paths (scan planning, manifest filtering, etc.) instead of falling back to the adapter's map-based methods.

Query engine

None

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time

Metadata

Assignees

No one assigned

    Labels

    improvement (PR that improves existing functionality)

    Projects

    Status

    Backlog

    Milestone

    No milestone
