Skip to content

[client] Move limit scan code to client's table api from flink.#2794

Open
loserwang1024 wants to merge 1 commit intoapache:mainfrom
loserwang1024:fluss-table-scanner
Open

[client] Move limit scan code to client's table api from flink.#2794
loserwang1024 wants to merge 1 commit intoapache:mainfrom
loserwang1024:fluss-table-scanner

Conversation

@loserwang1024
Copy link
Contributor

Purpose

Linked issue: close #2793

Brief change log

Tests

API and Format

Documentation

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a table-level batch scanning API to the Fluss client (to avoid Flink-side bucket iteration) and updates Flink limit pushdown to use the new client capability.

Changes:

  • Introduce Scan#createBatchScanner() and implement it in TableScan by building bucket scanners and combining them via CompositeBatchScanner.
  • Update Flink PushdownUtils.limitScan to use the new table-level batch scanner API.
  • Add unit/integration tests for CompositeBatchScanner and table-level limit scans.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/utils/PushdownUtils.java Switch Flink limit scan implementation to use client table-level createBatchScanner() API.
fluss-client/src/main/java/org/apache/fluss/client/table/scanner/Scan.java Add new table-level createBatchScanner() API to the public scan interface.
fluss-client/src/main/java/org/apache/fluss/client/table/scanner/TableScan.java Implement table-level batch scanner creation by enumerating buckets/partitions and composing scanners.
fluss-client/src/main/java/org/apache/fluss/client/table/scanner/batch/CompositeBatchScanner.java New composite batch scanner to unify multiple per-bucket scanners behind one BatchScanner.
fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/CompositeBatchScannerTest.java New unit tests for composite scanner behavior (no-limit, limit, close).
fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/BatchScannerITCase.java Rename IT case and add integration test for table-level scan with limit.
Comments suppressed due to low confidence (1)

fluss-client/src/test/java/org/apache/fluss/client/table/scanner/batch/BatchScannerITCase.java:343

  • This test claims to verify “respects limit” but asserts actual.size() >= limit. From an API perspective, Scan.limit(N) should return at most N rows; allowing more will surprise callers (especially since BatchScanUtils.collectRows returns all rows from the scanner). Consider updating the implementation to cap results at limit and tighten this assertion to <= limit (and keep the existing <= 9 bound).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@loserwang1024 loserwang1024 requested a review from wuchong March 5, 2026 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add a batch scanner that can be used directly for the whole table

2 participants