
[Spark] Spark streaming rate limit#2776

Open
addu390 wants to merge 2 commits into apache:main from addu390:spark-streaming-rate-limit

Conversation

@addu390 addu390 commented Mar 2, 2026

Purpose

Linked issue: close #2550

Add rate limit support for Spark streaming reads to control the number of offsets processed per micro-batch trigger.

Brief change log

  • Added scan.max.offsets.per.trigger, scan.min.offsets.per.trigger, and scan.max.trigger.delay config options in SparkFlussConf
  • Overrode getDefaultReadLimit in FlussMicroBatchStream to return the appropriate ReadLimit for the configured options
  • Note: Offset capping uses proportional fair-share distribution across buckets, i.e. each bucket's cap is proportional to its share of the available offsets. A simpler, more typical approach (maxOffsets / numBuckets) can be used instead if that's preferred.
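The proportional fair-share capping mentioned in the last bullet can be sketched as follows. This is a hypothetical illustration, not code from the PR: the class name `FairShareCap`, the method `capOffsets`, and the use of `String` bucket keys are all assumptions for the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of proportional fair-share offset capping: each bucket
 * receives a share of maxOffsets proportional to how many unread offsets it
 * has, instead of a flat maxOffsets / numBuckets split.
 */
public class FairShareCap {

    static Map<String, Long> capOffsets(Map<String, Long> availablePerBucket, long maxOffsets) {
        long totalAvailable =
                availablePerBucket.values().stream().mapToLong(Long::longValue).sum();
        Map<String, Long> capped = new LinkedHashMap<>();
        if (totalAvailable <= maxOffsets) {
            // Everything fits within the trigger budget; no capping needed.
            capped.putAll(availablePerBucket);
            return capped;
        }
        for (Map.Entry<String, Long> e : availablePerBucket.entrySet()) {
            // Proportional share, rounded down; a real implementation would
            // also redistribute the rounding remainder among buckets.
            long share = maxOffsets * e.getValue() / totalAvailable;
            capped.put(e.getKey(), Math.min(e.getValue(), share));
        }
        return capped;
    }

    public static void main(String[] args) {
        Map<String, Long> available = new LinkedHashMap<>();
        available.put("bucket-0", 600L);
        available.put("bucket-1", 300L);
        available.put("bucket-2", 100L);
        // With maxOffsets = 500 and 1000 offsets available, each bucket is
        // capped at half of what it has.
        System.out.println(capOffsets(available, 500L));
    }
}
```

The advantage over a flat per-bucket split is that skewed buckets are not starved: a bucket with most of the backlog gets most of the budget, while a flat `maxOffsets / numBuckets` split would leave part of the budget unused on near-empty buckets.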

Tests

  • SparkStreamingTest#read: log table with maxOffsetsPerTrigger rate limit

API and Format

New user-facing config options for Spark DataFrameReader:

  • scan.max.offsets.per.trigger
  • scan.min.offsets.per.trigger
  • scan.max.trigger.delay
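As a configuration sketch, a streaming read could set these options roughly like this. Only the three option keys come from this PR; the format name `"fluss"`, the option values, and the surrounding `spark.readStream()` call shape are assumptions for illustration.

```java
// Hypothetical usage sketch (not from the PR diff):
Dataset<Row> stream = spark.readStream()
        .format("fluss")
        .option("scan.max.offsets.per.trigger", "10000") // cap offsets per micro-batch
        .option("scan.min.offsets.per.trigger", "1000")  // wait for at least this many offsets...
        .option("scan.max.trigger.delay", "5m")          // ...but fire anyway after this delay
        .load();
```

The min/delay pair mirrors the semantics of Spark's admission-control read limits: a batch is held back until the minimum is available, unless the maximum trigger delay elapses first.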

Documentation

N/A; the documentation update will be tracked separately.

addu390 changed the title from "Spark streaming rate limit" to "[Spark] Spark streaming rate limit" on Mar 2, 2026


Development

Successfully merging this pull request may close these issues.

[spark] Add rate limit for streaming read
