Skip to content

AWS: Add MetricPublisher configuration support for S3FileIO (#15182)#16235

Open
yadavay-amzn wants to merge 2 commits intoapache:mainfrom
yadavay-amzn:fix/15182-metrics-publisher
Open

AWS: Add MetricPublisher configuration support for S3FileIO (#15182)#16235
yadavay-amzn wants to merge 2 commits intoapache:mainfrom
yadavay-amzn:fix/15182-metrics-publisher

Conversation

@yadavay-amzn
Copy link
Copy Markdown
Contributor

@yadavay-amzn yadavay-amzn commented May 6, 2026

Adds support for configuring a custom AWS SDK MetricPublisher for S3FileIO via the s3.metrics-publisher-impl catalog property. This enables observability into S3 SDK-level metrics (latency, retries, errors) for debugging and monitoring.

This PR builds on the design and initial implementation from #15122 by @rcjverhoef (PR stale-closed). The review feedback from @geruh on that PR — apply to the S3 async client, clean up fully-qualified imports, and add a test for invalid/missing classes — is addressed here.

Commits in this PR

  1. AWS: Add MetricPublisher configuration support for S3FileIO — co-authored with @rcjverhoef. Adds METRICS_PUBLISHER_IMPL property, the applyMetricsPublisherConfiguration method, wires it into both the sync s3() and async s3Async() paths, adds happy-path tests and docs.
  2. AWS: Improve MetricPublisher factory error reporting — splits factory lookup from factory invocation so the error message correctly identifies whether create(Map) was missing or present-but-threw. Adds regression tests for both throw paths.

Changes

  • S3FileIOProperties: Added METRICS_PUBLISHER_IMPL property, applyMetricsPublisherConfiguration() method that loads the publisher via a static create(Map) factory or, as a fallback, a no-arg constructor.
  • DefaultS3FileIOAwsClientFactory: Applied the metrics publisher configuration to both sync and async client builders.
  • Tests: 6 tests covering factory-method, no-arg constructor, disabled, invalid-class, factory-throws, and no-arg-constructor-throws paths.
  • Docs: Added a section in aws.md.

Usage

spark.sql.catalog.my_catalog.s3.metrics-publisher-impl=com.example.MyMetricPublisher

The class must implement software.amazon.awssdk.metrics.MetricPublisher and provide either a static create(Map<String, String>) factory method or a no-arg constructor.

Closes #15182

Anupam Yadav and others added 2 commits May 6, 2026 22:13
…5182)

Adds support for configuring a custom AWS SDK MetricPublisher for
S3FileIO via the s3.metrics-publisher-impl catalog property. This
enables observability into S3 SDK-level metrics (latency, retries,
errors) for debugging and monitoring.

- S3FileIOProperties: adds METRICS_PUBLISHER_IMPL property and the
  applyMetricsPublisherConfiguration() method. Loads the publisher
  via a static create(Map) factory when present, or a no-arg
  constructor otherwise.
- DefaultS3FileIOAwsClientFactory: applies the metrics publisher
  configuration to both the sync s3() and async s3Async() client
  builders.
- Tests cover factory-method, no-arg constructor, disabled, and
  invalid-class paths.

This PR builds on the design and initial implementation from apache#15122
by @rcjverhoef (PR stale-closed). Review feedback from @geruh on
that PR (apply to S3 async client, clean imports, invalid-class
test) is addressed here.

Closes apache#15182

Co-authored-by: rocco.verhoef <rcjverhoef@gmail.com>
Split the create(Map) factory lookup from its invocation so the error
message correctly identifies whether the factory method was missing
or present-but-threw. Previously, a create(Map) method that existed
and threw would be misreported as 'Cannot create MetricPublisher
from class X' with the factory exception as the cause, giving users
the false impression they needed to change the method signature when
the actual issue was a configuration error inside the factory.

The new error message states whether the failure occurred via
create(Map) or via the no-arg constructor, so users can diagnose the
right path without misreading the stack trace.

Add regression tests asserting the distinction (via create(Map) vs
via no-arg constructor) so the behaviour cannot silently regress.
@yadavay-amzn yadavay-amzn force-pushed the fix/15182-metrics-publisher branch from fdd699f to d5a9856 Compare May 6, 2026 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support configuring custom MetricPublisher for S3FileIO

1 participant