
[helm] Enable metrics reporting on helm charts#2711

Open
morazow wants to merge 6 commits into apache:main from morazow:mor-2677

Conversation

Contributor

@morazow morazow commented Feb 24, 2026

Purpose

Linked issue: close #2677

Adds support for setting up the metrics on Helm chart values.

Brief change log

  • Added a metrics section to the Helm chart values
  • Updated pods and services with metrics-related configuration

Tests

Covered by Helm unit tests

API and Format

NA

Documentation

Added usage docs to the Deploying with Helm documentation

@morazow morazow force-pushed the mor-2677 branch 2 times, most recently from 1f83844 to c42ce9a Compare March 2, 2026 12:32
Contributor

@affo affo left a comment

Amazing, thank you!
A few nits here and there, all very minor.

helm/README.md Outdated
interval: "60 SECONDS"
prometheus:
  port: 9249
  annotations:
Contributor

I think this should not stay in the metrics.

We should add annotations externally, e.g.:

annotations:
  pod:
    common:
    coordinator:
    tablet:

to allow adding arbitrary annotations, and not only for the context of metrics 🤝
See Bitnami example: https://artifacthub.io/packages/helm/bitnami/zookeeper
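
To illustrate the proposal (a sketch only: the `annotations.pod.*` keys and the template path are assumptions following the layout suggested above, not part of this PR), a pod template could merge the common and per-component maps:

```yaml
# values.yaml (assumed layout)
annotations:
  pod:
    common:
      example.com/team: "data"
    coordinator:
      example.com/role: "coordinator"

# templates/coordinator-statefulset.yaml (hypothetical fragment)
#   template:
#     metadata:
#       annotations:
#         {{- with .Values.annotations.pod.common }}
#         {{- toYaml . | nindent 8 }}
#         {{- end }}
#         {{- with .Values.annotations.pod.coordinator }}
#         {{- toYaml . | nindent 8 }}
#         {{- end }}
```

This way any arbitrary annotation flows through, with component-specific entries layered on top of the common ones.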

Contributor Author

@affo thanks for the review 🙏

On the bitnami charts, they have separate annotations for metrics (under metrics.service):

metrics.service.annotations | Annotations for Prometheus to auto-discover the metrics endpoint | {}

https://github.com/bitnami/charts/blob/main/bitnami/zookeeper/values.yaml#L826-L844

But I agree, we could add common and pod annotations later on.

Contributor

They have it in Zookeeper because they have a specific additional service for metrics.
In this PR you are not adding that service, so you would not have any specific annotations for that 🤝

Contributor Author

Hello @affo, another question: at the moment, for metrics, we are adding the annotations on services, e.g., svc-coordinator.yaml.

Would it make sense to have:

annotations:
  service:

instead?

Contributor

Ah sorry, I did not notice that!
I think you are right: oftentimes, exposing metrics through a headless service is the default way to solve connectivity issues and let Prometheus use the k8s DNS, hence:

  • your design is correct, annotations make sense under metrics
  • but, you should expose the port on the service if metrics are enabled, otherwise this won't work!

Basically, we have to look at what a user would do:

Case 1: no Prometheus operator

Use annotations-based discovery.
You want to add to your service:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9249"
    prometheus.io/path: "/metrics"

The user would configure their Prometheus server to scrape those with relabel rules.
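
For reference, the relabel rules mentioned here typically look like the following (a sketch of a user-maintained `prometheus.yml`; the job name is illustrative):

```yaml
# prometheus.yml (sketch, maintained by the user, no operator involved)
scrape_configs:
  - job_name: "kubernetes-service-endpoints"
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      # keep only targets whose service is annotated prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # honor prometheus.io/path if set (default is /metrics)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # rewrite the scrape address to the annotated port
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```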

Case 2: WITH Prometheus operator

No annotation is required, but you need your service to expose the metrics port with a name and make it selectable via labels! E.g.:

...
metadata:
  labels:
    monitoring: enabled
spec:
  clusterIP: None                 # headless: DNS returns pod IPs
  selector: {}
  ports:
    - name: metrics               # IMPORTANT: ServiceMonitor references this name
      port: 9100
      targetPort: 9100

You would use the ServiceMonitor CRD like this (maintained by the user):

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: prometheus           # IMPORTANT: must match Prometheus' CR serviceMonitorSelector
spec:
  selector:
    matchLabels:
      monitoring: enabled         # matches the Service labels above
  namespaceSelector:
    matchNames: []
  endpoints:
    - port: metrics               # matches Service port name
      path: /metrics
      interval: 30s
      scrapeTimeout: 10s

Conclusion

We should allow:

  • specific service annotations for covering case 1
  • add a named port for case 2 -> ideally the user may choose the port name
  • add the possibility to add labels to the service for selection for case 2

Probably better to have a separate metrics service in order to have a clear separation of concerns 🤝
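
Putting the three bullets together, the chart values could look roughly like this (a sketch: key names such as `metrics.service.portName` follow the direction discussed here and are illustrative, not a final API):

```yaml
metrics:
  enabled: true
  reporters:
    prometheus:
      port: 9249
  service:
    portName: metrics        # case 2: named port referenced by the ServiceMonitor
    labels:
      monitoring: enabled    # case 2: labels for ServiceMonitor selection
  annotations:               # case 1: annotation-based discovery
    prometheus.io/scrape: "true"
    prometheus.io/port: "9249"
    prometheus.io/path: "/metrics"
```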

@morazow morazow requested a review from affo March 2, 2026 18:21
helm/README.md Outdated
- configure `metrics.reporter.<name>.<option>` entries from values
- create separate metrics services exposing the Prometheus reporter port
- optionally add metrics scrape annotations to metrics services from values
- optionally add custom labels to metrics services for `ServiceMonitor` selectors
Contributor

Add link to Prometheus operator for reference

helm/README.md Outdated
| `metrics.service.labels` | Optional labels rendered on metrics services (usually used by ServiceMonitor selectors). |
| `metrics.annotations` | Optional map of annotations rendered on metrics services when metrics are enabled. |

Example `values.yaml` snippet:
Contributor

Maybe better to provide the two Prometheus examples?

- `tablet-server-metrics-hs`
- Exposes the Prometheus reporter port as a named service port (`metrics.service.portName`)
- Adds optional labels from `metrics.service.labels` for ServiceMonitor selection

Contributor

Better to add the Prometheus guide here 🤝
It should explain both the scraping approach and the ServiceMonitor one.


metrics:
enabled: false
reporters:
Contributor

Why don't we apply a pattern similar to the SASL PR, where one enables reporters based on a `reporters` key?

In that case, we can set sane defaults for every reporter without making that appear 🤔

Contributor Author

You mean like below:

reporters: prometheus
   prometheus:
       port: 9249
   jmx:
       port: 9250

?

If so, this would not work, since with `reporters` one could enable both at the same time.
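
For context, a map-keyed layout (a sketch assuming each reporter is a key under `reporters`, as in the current PR) does allow enabling several reporters at once:

```yaml
metrics:
  enabled: true
  reporters:
    prometheus:
      port: 9249
    jmx:
      port: 9250
```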

morazow and others added 2 commits March 6, 2026 10:21
Co-authored-by: Lorenzo Affetti <lorenzo.affetti@gmail.com>
Co-authored-by: Lorenzo Affetti <lorenzo.affetti@gmail.com>
@morazow morazow force-pushed the mor-2677 branch 2 times, most recently from c9873db to 584c536 Compare March 6, 2026 10:04
(cherry picked from commit 82ad87e)