
feat: add Consul-based service discovery and leader election contrib extension #18843

Merged
FrankChen021 merged 7 commits into apache:master from santosh-d3vpl3x:master on Mar 30, 2026


@santosh-d3vpl3x
Contributor

@santosh-d3vpl3x santosh-d3vpl3x commented Dec 13, 2025

Fixes #18746.

Description

Add the contrib extension druid-consul-extensions, which lets Druid use Consul for service discovery and Coordinator/Overlord leader election as
an alternative to ZooKeeper. It relies on the HTTP server view and the HTTP remote indexer runner.

Consul-based discovery and leader election

  • Wire ConsulDiscoveryModule to provide Consul-backed announcer, discovery provider, and leader selector when the extension is enabled.
  • Implement ConsulDruidNodeAnnouncer, ConsulDruidNodeDiscoveryProvider, and ConsulLeaderSelector on top of ConsulApiClient /
    DefaultConsulApiClient, with ConsulServiceIds, ConsulDiscoveryConfig, and ConsulSSLConfig for configuration.
  • Add ConsulMetrics, retry/backoff handling, and clear logging around registration, watches, and leader lock lifecycle.

Tests and documentation

  • Add unit tests for config, announcer, leader selector, client security, and service ID behavior.
  • Add embedded/docker Consul discovery tests covering plain, TLS, and mTLS, including restart/failover.
  • Document the extension, configuration examples, a basic Consul dev harness, and a high-level migration path from ZooKeeper.

Release note

Add optional contrib extension druid-consul-extensions that lets Druid clusters use Consul for service discovery and
Coordinator/Overlord leader election instead of ZooKeeper, with support for ACLs, TLS/mTLS, and metrics. This requires
druid.serverview.type=http and druid.indexer.runner.type=httpRemote to be enabled cluster-wide before switching selectors to Consul.
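
The prerequisite in the release note can be checked mechanically before a rollout. The following is a minimal sketch, not part of the extension: the class name ConsulPreflightCheck and its methods are hypothetical, and only the two property names and values come from the release note. It treats an unset property as a problem and does not model Druid's defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical pre-flight check for the prerequisites named in the release note. */
final class ConsulPreflightCheck
{
  /** Returns the unmet prerequisites; an empty list means both settings are in place. */
  static List<String> findProblems(Properties props)
  {
    List<String> problems = new ArrayList<>();
    // Both settings must be enabled cluster-wide before switching selectors to Consul.
    requireEquals(props, "druid.serverview.type", "http", problems);
    requireEquals(props, "druid.indexer.runner.type", "httpRemote", problems);
    return problems;
  }

  private static void requireEquals(Properties props, String key, String expected, List<String> problems)
  {
    String actual = props.getProperty(key);
    // An unset property (null) is reported too; Druid's own defaults are not consulted here.
    if (!expected.equals(actual)) {
      problems.add(key + " must be " + expected + " but was " + actual);
    }
  }

  public static void main(String[] args)
  {
    Properties props = new Properties();
    props.setProperty("druid.serverview.type", "http");
    props.setProperty("druid.indexer.runner.type", "remote");
    // Prints one line per unmet prerequisite.
    findProblems(props).forEach(System.out::println);
  }
}
```

Running a check like this against each service's runtime properties is one way to confirm the whole cluster is ready before the switch.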


Key changed/added classes in this PR
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDiscoveryModule
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDiscoveryConfig
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDruidNodeAnnouncer
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDruidNodeDiscoveryProvider
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulLeaderSelector
  • extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulApiClient
  • embedded-tests/src/test/java/org/apache/druid/testing/embedded/docker/BaseConsulDiscoveryDockerTest and Consul Docker test subclasses

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Contributor

Copilot AI left a comment


Pull request overview

Adds a new contrib extension enabling Apache Druid to use Consul for service discovery and Coordinator/Overlord leader election as an alternative to ZooKeeper.

Changes:

  • Introduces extensions-contrib/consul-extensions with Consul-backed announcer, discovery provider, leader selector, client, metrics, and configuration.
  • Adds unit tests plus Docker-based embedded integration tests for plain HTTP, TLS, and mTLS Consul modes.
  • Updates build/distribution wiring, licensing metadata, docs, and website spelling whitelist for the new extension.

Reviewed changes

Copilot reviewed 41 out of 41 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| website/.spelling | Adds new Consul/TLS-related terms used in docs. |
| services/src/test/java/org/apache/druid/testing/utils/TLSCertificateGenerator.java | Generates test TLS/mTLS certs/keystores for embedded Consul tests. |
| services/src/test/java/org/apache/druid/testing/utils/TLSCertificateBundle.java | Wraps generated cert/keystore paths and cleanup logic for tests. |
| services/pom.xml | Adds Bouncy Castle test dependencies used by TLS generator. |
| pom.xml | Adds consul-extensions module; adds bcprov dep mgmt; updates EasyMock version. |
| licenses.yaml | Fixes module path for Kubernetes entries; adds Consul API library license entry. |
| extensions-contrib/consul-extensions/src/test/resources/tls/README.md | Notes TLS integration tests moved to embedded-tests and how to run them. |
| extensions-contrib/consul-extensions/src/test/resources/tls/.gitignore | Ignores generated TLS artifacts in test resources. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/TestUtils.java | Test builder utilities for Consul discovery config. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulServiceIdsTest.java | Unit tests for Consul service naming/id/KV key derivation. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulLeaderSelectorTest.java | Unit tests for leader selector behavior using mocked Consul client. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulDruidNodeDiscoveryProviderTest.java | Unit tests for discovery provider caching and listener notifications. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulDruidNodeAnnouncerTest.java | Unit tests for registering/deregistering nodes and retry/cleanup logic. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulDiscoveryConfigTest.java | Unit tests for config defaults, serde, validation, and masking. |
| extensions-contrib/consul-extensions/src/test/java/org/apache/druid/consul/discovery/ConsulClientsSecurityTest.java | Tests fail-fast behavior for basic auth over HTTP without TLS. |
| extensions-contrib/consul-extensions/src/main/resources/META-INF/services/org.apache.druid.initialization.DruidModule | Registers the new Druid module for extension loading. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/DefaultConsulApiClient.java | Implements Consul API interactions for registration, discovery, and watching. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulServiceIds.java | Centralizes service name/id and KV key formats. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulSSLConfig.java | Defines Consul-specific SSL client configuration. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulMetrics.java | Utility for emitting lightweight Consul-related metrics. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulLeaderSelector.java | Implements Consul-backed leader election via sessions and KV locks. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDruidNodeDiscoveryProvider.java | Implements Consul-backed DruidNodeDiscoveryProvider with blocking watches. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDruidNodeAnnouncer.java | Announces nodes to Consul and maintains TTL health checks. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDiscoveryModule.java | Wires up Guice bindings and polybind selectors for Consul mode. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulDiscoveryConfig.java | Adds structured configuration model and cross-field validation. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulClients.java | Builds ConsulClient with optional TLS and basic auth injection. |
| extensions-contrib/consul-extensions/src/main/java/org/apache/druid/consul/discovery/ConsulApiClient.java | Declares Consul API client abstraction used by components. |
| extensions-contrib/consul-extensions/pom.xml | Defines Maven module for the new extension and test deps. |
| extensions-contrib/consul-extensions/README.md | Adds extension usage, build, and test documentation. |
| extensions-contrib/consul-extensions/.gitignore | Ignores local docker/test runtime artifacts for the new module. |
| embedded-tests/src/test/resources/tls/consul-config-tls-only.json | Consul container TLS-only config for embedded tests. |
| embedded-tests/src/test/resources/tls/consul-config-mtls.json | Consul container mTLS config for embedded tests. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/docker/ConsulDiscoveryTLSDockerTest.java | TLS-mode embedded test variant. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/docker/ConsulDiscoveryPlainDockerTest.java | Plain HTTP embedded test variant. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/docker/ConsulDiscoveryMTLSDockerTest.java | mTLS-mode embedded test variant. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/docker/BaseConsulDiscoveryDockerTest.java | Shared embedded test wiring and Consul KV verification. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/consul/ConsulSecurityMode.java | Enum of embedded Consul security modes. |
| embedded-tests/src/test/java/org/apache/druid/testing/embedded/consul/ConsulClusterResource.java | Testcontainers resource to run Consul with plain/TLS/mTLS configuration. |
| embedded-tests/pom.xml | Adds consul extension as a test dependency; sets docker-test JVM args. |
| docs/development/extensions-contrib/consul.md | Adds full extension documentation, config, security, and ops guidance. |
| distribution/pom.xml | Packages the new contrib extension into distributions. |
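
The file summary describes leader election "via sessions and KV locks". As a rough illustration of that pattern, the sketch below builds (but does not send) the requests for Consul's HTTP API: create a session, try to acquire a KV key with it, and release it. This is not the PR's implementation, which goes through ConsulApiClient; the class name ConsulLockRequests, the key druid/leader/coordinator, and the session name are hypothetical, and the JDK's java.net.http types are used only to keep the example self-contained.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Builds the Consul HTTP requests behind a session-based KV lock (illustrative only). */
final class ConsulLockRequests
{
  private final URI base;

  ConsulLockRequests(String baseUrl)
  {
    this.base = URI.create(baseUrl);
  }

  /** PUT /v1/session/create: the session ID returned by Consul is what owns the lock. */
  HttpRequest createSession(String name, int ttlSeconds)
  {
    // Assumes 'name' is a plain identifier; a real client would use a JSON library.
    String body = "{\"Name\":\"" + name + "\",\"TTL\":\"" + ttlSeconds + "s\",\"Behavior\":\"release\"}";
    return HttpRequest.newBuilder(base.resolve("/v1/session/create"))
                      .PUT(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                      .build();
  }

  /** PUT /v1/kv/<key>?acquire=<session>: Consul replies "true" only to the session that wins. */
  HttpRequest acquire(String key, String sessionId, String leaderValue)
  {
    // Assumes 'key' and 'sessionId' need no URL encoding.
    return HttpRequest.newBuilder(base.resolve("/v1/kv/" + key + "?acquire=" + sessionId))
                      .PUT(HttpRequest.BodyPublishers.ofString(leaderValue, StandardCharsets.UTF_8))
                      .build();
  }

  /** PUT /v1/kv/<key>?release=<session>: gives up the lock voluntarily. */
  HttpRequest release(String key, String sessionId)
  {
    return HttpRequest.newBuilder(base.resolve("/v1/kv/" + key + "?release=" + sessionId))
                      .PUT(HttpRequest.BodyPublishers.noBody())
                      .build();
  }

  public static void main(String[] args)
  {
    ConsulLockRequests requests = new ConsulLockRequests("http://localhost:8500");
    System.out.println(requests.createSession("druid-coordinator", 30).uri());
    System.out.println(requests.acquire("druid/leader/coordinator", "session-id", "host:8081").uri());
  }
}
```

Because the lock belongs to the session, a leader that stops renewing its session loses the key once the TTL expires, and another candidate's acquire call can then succeed.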
Comments suppressed due to low confidence (3)

services/src/test/java/org/apache/druid/testing/utils/TLSCertificateGenerator.java:1

  • The helper ultimately marks all generated files world-readable (including private keys). Even for tests, this is a risky default on shared CI/dev machines. Consider applying stricter permissions for key material (e.g., 0600 or 0640) while keeping certs/truststores world-readable as needed for container mounts; alternatively, make the “relax permissions for containers” behavior opt-in.

pom.xml:1

  • EasyMock 5.4.0 is not a version I can confirm exists (as of my knowledge cutoff). Please verify it is available in Maven Central for the project’s build tooling; if not, use an existing released version (or keep 5.2.0) to avoid build resolution failures.

website/.spelling:1

  • There are duplicate entries in the spelling whitelist, which adds noise and makes future maintenance harder. Deduplicate repeated tokens (e.g., druid.discovery.consul.tlsCaCertPath, druid.discovery.consul.enableTls) to keep the list clean.
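
The permission split suggested in the first comment can be sketched with the JDK's POSIX file-attribute API. This is an illustration under assumptions, not the code in TLSCertificateGenerator: the class and method names are hypothetical, and the calls only work on file systems that support POSIX permissions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

/** Hypothetical helper: owner-only private keys, world-readable certificates. */
final class TestKeyPermissions
{
  /** Restricts key material to the owner (mode 0600). */
  static void restrictPrivateKey(Path keyFile) throws IOException
  {
    Files.setPosixFilePermissions(keyFile, PosixFilePermissions.fromString("rw-------"));
  }

  /** Keeps public material readable by everyone (mode 0644), e.g. for container mounts. */
  static void allowWorldRead(Path certFile) throws IOException
  {
    Files.setPosixFilePermissions(certFile, PosixFilePermissions.fromString("rw-r--r--"));
  }

  public static void main(String[] args) throws IOException
  {
    Path key = Files.createTempFile("server-key", ".pem");
    Path cert = Files.createTempFile("server-cert", ".pem");
    restrictPrivateKey(key);
    allowWorldRead(cert);
    // Print the resulting permission strings for both files.
    System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(key)));
    System.out.println(PosixFilePermissions.toString(Files.getPosixFilePermissions(cert)));
    Files.delete(key);
    Files.delete(cert);
  }
}
```

A container that runs as a different user would not be able to read a 0600 key, which is presumably why the comment also offers 0640 or an opt-in relaxation.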


@kfaraz
Contributor

kfaraz commented Feb 4, 2026

Thanks for creating this PR, @santosh-d3vpl3x!
I think it would be nice to have a leader election mechanism other than ZK.

By the way, have you had a chance to explore the K8s-based leader election in Druid? Maybe you could give that a shot too.

We will try to get this PR reviewed soon, thanks for your patience!

@santosh-d3vpl3x
Contributor Author

Thanks @kfaraz for getting back! I have tried it in the past. The druid cluster we are thinking to run will span multiple k8s clusters in multiple AZs, thus, I wanted to have a mechanism such as consul for doing discovery and election.

@FrankChen021
Member

> Thanks @kfaraz for getting back! I have tried it in the past. The druid cluster we are thinking to run will span multiple k8s clusters in multiple AZs, thus, I wanted to have a mechanism such as consul for doing discovery and election.

That's true. k8s-based discovery restricts a Druid cluster to a single k8s cluster.

- Fix leaderSessionTtl computation: track explicit user setting via flag
  to correctly recompute TTL from healthCheckInterval when omitted
- Change leaderMaxErrorRetries default to unlimited (Long.MAX_VALUE) for
  null/0/negative values since giving up breaks cluster operation
- Add null session guard in leader election loop with backoff to prevent
  tight retry loops when session creation fails
- Validate watchSeconds >= 1 second to prevent non-blocking query loops
  caused by Duration.getStandardSeconds() truncation
- Fix metadata size check to use UTF-8 byte length instead of char count
  for correct Consul 512-byte limit enforcement
- Add null check for announcedNodes during re-registration to handle
  concurrent unannouncement during shutdown
- Update docs to reflect unlimited retry default behavior
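
Three of the fixes listed above are small enough to sketch in isolation. The following is an illustrative sketch, not the PR's code: the class and method names are hypothetical, and java.time.Duration stands in for the Joda-Time Duration whose getStandardSeconds() the commit message mentions. The 512-byte limit and the "unlimited" default are taken from the commit message.

```java
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical helpers mirroring three of the validation fixes described above. */
final class ConsulConfigChecks
{
  static final int MAX_METADATA_BYTES = 512;

  /** Counts encoded bytes, not chars: U+00E9 is one char but two bytes in UTF-8. */
  static boolean fitsMetadataLimit(String value)
  {
    return value.getBytes(StandardCharsets.UTF_8).length <= MAX_METADATA_BYTES;
  }

  /** Rejects watch durations under one second, which truncate to 0 whole seconds. */
  static long validatedWatchSeconds(Duration watch)
  {
    long seconds = watch.getSeconds(); // whole seconds only: 500 ms becomes 0
    if (seconds < 1) {
      throw new IllegalArgumentException("watchSeconds must be at least 1 second, got " + watch);
    }
    return seconds;
  }

  /** Null, zero, and negative values all mean "retry without limit". */
  static long effectiveMaxRetries(Long configured)
  {
    return (configured == null || configured <= 0) ? Long.MAX_VALUE : configured;
  }

  public static void main(String[] args)
  {
    String accented = "\u00e9".repeat(300); // 300 chars, 600 UTF-8 bytes
    System.out.println(accented.length() + " chars fit the limit: " + fitsMetadataLimit(accented));
    System.out.println("retries for 0: " + effectiveMaxRetries(0L));
    System.out.println("watch seconds for 10s: " + validatedWatchSeconds(Duration.ofSeconds(10)));
  }
}
```

A char-count check would accept the 300-character string in main even though it encodes to 600 bytes, which is the mismatch the metadata fix addresses.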

bcpkix-jdk18on:1.81 requires bcprov-jdk18on:[1.81,1.82) per its POM.
This was missed in apache#18888, which updated bcpkix but not bcprov,
causing license check failures.
@santosh-d3vpl3x santosh-d3vpl3x changed the title from "Add Consul-based service discovery and leader election contrib extension" to "feat: add Consul-based service discovery and leader election contrib extension" on Mar 21, 2026
@santosh-d3vpl3x
Contributor Author

@FrankChen021 @kfaraz could you please take a look and help move this PR forward?

@FrankChen021 FrankChen021 merged commit 01467e1 into apache:master Mar 30, 2026
38 of 39 checks passed
@github-actions github-actions Bot added this to the 37.0.0 milestone Mar 30, 2026
Linked issue

Proposal: Add Consul-based service discovery and leader election (as a contrib extension)

5 participants