@somiljain2006

What is the purpose of the change?

This change prevents the creation of multiple physical Nacos NamingService connections when a single Dubbo application uses multiple registry groups pointing to the same Nacos server. It introduces a reference-counted cache that reuses a single NamingService instance per unique Nacos connection identity, ensuring connections are only closed when no registries remain.
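A minimal sketch of the reference-counted reuse described above, not the code in this PR. `Connection` and `RefCountedCache` are hypothetical names that stand in for the real Nacos and Dubbo types, and the key is assumed to be any string that identifies a unique connection.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Hypothetical stand-in for a physical connection; not the real NamingService API. */
interface Connection {
    void shutdown();
}

/** Hypothetical cache that shares one connection per key and counts its users. */
final class RefCountedCache {

    private static final class Holder {
        final Connection connection;
        int refs; // only mutated inside compute()/computeIfPresent(), which are atomic per key

        Holder(Connection connection) {
            this.connection = connection;
        }
    }

    private final ConcurrentMap<String, Holder> cache = new ConcurrentHashMap<>();

    /** Returns the shared connection for the key, creating it on first use. */
    Connection acquire(String key, Function<String, Connection> factory) {
        Holder holder = cache.compute(key, (k, existing) -> {
            Holder h = existing != null ? existing : new Holder(factory.apply(k));
            h.refs++;
            return h;
        });
        return holder.connection;
    }

    /** Drops one reference; shuts the connection down only when the last user releases it. */
    void release(String key) {
        cache.computeIfPresent(key, (k, h) -> {
            if (--h.refs > 0) {
                return h; // other users remain, keep the connection open
            }
            h.connection.shutdown();
            return null; // returning null removes the mapping
        });
    }

    int size() {
        return cache.size();
    }
}
```

In this sketch, two `acquire` calls with the same key return the same object, and the underlying connection is shut down only by the second `release`.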

Fixes #15624

Checklist

  • Make sure there is a GitHub_issue field for the change.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Write necessary unit tests to verify your logic correction. If the new feature or significant change is committed, please remember to add a sample in the dubbo samples project.
  • Make sure GitHub actions can pass. Why the workflow is failing and how to fix it?

@codecov-commenter

codecov-commenter commented Dec 21, 2025

Codecov Report

❌ Patch coverage is 71.23288% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.75%. Comparing base (f7ff60e) to head (e466f24).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...o/registry/nacos/util/NacosNamingServiceUtils.java | 73.07% | 10 Missing and 4 partials ⚠️ |
| ...org/apache/dubbo/registry/nacos/NacosRegistry.java | 70.00% | 6 Missing ⚠️ |
| ...he/dubbo/registry/nacos/NacosServiceDiscovery.java | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff            @@
##                3.3   #15891   +/-   ##
=========================================
  Coverage     60.75%   60.75%           
- Complexity    11710    11720   +10     
=========================================
  Files          1938     1938           
  Lines         88694    88744   +50     
  Branches      13387    13390    +3     
=========================================
+ Hits          53882    53917   +35     
- Misses        29288    29297    +9     
- Partials       5524     5530    +6     

| Flag | Coverage Δ |
|---|---|
| integration-tests-java21 | 32.43% <64.38%> (+0.12%) ⬆️ |
| integration-tests-java8 | 32.44% <64.38%> (+0.05%) ⬆️ |
| samples-tests-java21 | 32.06% <61.64%> (+0.05%) ⬆️ |
| samples-tests-java8 | 29.72% <61.64%> (+0.10%) ⬆️ |
| unit-tests-java11 | 59.06% <65.75%> (+0.01%) ⬆️ |
| unit-tests-java17 | 58.54% <65.75%> (+<0.01%) ⬆️ |
| unit-tests-java21 | 58.53% <65.75%> (+<0.01%) ⬆️ |
| unit-tests-java25 | 58.50% <65.75%> (+0.01%) ⬆️ |
| unit-tests-java8 | 59.04% <66.66%> (+<0.01%) ⬆️ |

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a reference-counted caching mechanism to prevent duplicate Nacos NamingService connections when multiple Dubbo registry groups point to the same Nacos server. The implementation ensures that a single physical connection is shared across registry groups with different group parameters, and connections are only closed when all references are released.

Key changes:

  • Introduced NacosNamingServiceHolder with atomic reference counting to track shared NamingService wrapper usage
  • Added URL normalization to create consistent cache keys by removing group parameters and sorting server addresses
  • Implemented releaseNamingService() method for proper connection lifecycle management with reference counting
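
The URL normalization item above can be illustrated with a short sketch. It is an assumption about the general approach, not the PR's implementation: `ConnectionKeys.cacheKey` is a hypothetical helper that works on plain strings and a parameter map rather than on Dubbo's `URL` type, and the `"group"` parameter name is used for illustration only.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical helper that derives a cache key from a connection's identity. */
final class ConnectionKeys {

    private ConnectionKeys() {}

    static String cacheKey(String serverAddresses, Map<String, String> params) {
        // Sort the addresses so "b,a" and "a,b" produce the same key.
        String sortedAddresses = Arrays.stream(serverAddresses.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .sorted()
                .collect(Collectors.joining(","));

        // Copy into a TreeMap for a stable parameter order, then drop the group,
        // because the group does not change which physical connection is used.
        TreeMap<String, String> identity = new TreeMap<>(params);
        identity.remove("group");

        String query = identity.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return sortedAddresses + "?" + query;
    }
}
```

With this sketch, registries that differ only in group or in address order map to the same key, while a different namespace still yields a different key.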

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

| File | Description |
|---|---|
| NacosNamingServiceUtils.java | Core implementation: added connection caching with reference counting, URL normalization for cache key generation, and release mechanism |
| NacosRegistry.java | Updated destroy() method to call releaseNamingService() instead of directly shutting down the wrapper; added cleanup for originToAggregateListener |
| NacosNamingServiceUtilsTest.java | Added @AfterEach cleanup to clear cache between tests; removed unnecessary throws clause from testRetryCreate(); added explicit cache clear in testRetryCreate() |

@RainYuY
Member

RainYuY commented Dec 22, 2025

Hmm. I’m wondering: if we just remove the call to addParameter(CONFIG_NAMESPACE_KEY) from org.apache.dubbo.registry.nacos.NacosRegistryFactory#createRegistryCacheKey, would we get the same result?

@somiljain2006
Author

somiljain2006 commented Dec 22, 2025

No, that would cause different namespaces to collapse into the same cache key. Also, relying solely on RegistryFactory caching is unsafe when multiple services share a Registry instance; one service calling destroy() would physically kill the connection for everyone else. We need the reference counting to manage that lifecycle.

@RainYuY
Member

RainYuY commented Dec 22, 2025

No, that would cause different namespaces to collapse into the same cache key. Also, relying solely on RegistryFactory caching is unsafe when multiple services share a Registry instance; one service calling destroy() would physically kill the connection for everyone else. We need the reference counting to manage that lifecycle.

Regarding the first point, I don’t know what problems sharing a cache key would cause. In fact, I think the same cache key should yield the correct result, though maybe there is some reason behind the special logic. However, I don’t think the second scenario exists: as far as I know, we only destroy a registry when destroying all components, and I can’t think of a scenario that requires destroying a single Nacos group. Moreover, the logic should be that one Nacos server corresponds to one registry instance.

@RainYuY
Member

RainYuY commented Dec 22, 2025

I’m not saying I’m right, but I think we need to find out why Nacos has this special group-related logic and which scenarios would trigger the destroy logic.

@somiljain2006
Author

That makes sense. I agree we should understand the intent behind Nacos’s group-related logic and the lifecycle assumptions more clearly.

From what I’ve seen, Nacos treats groups as a naming-level concern, while the NamingService itself is bound to a specific server + namespace. That’s why the Nacos API exposes a group on registration/subscription methods rather than requiring a separate NamingService per group.
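
To illustrate the shape of that API, here is a self-contained sketch. `InMemoryNaming` is a hypothetical in-memory stand-in, not the Nacos client: it is bound to one server and namespace at construction time and takes the group as a per-call argument, loosely mirroring the client's `registerInstance(serviceName, groupName, ip, port)` overload. The `"@@"` separator in the internal key is an illustrative choice.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for one connection bound to a server and namespace. */
final class InMemoryNaming {
    final String serverAddr;
    final String namespace;

    // Instances indexed by "group@@service"; the group lives in the key, not in the connection.
    private final Map<String, List<String>> instances = new HashMap<>();

    InMemoryNaming(String serverAddr, String namespace) {
        this.serverAddr = serverAddr;
        this.namespace = namespace;
    }

    /** The group is passed on each call, so one object can serve many groups. */
    void registerInstance(String serviceName, String groupName, String ip, int port) {
        instances.computeIfAbsent(groupName + "@@" + serviceName, k -> new ArrayList<>())
                .add(ip + ":" + port);
    }

    List<String> getAllInstances(String serviceName, String groupName) {
        return instances.getOrDefault(groupName + "@@" + serviceName, Collections.emptyList());
    }
}
```

Under these assumptions, a single object registers the same service under two groups and keeps the results separate, which is why a separate connection per group is not needed.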

On the destroy side, the primary scenario this change guards against is the shutdown race condition. When multiple registry wrappers share a single NamingService, they are destroyed in sequence during application shutdown. Without reference counting, the first registry to close would call shutdown() on the shared connection, instantly killing it for the registries that have not been destroyed yet.

