
feat(storage): implement App-centric Observability (ACO) OpenTelemetry tracing #17149

Draft

chandra-siri wants to merge 1 commit into googleapis:main from chandra-siri:feat/gcs-aco-tracing


Conversation

@chandra-siri
Contributor

Inject mandatory OpenTelemetry span attributes (gcp.resource.destination.id and gcp.resource.destination.location) into Cloud Storage bucket and object spans to support App-centric Observability (ACO). Implements a highly optimized, lockless in-memory LRU cache on the GCS Client, backed by asynchronous singleflight background fetching to prevent cache miss stampedes and eliminate latency overhead.
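The singleflight background fetching mentioned above is not shown in this excerpt. As a rough illustration of the idea only — all class and method names below are hypothetical, not the PR's actual implementation — a minimal sketch might look like:

```python
import threading


class SingleflightFetcher:
    """Deduplicate concurrent background metadata fetches per key (sketch only)."""

    def __init__(self, fetch_fn):
        self._fetch_fn = fetch_fn       # hypothetical stand-in for a buckets.get call
        self._lock = threading.Lock()
        self._in_flight = {}            # key -> Thread currently fetching it
        self.results = {}               # key -> fetched value, filled in the background

    def fetch_async(self, key):
        """Start a background fetch for key unless one is running or already done."""
        with self._lock:
            if key in self._in_flight or key in self.results:
                return  # singleflight: another caller already triggered this fetch
            t = threading.Thread(target=self._run, args=(key,), daemon=True)
            self._in_flight[key] = t
            t.start()

    def _run(self, key):
        value = self._fetch_fn(key)     # network call happens off the hot path
        with self._lock:
            self.results[key] = value
            self._in_flight.pop(key, None)

    def wait(self, key, timeout=None):
        """Block until any in-flight fetch for key has finished (helper for tests)."""
        with self._lock:
            t = self._in_flight.get(key)
        if t is not None:
            t.join(timeout)
```

The point of the pattern: concurrent cache misses for the same bucket trigger at most one metadata fetch, so a burst of spans for a cold bucket does not stampede the GCS API, and span creation never blocks on the network.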

@chandra-siri chandra-siri requested a review from a team as a code owner May 15, 2026 09:02
@chandra-siri chandra-siri force-pushed the feat/gcs-aco-tracing branch from 56e225f to cab26e6 Compare May 15, 2026 09:05
@chandra-siri chandra-siri marked this pull request as draft May 15, 2026 09:11
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an in-memory LRU cache for GCS bucket metadata to support App-centric Observability (ACO) within OpenTelemetry tracing. Key changes include the implementation of a thread-safe BucketMetadataCache with background fetching, updates to the Client class to support context management and cache lifecycle, and the re-enabling of system tests in Cloud Build. Feedback identifies a potential StopIteration error in the cache logic if max_size is zero and suggests pre-compiling regular expressions in the tracing module to improve performance.

Comment on lines +108 to +116
"""Thread-safely update or insert a cache entry with bounded size."""
with self._lock:
if bucket_name in self._cache:
self._cache.pop(bucket_name)
elif len(self._cache) >= self._max_size:
# Pop the first item (oldest inserted)
first_key = next(iter(self._cache))
self._cache.pop(first_key)
self._cache[bucket_name] = (destination_id, location)

medium

The current implementation of update_cache will raise a StopIteration exception if max_size is set to 0, as next(iter(self._cache)) will be called on an empty dictionary. While the default is 10000, it is safer to validate that max_size is at least 1 in the constructor or handle the zero-size case here.

Suggested change

```python
    """Thread-safely update or insert a cache entry with bounded size."""
    with self._lock:
        if bucket_name in self._cache:
            self._cache.pop(bucket_name)
        elif len(self._cache) >= self._max_size:
            # Pop the first item (oldest inserted)
            first_key = next(iter(self._cache))
            self._cache.pop(first_key)
        self._cache[bucket_name] = (destination_id, location)
```

```python
def update_cache(self, bucket_name, destination_id, location):
    """Thread-safely update or insert a cache entry with bounded size."""
    if self._max_size <= 0:
        return
    with self._lock:
        if bucket_name in self._cache:
            self._cache.pop(bucket_name)
        elif len(self._cache) >= self._max_size:
            # Pop the first item (oldest inserted)
            first_key = next(iter(self._cache))
            self._cache.pop(first_key)
        self._cache[bucket_name] = (destination_id, location)
```
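The suggested fix can be sanity-checked in isolation. The class below is a minimal stand-in — the constructor shape is assumed for illustration, not taken from the PR — wrapping the suggested update_cache body:

```python
import threading
from collections import OrderedDict


class BucketMetadataCache:
    """Minimal stand-in for the PR's cache (constructor shape assumed)."""

    def __init__(self, max_size=10000):
        self._max_size = max_size
        self._lock = threading.Lock()
        self._cache = OrderedDict()  # bucket_name -> (destination_id, location)

    def update_cache(self, bucket_name, destination_id, location):
        """Thread-safely update or insert a cache entry with bounded size."""
        if self._max_size <= 0:
            return  # caching disabled; avoids StopIteration on eviction
        with self._lock:
            if bucket_name in self._cache:
                self._cache.pop(bucket_name)  # re-insert below to refresh recency
            elif len(self._cache) >= self._max_size:
                first_key = next(iter(self._cache))  # oldest entry
                self._cache.pop(first_key)
            self._cache[bucket_name] = (destination_id, location)
```

With max_size=2, inserting a third bucket evicts the oldest untouched entry, and max_size=0 becomes a silent no-op instead of raising StopIteration.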

Comment on lines +80 to +91
```python
def _extract_bucket_name(api_request=None, attributes=None):
    path = None
    if api_request and "path" in api_request:
        path = api_request.get("path")
    elif attributes and "url.path" in attributes:
        path = attributes.get("url.path")

    if path:
        match = re.search(r"/b/([^/?#]+)", path)
        if match:
            return match.group(1)
    return None
```

medium

The regular expression used to extract the bucket name is compiled on every function call. Pre-compiling this regex as a module-level constant will improve performance. Additionally, ensure that null checks for optional parameters like api_request and attributes are handled within the function body to improve encapsulation.

Suggested change

```python
def _extract_bucket_name(api_request=None, attributes=None):
    path = None
    if api_request and "path" in api_request:
        path = api_request.get("path")
    elif attributes and "url.path" in attributes:
        path = attributes.get("url.path")

    if path:
        match = re.search(r"/b/([^/?#]+)", path)
        if match:
            return match.group(1)
    return None
```

```python
_BUCKET_NAME_RE = re.compile(r"/b/([^/?#]+)")


def _extract_bucket_name(api_request=None, attributes=None):
    path = None
    if api_request and "path" in api_request:
        path = api_request.get("path")
    elif attributes and "url.path" in attributes:
        path = attributes.get("url.path")

    if path:
        match = _BUCKET_NAME_RE.search(path)
        if match:
            return match.group(1)
    return None
```
References

1. When processing optional parameters with a helper function, prefer placing the None check within the function body rather than at the call site; this simplifies the calling code and improves encapsulation.

@chandra-siri chandra-siri force-pushed the feat/gcs-aco-tracing branch 3 times, most recently from 5544ee4 to d037656 Compare May 15, 2026 14:08
…y tracing and lockless bucket metadata caching
@chandra-siri chandra-siri force-pushed the feat/gcs-aco-tracing branch from d037656 to d0f4980 Compare May 15, 2026 14:21
