feat: add gRPC ToolQuery service for observability tools integration #3230

Merged: michaeljguarino merged 12 commits into master from sebastian/prod-4431-workbench-tool-grpc-interface on Feb 20, 2026

@floreks floreks commented Feb 18, 2026

This pull request introduces significant enhancements to the observability and configuration flexibility of the CloudQuery service, most notably by adding a new ToolQuery gRPC API for querying external observability tools (metrics, logs, traces), and by making the database-backed CloudQuery service optional via a new configuration flag. It also updates Helm chart templates and workflows to improve deployment and versioning. Below are the most important changes:

1. Observability: New ToolQuery gRPC API

  • Added the ToolQuery gRPC service for querying external observability tools (Prometheus, Loki, Tempo, Datadog, Elastic) with support for metrics, logs, and traces. This includes new proto definitions (toolquery.proto, timestamp.proto) and detailed API documentation with usage examples.

2. CloudQuery Service Configuration

  • Introduced a --database-enabled flag (and corresponding Helm value) to optionally disable the PostgreSQL-backed CloudQuery service. When disabled, only the ToolQuery service is registered and database health checks are skipped. Documentation updated to explain this behavior.
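
A minimal sketch of how such a flag could gate service registration, using only the standard `flag` package. The `config` type, `parseArgs`, `servicesToRegister`, and the default value of `true` are assumptions made for illustration; they are not taken from the PR's `main.go`.

```go
package main

import "flag"

// config holds the parsed options for this sketch (hypothetical type).
type config struct {
	databaseEnabled bool
}

// parseArgs parses args with a dedicated FlagSet so it can be called
// more than once, for example from tests.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("cloud-query", flag.ContinueOnError)
	// Assumption: the flag defaults to true so existing deployments keep the database.
	enabled := fs.Bool("database-enabled", true, "register the PostgreSQL-backed CloudQuery service")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return config{databaseEnabled: *enabled}, nil
}

// servicesToRegister lists the services a server would register for cfg.
// ToolQuery is always present; CloudQuery only when the database is enabled.
func servicesToRegister(cfg config) []string {
	services := []string{"ToolQuery"}
	if cfg.databaseEnabled {
		services = append(services, "CloudQuery")
	}
	return services
}
```

Calling `parseArgs([]string{"--database-enabled=false"})` and passing the result to `servicesToRegister` yields only `ToolQuery`, which mirrors the behavior described above.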

3. Helm Chart Improvements

  • Updated the cloud-query deployment template to conditionally include the database container and relevant environment variables based on the .Values.cloudQuery.database.enabled flag. The secret template is also now conditional on this flag.

4. CI/CD and Versioning Enhancements

  • Improved the OCI chart publishing workflow to include the workflow file itself as a trigger, and to use versioned tags (e.g., 0.0.0+pr-<number>, 0.0.0+sha-<sha>, ${{ steps.chart_version.outputs.version }}+latest) for PR and release builds.

Test Plan

Tested locally and against https://console.plrl-dev-aws.onplural.sh.

Checklist

  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have added a meaningful title and summary to convey the impact of this PR to a user.

Plural Flow: console

- Introduced ToolQuery protobuf definitions for metrics, logs, and traces queries.
- Implemented ToolQueryService to handle gRPC requests for external observability tools.
- Added new dependencies for DataDog, ElasticSearch, and Prometheus integrations.
- Updated command-line arguments and documentation for PostgreSQL-backed features.
@floreks floreks self-assigned this Feb 18, 2026
@floreks floreks marked this pull request as draft February 18, 2026 11:22
@floreks floreks added the enhancement New feature or request label Feb 18, 2026
socket-security Bot commented Feb 18, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|---|---|---|---|---|---|---|
| Added | resty.dev/v3@v3.0.0-beta.6 | 71 | 100 | 100 | 100 | 100 |
| Added | github.com/DataDog/datadog-api-client-go/v2@v2.54.0 | 73 | 100 | 100 | 100 | 100 |
| Added | github.com/elastic/go-elasticsearch/v9@v9.3.1 | 76 | 100 | 100 | 100 | 60 |

socket-security Bot commented Feb 18, 2026

Warning

Review the following alerts detected in dependencies.

According to your organization's Security Policy, it is recommended to resolve "Warn" alerts. Learn more about Socket for GitHub.

Action: Warn
Severity: High
Alert: Obfuscated code: golang github.com/elastic/go-elasticsearch/v9 is 98.0% likely obfuscated

Confidence: 0.98

Location: Package overview

From: go/cloud-query/go.mod → golang/github.com/elastic/go-elasticsearch/v9@v9.3.1

Next steps: Take a moment to review the security alert above. Review the linked package source code to understand the potential risk. Ensure the package is not malicious before proceeding. If you're unsure how to proceed, reach out to your security team or ask the Socket team for help at support@socket.dev.

Suggestion: Packages should not obfuscate their code. Consider not using packages with obfuscated code.

Mark the package as acceptable risk. To ignore this alert only in this pull request, reply with the comment @SocketSecurity ignore golang/github.com/elastic/go-elasticsearch/v9@v9.3.1. You can also ignore all packages with @SocketSecurity ignore-all. To ignore an alert for all future pull requests, use Socket's Dashboard to change the triage state of this alert.

- Included the workflow file path in PR triggers.
- Modified the version tagging format to follow semver conventions for PRs and commits.
- Ensured tags are consistent for `latest` releases.
greptile-apps Bot commented Feb 18, 2026

Greptile Summary

This PR adds a new gRPC ToolQuery service for querying external observability tools (Prometheus, Loki, Tempo, Datadog, Elasticsearch) and makes the database-backed CloudQuery service optional via configuration.

Key changes:

  • New ToolQuery gRPC API with support for metrics, logs, and traces queries across multiple providers
  • All previously reported issues have been fixed: Datadog authentication context is properly returned and used, Loki timestamps use UnixNano(), traces filter uses correct GetEnd(), and Elasticsearch errors are propagated
  • --database-enabled flag allows running only the ToolQuery service without PostgreSQL dependency
  • Helm chart conditionally deploys database container based on cloudQuery.database.enabled flag
  • CI workflow improved with self-triggering and versioned PR tags

Minor observation:

  • LokiConnection in the protobuf skips field number 2, jumping from 1 to 3. While valid, consider adding reserved 2; if this was intentional to prevent future reuse.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • All critical bugs from previous review rounds have been fixed. The implementation follows best practices with proper error handling, input validation, and context management. The database-optional architecture is cleanly implemented with appropriate feature flags.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| go/cloud-query/internal/tools/provider_datadog.go | New Datadog provider implementation with proper context handling for authentication and correct time range configuration for all query types |
| go/cloud-query/internal/tools/provider_loki.go | Loki provider correctly uses UnixNano() for timestamp conversion to nanosecond-precision Unix timestamps |
| go/cloud-query/internal/tools/provider.go | Provider factory correctly propagates Elasticsearch initialization errors to callers |
| go/cloud-query/api/proto/toolquery.proto | Clean protobuf definitions for ToolQuery gRPC service with minor field numbering gap in LokiConnection (field 2 skipped) |
| go/cloud-query/cmd/main.go | Added conditional service registration based on database-enabled flag, ToolQuery service always registered |
| go/cloud-query/internal/service/toolquery.go | ToolQuery gRPC service with proper input validation and error mapping |
| charts/console/templates/cloud-query/deployment.yaml | Deployment template conditionally includes database container and environment variables based on database.enabled flag |

Last reviewed commit: 6d0bd4b

@greptile-apps greptile-apps Bot left a comment

25 files reviewed, 5 comments

Comment thread go/cloud-query/internal/tools/provider_datadog.go Outdated
Comment thread go/cloud-query/internal/tools/provider_loki.go Outdated
Comment thread go/cloud-query/internal/tools/provider_datadog.go Outdated
Comment thread go/cloud-query/internal/tools/provider.go Outdated
Comment thread go/cloud-query/api/proto/toolquery.proto
- Changed version tags for PRs from `0.0.0-pr.<number>` to `0.0.0+pr-<number>`.
- Adjusted SHA-based tags in the same format for consistency.
- Deleted `helpers.go` to remove redundant utilities for time parsing and URL manipulation.
- Enhanced loki provider by using UnixNano for logs interval.
- Improved provider initializations with error handling.
- Adjusted Datadog client creation to pass context alongside API client.
- Simplified step duration parsing in Prometheus provider.
- Updated proto for adjusted field numbers in connections.
- Removed unnecessary time range parsing in Tempo provider.
- Introduced `GetPluralEnvBool` for boolean environment variable management.
- Updated command-line argument `database-enabled` to utilize the new environment handling function.
- Simplified Kubernetes deployment YAML by improving indentation consistency.
- Introduced `validateInput` and `validateTimeRange` functions to ensure inputs are not nil and have valid values.
- Enhanced validation for connection, query, and time range parameters in `Metrics`, `Logs`, and `Traces` methods.
- Improved handling of tags in `provider_datadog.go` by deriving labels from series scope when missing.
…uest size handling

- Removed redundant limit determination logic in Elasticsearch provider.
- Simplified query string query with more concise construction.
- Adjusted request size setup to follow query construction.
- Added `prometheus.go` for Prometheus HTTP client creation with authentication support.
- Introduced `elastic.go` for transforming Elasticsearch source data into log entries.
- Updated `provider_elastic.go` to integrate `ElasticSource` for improved data handling.
- Refactored `provider_prometheus.go` to utilize new Prometheus HTTP client and removed `resty` dependency.
… integration

- Moved `TempoTraceResponse` struct to `datasource` package for improved modularity.
- Updated `TempoClient` to use `datasource.TempoTraceResponse`.
- Refactored `provider_tempo.go` to leverage new `TraceResponse` handling with reduced complexity.
- Removed `OTLPBatch` struct in favor of using `OTLPResourceSpans` directly.
- Streamlined `TempoTraceResponse` construction by appending `Batches` to `ResourceSpans`.
- Simplified data structure by removing redundant batch processing logic.
- Changed `site` to optional and `appKey` to required in `DatadogConnection` definitions.
- Updated API and client connection details.
- Enhanced README and reference documentation with compatibility matrix and detailed client and endpoint information for ToolQuery integrations.
- Refined command-line arguments and environment variables documentation for Cloud-Query configuration.
@floreks floreks marked this pull request as ready for review February 20, 2026 16:03
floreks commented Feb 20, 2026

@greptileai

@greptile-apps greptile-apps Bot left a comment

30 files reviewed, 5 comments

Comment thread go/cloud-query/internal/tools/provider_datadog.go
Comment thread go/cloud-query/internal/tools/provider_loki.go
Comment thread go/cloud-query/internal/tools/provider.go
Comment thread go/cloud-query/internal/tools/provider_datadog.go
Comment thread go/cloud-query/api/proto/toolquery.proto
floreks commented Feb 20, 2026

@greptileai

@greptile-apps greptile-apps Bot left a comment

30 files reviewed, 1 comment

Comment thread go/cloud-query/api/proto/toolquery.proto
repeated LogEntry logs = 1;
}

message TraceSpan {
I don't feel like this is intuitively the right data structure. What does Tempo use (or OpenTelemetry)?

@michaeljguarino michaeljguarino merged commit 2971383 into master Feb 20, 2026
28 checks passed
@michaeljguarino michaeljguarino deleted the sebastian/prod-4431-workbench-tool-grpc-interface branch February 20, 2026 21:15