From 63a7b653e3dc9260f48a48a8b43e7c9d1b8c3544 Mon Sep 17 00:00:00 2001 From: paulj Date: Mon, 2 Feb 2026 11:43:53 -0500 Subject: [PATCH] docs(explore): Add practical "what to do next" guides for logs, metrics, and traces Add four new documentation pages that bridge the gap between SDK setup and getting value from Sentry's observability features: - what-to-log.mdx: High-value logging patterns with search and alert examples - what-to-track.mdx: Metric patterns for counters, gauges, and distributions - querying-traces.mdx: How to query auto-instrumented trace data - custom-spans.mdx: When and how to add custom instrumentation Each guide follows a consistent "5 things to add" structure with code examples, query patterns, and alert recommendations. Co-Authored-By: Claude --- docs/product/explore/logs/what-to-log.mdx | 205 ++++++++++++++++++ .../product/explore/metrics/what-to-track.mdx | 154 +++++++++++++ .../explore/trace-explorer/custom-spans.mdx | 170 +++++++++++++++ .../trace-explorer/querying-traces.mdx | 96 ++++++++ 4 files changed, 625 insertions(+) create mode 100644 docs/product/explore/logs/what-to-log.mdx create mode 100644 docs/product/explore/metrics/what-to-track.mdx create mode 100644 docs/product/explore/trace-explorer/custom-spans.mdx create mode 100644 docs/product/explore/trace-explorer/querying-traces.mdx diff --git a/docs/product/explore/logs/what-to-log.mdx b/docs/product/explore/logs/what-to-log.mdx new file mode 100644 index 0000000000000..ee5af404cca49 --- /dev/null +++ b/docs/product/explore/logs/what-to-log.mdx @@ -0,0 +1,205 @@ +--- +title: "What to Log" +sidebar_order: 5 +description: "Practical guidance on what to log, how to search logs, and when to set alerts." +--- + +You've set up Sentry Logs. Now what? This guide covers the high-value logging patterns that help you debug faster and catch problems before users report them. 
+
+## The Pattern
+
+Every structured log follows the same format, where `<level>` is one of the log levels below:
+
+```javascript
+Sentry.logger.<level>(message, { attributes });
+```
+
+**Levels:** `trace`, `debug`, `info`, `warn`, `error`, `fatal`
+
+**Attributes:** Key-value pairs you can search and filter on. Use whatever naming convention fits your codebase—consistency matters more than specific names.
+
+```javascript
+Sentry.logger.info("Order completed", {
+  orderId: "order_123",
+  userId: user.id,
+  amount: 149.99,
+  paymentMethod: "stripe"
+});
+```
+
+Every log is automatically trace-connected. Click any log entry to see the full trace, spans, and errors from that moment.
+
+## Where to Add Logs
+
+These five categories give you the most debugging value per line of code.
+
+### 1. Authentication Events
+
+Login flows are invisible until something breaks. Log successes and failures to spot patterns—brute force attempts, OAuth misconfigurations, or MFA issues.
+
+```javascript
+Sentry.logger.info("User logged in", {
+  userId: user.id,
+  authMethod: "oauth",
+  provider: "google"
+});
+
+Sentry.logger.warn("Login failed", {
+  email: maskedEmail,
+  reason: "invalid_password",
+  attemptCount: 3
+});
+```
+
+**Search:** `userId:123 "logged in"` or `severity:warn authMethod:*`
+
+**Alert idea:** `severity:warn "Login failed"` exceeding your baseline in 5 minutes can indicate brute force or auth provider issues.
+
+### 2. Payment and Checkout
+
+Money paths need visibility even when they succeed. When payments fail, you need context fast.
+
+```javascript
+Sentry.logger.error("Payment failed", {
+  orderId: "order_123",
+  amount: 99.99,
+  gateway: "stripe",
+  errorCode: "card_declined",
+  cartItems: 3
+});
+```
+
+**Search:** `orderId:order_123` or `severity:error gateway:stripe`
+
+**Alert idea:** `severity:error gateway:*` spiking can indicate payment provider outages.
+
+### 3. External APIs and Async Operations
+
+Traces capture what your code does.
Logs capture context about external triggers and async boundaries—webhooks, scheduled tasks, third-party API responses—that traces can't automatically instrument. + +```javascript +// Third-party API call +const start = Date.now(); +const response = await shippingApi.getRates(items); + +Sentry.logger.info("Shipping rates fetched", { + service: "shipping-provider", + endpoint: "/rates", + durationMs: Date.now() - start, + rateCount: response.rates.length +}); + +// Webhook received +Sentry.logger.info("Webhook received", { + source: "stripe", + eventType: "payment_intent.succeeded", + paymentId: event.data.object.id +}); +``` + +**Search:** `service:shipping-provider durationMs:>2000` or `source:stripe` + +**Alert idea:** `service:* durationMs:>3000` can catch third-party slowdowns before they cascade. + +### 4. Background Jobs + +Jobs run outside request context. Without logs, failed jobs are invisible until someone notices missing data. + +```javascript +Sentry.logger.info("Job started", { + jobType: "email-digest", + jobId: "job_456", + queue: "notifications" +}); + +Sentry.logger.error("Job failed", { + jobType: "email-digest", + jobId: "job_456", + retryCount: 3, + lastError: "SMTP timeout" +}); +``` + +**Search:** `jobType:email-digest severity:error` + +**Alert idea:** `severity:error jobType:*` spiking can indicate queue processing issues or downstream failures. + +### 5. Feature Flags and Config Changes + +When something breaks after a deploy, the first question is "what changed?" Logging flag evaluations and config reloads gives you that answer instantly. + +```javascript +Sentry.logger.info("Feature flag evaluated", { + flag: "new-checkout-flow", + enabled: true, + userId: user.id +}); + +Sentry.logger.warn("Config reloaded", { + reason: "env-change", + changedKeys: ["API_TIMEOUT", "MAX_CONNECTIONS"] +}); +``` + +**Search:** `flag:new-checkout-flow` or `"Config reloaded"` + +## Creating Alerts From Logs + +1. Go to **Explore > Logs** +2. 
Enter your search query (e.g., `severity:error gateway:*`) +3. Click **Save As** → **Alert** +4. Choose a threshold type: + - **Static:** Alert when count exceeds a value + - **Percent Change:** Alert when count changes relative to a previous period + - **Anomaly:** Let Sentry detect unusual patterns +5. Configure notification channels and save + +## Production Logging Strategy + +Local debugging often means many small logs tracing execution flow. In production, this creates noise that's hard to query. + +Instead, log fewer messages with higher cardinality. Store events during execution and emit them as a single structured log. + +**Don't do this:** + +```javascript +Sentry.logger.info("Checkout started", { userId: "882" }); +Sentry.logger.info("Discount applied", { code: "WINTER20" }); +Sentry.logger.error("Payment failed", { reason: "Insufficient Funds" }); +``` + +These logs are trace-connected, but searching for the error won't return the userId or discount code from the same transaction. + +**Do this instead:** + +```javascript +Sentry.logger.error("Checkout failed", { + userId: "882", + orderId: "order_pc_991", + cartTotal: 142.50, + discountCode: "WINTER20", + paymentMethod: "stripe", + errorReason: "Insufficient Funds", + itemCount: 4 +}); +``` + +One log tells the whole story. Search for the error and get full context. + +## Log Drains for Platform Logs + +If you can't install the Sentry SDK or need platform-level logs (CDN, database, load balancer), use [Log Drains](/product/drains/). 
+ +**Platform drains:** Vercel, Cloudflare Workers, Heroku, Supabase + +**Forwarders:** OpenTelemetry Collector, Vector, Fluent Bit, AWS CloudWatch, Kafka + +## Quick Reference + +| Category | Level | Key Attributes | +|----------|-------|----------------| +| Auth events | `info`/`warn` | userId, authMethod, reason | +| Payments | `info`/`error` | orderId, amount, gateway, errorCode | +| External APIs | `info` | service, endpoint, durationMs | +| Background jobs | `info`/`error` | jobType, jobId, retryCount | +| Feature flags | `info` | flag, enabled, changedKeys | diff --git a/docs/product/explore/metrics/what-to-track.mdx b/docs/product/explore/metrics/what-to-track.mdx new file mode 100644 index 0000000000000..e1b9a49c69820 --- /dev/null +++ b/docs/product/explore/metrics/what-to-track.mdx @@ -0,0 +1,154 @@ +--- +title: "What to Track" +sidebar_order: 5 +description: "Practical guidance on what metrics to track and how to explore them in Sentry." +--- + +You've set up Sentry Metrics. Now what? This guide covers the high-value metric patterns that give you visibility into application health—and how to drill into traces when something looks off. + +## The Pattern + +Sentry supports three metric types: + +| Type | Method | Use For | +|------|--------|---------| +| **Counter** | `Sentry.metrics.count()` | Events that happen (orders, clicks, errors) | +| **Gauge** | `Sentry.metrics.gauge()` | Current state (queue depth, connections) | +| **Distribution** | `Sentry.metrics.distribution()` | Values that vary (latency, sizes, amounts) | + +Every metric is trace-connected. When a metric spikes, click into samples to see the exact trace that produced it. + +```javascript +Sentry.metrics.count("checkout.failed", 1, { + attributes: { + user_tier: "premium", + failure_reason: "payment_declined" + } +}); +``` + +## Where to Add Metrics + +These five categories give you the most visibility per line of code. + +### 1. 
Business Events (Counters) + +Track discrete events that matter to the business. These become your KPIs. + +```javascript +Sentry.metrics.count("checkout.completed", 1, { + attributes: { user_tier: "premium", payment_method: "card" } +}); + +Sentry.metrics.count("checkout.failed", 1, { + attributes: { user_tier: "premium", failure_reason: "payment_declined" } +}); +``` + +**How to explore:** +1. Go to **Explore > Metrics** +2. Select `checkout.failed`, set **Agg** to `sum` +3. **Group by** `failure_reason` +4. Click **Samples** to see individual events and their traces + +### 2. Application Health (Counters) + +Track success and failure of critical operations. + +```javascript +Sentry.metrics.count("email.sent", 1, { + attributes: { email_type: "welcome", provider: "sendgrid" } +}); + +Sentry.metrics.count("email.failed", 1, { + attributes: { email_type: "welcome", error: "rate_limited" } +}); + +Sentry.metrics.count("job.processed", 1, { + attributes: { job_type: "invoice-generation", queue: "billing" } +}); +``` + +**Explore:** Add both `email.sent` and `email.failed`, group by `email_type`, compare the ratio. + +### 3. Resource Utilization (Gauges) + +Track current state of pools, queues, and connections. Call these periodically (e.g., every 30 seconds). + +```javascript +Sentry.metrics.gauge("queue.depth", await queue.size(), { + attributes: { queue_name: "notifications" } +}); + +Sentry.metrics.gauge("pool.connections_active", pool.activeConnections, { + attributes: { pool_name: "postgres-primary" } +}); +``` + +**Explore:** View `max(queue.depth)` over time to spot backlogs. + +### 4. Latency and Performance (Distributions) + +Track values that vary and need percentile analysis. Averages hide outliers—use p90/p95/p99. 
+ +```javascript +Sentry.metrics.distribution("api.latency", responseTimeMs, { + unit: "millisecond", + attributes: { endpoint: "/api/orders", method: "POST" } +}); + +Sentry.metrics.distribution("db.query_time", queryDurationMs, { + unit: "millisecond", + attributes: { table: "orders", operation: "select" } +}); +``` + +**Explore:** View `p95(api.latency)` grouped by `endpoint` to find slow routes. + +### 5. Business Values (Distributions) + +Track amounts, sizes, and quantities for analysis. + +```javascript +Sentry.metrics.distribution("order.amount", order.totalUsd, { + unit: "usd", + attributes: { user_tier: "premium", region: "us-west" } +}); + +Sentry.metrics.distribution("upload.size", fileSizeBytes, { + unit: "byte", + attributes: { file_type: "image", source: "profile-update" } +}); +``` + +**Explore:** View `avg(order.amount)` grouped by `region` to compare regional performance. + +## The Debugging Flow + +When something looks off in metrics, here's how to find the cause: + +``` +Metric spike → Samples tab → Click a sample → Full trace → Related logs/errors → Root cause +``` + +This is the advantage of trace-connected metrics. Instead of "metric alert → guesswork," you get direct links to exactly what happened. + +## When to Use Metrics vs Traces vs Logs + +| Signal | Best For | Example Question | +|--------|----------|------------------| +| **Metrics** | Aggregated counts, rates, percentiles | "How many checkouts failed this hour?" | +| **Traces** | Request flow, latency breakdown | "Why was this specific request slow?" | +| **Logs** | Detailed context, debugging | "What happened right before this error?" | + +All three are trace-connected. Start wherever makes sense and navigate to the others. 
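The advice above to prefer p90/p95/p99 over averages is easy to verify. Here's a quick SDK-free sketch; the latency values are made up for illustration:

```javascript
// Hypothetical latency samples in milliseconds: mostly fast, two slow outliers.
const latencies = [80, 85, 88, 90, 90, 92, 95, 100, 2800, 3000];

const avg = latencies.reduce((sum, v) => sum + v, 0) / latencies.length;

// Nearest-rank percentile: sort ascending, take the value at ceil(p * n) - 1.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted[Math.ceil(p * sorted.length) - 1];
}

console.log(`avg: ${avg}ms`);                         // avg: 652ms
console.log(`p95: ${percentile(latencies, 0.95)}ms`); // p95: 3000ms
```

An average of 652ms looks tolerable; the p95 of 3000ms reveals the tail your slowest users actually hit. That tail is what `p95(...)` aggregations in Explore surface and averages hide.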
+ +## Quick Reference + +| Category | Type | Metric Name Examples | Key Attributes | +|----------|------|---------------------|----------------| +| Business events | `count` | checkout.completed, checkout.failed | user_tier, failure_reason | +| App health | `count` | email.sent, job.processed | email_type, job_type | +| Resources | `gauge` | queue.depth, pool.connections_active | queue_name, pool_name | +| Latency | `distribution` | api.latency, db.query_time | endpoint, table, operation | +| Business values | `distribution` | order.amount, upload.size | user_tier, region, file_type | diff --git a/docs/product/explore/trace-explorer/custom-spans.mdx b/docs/product/explore/trace-explorer/custom-spans.mdx new file mode 100644 index 0000000000000..4fa4a2cf5abb4 --- /dev/null +++ b/docs/product/explore/trace-explorer/custom-spans.mdx @@ -0,0 +1,170 @@ +--- +title: "Adding Custom Spans" +sidebar_order: 15 +description: "Add custom instrumentation for visibility beyond auto-instrumentation." +--- + +Auto-instrumentation captures a lot, but some operations need manual spans. This guide covers when and how to add custom instrumentation. + +## When to Add Custom Spans + +Add custom spans when you need visibility that auto-instrumentation doesn't provide: + +- Business-critical flows (checkout, onboarding) +- Third-party API calls with custom context +- Database queries with business context +- Background job execution +- AI/LLM operations + +## The Pattern + +```javascript +Sentry.startSpan( + { name: "operation-name", op: "category" }, + async (span) => { + span.setAttribute("key", value); + // ... your code ... + } +); +``` + +Numeric attributes become metrics you can aggregate with `sum()`, `avg()`, `p90()` in Trace Explorer. + +## Where to Add Custom Spans + +### Business-Critical User Flows + +Track the full journey through critical paths. When checkout is slow, you need to know which step. 
+ +```javascript +Sentry.startSpan( + { name: "checkout-flow", op: "user.action" }, + async (span) => { + span.setAttribute("cart.itemCount", 3); + span.setAttribute("user.tier", "premium"); + + await validateCart(); + await processPayment(); + await createOrder(); + } +); +``` + +**Query:** `span.op:user.action` grouped by `user.tier`, visualize `p90(span.duration)`. + +**Alert idea:** `p90(span.duration) > 10s` for checkout flows. + +### Third-Party API Calls + +Measure dependencies you don't control. They're often the source of slowdowns. + +```javascript +Sentry.startSpan( + { name: "shipping-rates-api", op: "http.client" }, + async (span) => { + span.setAttribute("http.url", "api.shipper.com/rates"); + span.setAttribute("request.itemCount", items.length); + + const start = Date.now(); + const response = await fetch("https://api.shipper.com/rates"); + + span.setAttribute("http.status_code", response.status); + span.setAttribute("response.timeMs", Date.now() - start); + + return response.json(); + } +); +``` + +**Query:** `span.op:http.client` + `response.timeMs:>2000` to find slow external calls. + +**Alert idea:** `p95(span.duration) > 3s` where `http.url` contains your critical dependencies. + +### Database Queries with Business Context + +Auto-instrumentation catches queries, but custom spans let you add context that explains why a query matters. + +```javascript +Sentry.startSpan( + { name: "load-user-dashboard", op: "db.query" }, + async (span) => { + span.setAttribute("db.system", "postgres"); + span.setAttribute("query.type", "aggregation"); + span.setAttribute("query.dateRange", "30d"); + + const results = await db.query(dashboardQuery); + span.setAttribute("result.rowCount", results.length); + + return results; + } +); +``` + +**Why this matters:** Without these attributes, you see "a database query took 2 seconds." With them, you know it was aggregating 30 days of data and returned 50,000 rows. That's actionable. 
+ +**Query ideas:** +- "Which aggregation queries are slowest?" → Group by `query.type`, sort by `p90(span.duration)` +- "Does date range affect performance?" → Filter by name, group by `query.dateRange` + +### Background Jobs + +Jobs run outside request context. Custom spans make them visible. + +```javascript +async function processEmailDigest(job) { + return Sentry.startSpan( + { name: `job:${job.type}`, op: "queue.process" }, + async (span) => { + span.setAttribute("job.id", job.id); + span.setAttribute("job.type", "email-digest"); + span.setAttribute("queue.name", "notifications"); + + const users = await getDigestRecipients(); + span.setAttribute("job.recipientCount", users.length); + + for (const user of users) { + await sendDigest(user); + } + + span.setAttribute("job.status", "completed"); + } + ); +} +``` + +**Query:** `span.op:queue.process` grouped by `job.type`, visualize `p90(span.duration)`. + +**Alert idea:** `p90(span.duration) > 60s` for queue processing. + +### AI/LLM Operations + +For AI workloads, use [Sentry Agent Monitoring](/product/insights/ai/agents/) instead of manual instrumentation when possible. It automatically captures agent workflows, tool calls, and token usage. + +If you're not using a supported framework or need custom attributes: + +```javascript +Sentry.startSpan( + { name: "generate-summary", op: "ai.inference" }, + async (span) => { + span.setAttribute("ai.model", "gpt-4"); + span.setAttribute("ai.feature", "document-summary"); + + const response = await openai.chat.completions.create({...}); + + span.setAttribute("ai.tokens.total", response.usage.total_tokens); + return response; + } +); +``` + +**Alert idea:** `p95(span.duration) > 5s` for AI inference. 
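Queries like the ones above only stay reliable if every span uses the same `op` values and attribute keys. One way to enforce that is to centralize the conventions. A minimal sketch in plain JavaScript—the helper and constant names here are illustrative, not part of the Sentry SDK—that builds the options object passed to `Sentry.startSpan` for a background job:

```javascript
// Conventional `op` value for each instrumentation category (illustrative names).
const SPAN_OPS = {
  userFlow: "user.action",
  externalApi: "http.client",
  database: "db.query",
  job: "queue.process",
  ai: "ai.inference",
};

// Build the first argument to Sentry.startSpan for a background job,
// so every job span shares the same name format and attribute keys.
function jobSpanOptions(job) {
  return {
    name: `job:${job.type}`,
    op: SPAN_OPS.job,
    attributes: {
      "job.id": job.id,
      "job.type": job.type,
      "queue.name": job.queue,
    },
  };
}

const opts = jobSpanOptions({ id: "job_456", type: "email-digest", queue: "notifications" });
console.log(opts.name); // job:email-digest
console.log(opts.op);   // queue.process
```

With a helper like this, a query such as `span.op:queue.process` can't silently miss spans because one call site misspelled the `op`.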
+ +## Quick Reference + +| Category | `op` Value | Key Attributes | +|----------|-----------|----------------| +| User flows | `user.action` | cart.itemCount, user.tier | +| External APIs | `http.client` | http.url, response.timeMs | +| Database | `db.query` | query.type, result.rowCount | +| Background jobs | `queue.process` | job.type, job.id, queue.name | +| AI/LLM | `ai.inference` | ai.model, ai.tokens.total | diff --git a/docs/product/explore/trace-explorer/querying-traces.mdx b/docs/product/explore/trace-explorer/querying-traces.mdx new file mode 100644 index 0000000000000..aac272f8f879e --- /dev/null +++ b/docs/product/explore/trace-explorer/querying-traces.mdx @@ -0,0 +1,96 @@ +--- +title: "Querying Auto-Instrumented Traces" +sidebar_order: 10 +description: "Find performance issues using data Sentry captures automatically." +--- + +Sentry's browser SDK automatically captures page loads, navigation, fetch requests, and resource loading. This guide shows you how to query that data to find performance issues—no custom code required. + +## What's Auto-Instrumented + +With `browserTracingIntegration()` enabled, Sentry automatically captures: + +- Page loads and navigation +- Fetch/XHR requests +- Long animation frames (main thread blocking) +- Resource loading (JS, CSS, images) + +## Finding Performance Issues + +### Slow Page Loads + +Pages taking too long to become interactive. + +**Query:** `span.op:pageload` + +**Visualize:** `p90(span.duration)` grouped by `transaction` to compare routes. + +**Alert idea:** `p75(transaction.duration) > 3s` for pageload transactions. + +### Slow API Calls + +Fetch/XHR requests dragging down your app. + +**Query:** `span.op:http.client` + +**Visualize:** `avg(span.duration)` grouped by `span.description` (the URL). + +**Alert idea:** `p95(span.duration) > 2s` where `span.op:http.client`. + +### JavaScript Blocking the Main Thread + +Long animation frames indicate JavaScript execution blocking the UI. 
+ +**Query:** `span.op:ui.long-animation-frame` + +**Visualize:** Sort by `span.duration` to find the worst offenders. + +**Alert idea:** `p75(span.duration) > 200ms` for long animation frames. + +### Slow SPA Navigation + +How long it takes users to move between pages after initial load. + +**Query:** `span.op:navigation` + +**Visualize:** `p90(span.duration)` grouped by `transaction` to compare route performance. + +**Alert idea:** `p75(span.duration) > 2s` for navigation. + +### Heavy Resources + +JS bundles, stylesheets, or images slowing down page load. + +**Queries:** +- `span.op:resource.script` (JavaScript) +- `span.op:resource.css` (stylesheets) +- `span.op:resource.img` (images) + +**Visualize:** `avg(span.duration)` to find the heaviest assets. + +**Alert idea:** `p75(span.duration) > 3s` for resource.script. + +## Creating Trace-Based Alerts + +1. Go to **Explore > Traces** +2. Build your query (e.g., `span.op:http.client`) +3. Click **Save As** → **Alert** +4. Choose a threshold type: + - **Static:** Set a specific threshold + - **Percent Change:** Alert on relative change + - **Anomaly:** Let Sentry detect unusual patterns +5. Configure notifications and save + +**Tip:** Alert on percentiles (p75, p95) rather than averages. Averages hide the outliers that hurt real users. + +## Quick Reference + +| What You're Looking For | Query | Visualize | +|------------------------|-------|-----------| +| Slow page loads | `span.op:pageload` | `p90(span.duration)` by route | +| Slow fetch requests | `span.op:http.client` | `avg(span.duration)` by URL | +| JS blocking UI | `span.op:ui.long-animation-frame` | `max(span.duration)` | +| Slow SPA navigation | `span.op:navigation` | `p90(span.duration)` by route | +| Heavy JS bundles | `span.op:resource.script` | `avg(span.duration)` | +| Heavy stylesheets | `span.op:resource.css` | `avg(span.duration)` | +| Slow images | `span.op:resource.img` | `avg(span.duration)` |
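All of the queries above assume auto-instrumentation is enabled. For reference, a minimal browser SDK setup sketch; the DSN is a placeholder, and a `tracesSampleRate` of 1.0 is for evaluation only:

```javascript
import * as Sentry from "@sentry/browser";

Sentry.init({
  dsn: "YOUR_DSN", // placeholder: use your project's DSN
  integrations: [Sentry.browserTracingIntegration()],
  // Sample every trace while evaluating; lower this in production.
  tracesSampleRate: 1.0,
});
```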