Keywords: error handling, Go errors, error context, error codes, retry logic, structured logging, AI automation, machine-readable errors, MCP integration, distributed tracing
errific is an AI-ready error handling library for Go that provides structured context, error codes, retry metadata, and JSON serialization for automated error handling and decision-making.
When to use errific: Use errific when you need machine-readable errors with structured metadata for AI agents, automated retry logic, structured logging, or API error responses.
Purpose: Define reusable, testable error types with automatic caller information.
Why this matters:
- Type Safety: Use with
errors.Is()for reliable error checking across your codebase - Reusability: Define once at package level, use everywhere consistently
- Testability: Mock and assert error types easily without string comparison
- Debugging: Automatic caller information (file:line.function) captured at error creation
- Zero Allocation: Defining errors as constants has no runtime allocation overhead
Usage Pattern: Declare as package-level constants for type safety and testing with errors.Is().
When to use:
- ✅ Defining package-level or application-level errors
- ✅ When you need
errors.Is()compatibility for error checking - ✅ When caller information is valuable for debugging
- ✅ When you want consistent error messages across your codebase
Example - Basic:
// Use Case: Database connection error
// Keywords: database, connection, typed-error
var ErrDatabaseConnection Err = "database connection failed"
// Style 1: Explicit .New() (use when wrapping errors or caller info matters)
err := ErrDatabaseConnection.New(sqlErr)
// Style 2: Concise (recommended for new code without wrapped errors)
err := ErrDatabaseConnection.WithCode("DB_001").WithHTTPStatus(500)
// Returns: database connection failed [myapp/db.go:42.Connect]Example - With Wrapped Error:
// Use Case: Wrapping underlying database driver errors
// Keywords: error-wrapping, error-chain
sqlErr := sql.Open("postgres", connString)
if sqlErr != nil {
return ErrDatabaseConnection.New(sqlErr)
}
// Returns: database connection failed: pq: connection refused [myapp/db.go:42.Connect]Example - Testing with errors.Is():
// Use Case: Reliable error type checking in tests
// Keywords: testing, error-checking, errors-is
func TestDatabaseError(t *testing.T) {
err := connectDatabase()
// ✅ Correct: Type-safe error checking
if !errors.Is(err, ErrDatabaseConnection) {
t.Error("expected database connection error")
}
// ❌ Incorrect: String comparison (fragile)
if err.Error() != "database connection failed" {
// Breaks if caller info changes
}
}Methods:
New(errs ...error) errific- Create error with optional wrapped errorsErrorf(a ...any) errific- Create formatted error (format in Err string)Withf(format string, a ...any) errific- Append formatted messageWrapf(format string, a ...any) errific- Wrap with formatted message
Forwarding Methods: All With() methods (WithCode, WithHTTPStatus, etc.) can now be called directly on Err without calling .New() first. This provides a more concise API while maintaining full backwards compatibility.
How It Works: The first With() method automatically calls .New() internally, then returns an errific instance. Subsequent methods in the chain operate on that instance directly, so .New() is only called once per error.
Testing: Use errors.Is(err, ErrDatabaseConnection) for assertions.
Purpose: Attach structured metadata to errors for debugging, logging, and AI decision-making.
Why this matters:
- Debugging: See exact parameters, state, and conditions that caused the error
- Monitoring: Extract metrics from error context (duration, size, count) for dashboards
- AI Decision-Making: Agents can read context values to decide actions (e.g., retry if duration > threshold)
- Compliance: Track required audit fields (user_id, transaction_id, ip_address)
- Root Cause Analysis: Context preserves state snapshot at error time
- JSON Serialization: Context maps directly to JSON for logging systems
Usage Pattern: Add context that helps diagnose the error or make retry decisions.
When to include:
- ✅ Operation parameters: query, endpoint, file_path, command
- ✅ Identifiers: user_id, request_id, transaction_id, correlation_id, session_id
- ✅ Measurements: duration_ms, size_bytes, retry_count, timeout_ms
- ✅ State information: current_step, pool_size, queue_depth, connection_count
- ✅ Diagnostic data: status_code, error_code, response_time
When to exclude:
- ❌ Sensitive data: passwords, tokens, API keys, credit cards, PII
- ❌ Large data: full request/response bodies (>1KB), binary data, images
- ❌ Non-JSON types: channels, functions, unsafe pointers, goroutines
- ❌ Redundant data: already in error message or obvious from error type
Example - Database Query:
// Use Case: Debugging slow database queries
// Keywords: database, query, performance, debugging
Context{
"query": "SELECT * FROM users WHERE status = ?",
"duration_ms": 1534, // Slow! Helps identify performance issue
"table": "users",
"connection_id": "conn-123",
"rows_affected": 0, // Shows query returned nothing
"params": []interface{}{"active"},
}Example - API Call:
// Use Case: Debugging API timeout errors
// Keywords: api, http, timeout, monitoring
Context{
"endpoint": "https://api.example.com/v1/users",
"method": "POST",
"status_code": 504, // Gateway timeout
"duration_ms": 30000, // 30 seconds = timeout threshold
"retry_count": 2, // Already retried twice
"request_id": "req-abc-123", // Trace across logs
"response_size": 0, // No response received
}Example - AI Agent Reading Context:
// Use Case: AI agent makes decision based on context
// Keywords: ai-automation, decision-making, context-analysis
if ctx := GetContext(err); ctx != nil {
// Agent: "Query took 1534ms, this is a slow query issue"
if duration, ok := ctx["duration_ms"].(int); ok && duration > 1000 {
log.Warn("slow query detected",
"duration", duration,
"query", ctx["query"])
// AI decision: Add to slow query monitoring
slowQueryMonitor.Track(ctx)
}
// Agent: "API call failed after 2 retries, don't retry again"
if retryCount, ok := ctx["retry_count"].(int); ok && retryCount >= 2 {
// Don't retry, max attempts reached
return err
}
}Example - File Operations:
// Use Case: File operation errors with context
// Keywords: file-io, file-operations, permissions
Context{
"path": "/data/uploads/file.txt",
"operation": "read",
"size_bytes": 1048576, // 1MB file
"permissions": "0644",
"owner": "www-data",
"exists": true, // File exists but can't read
}Retrieval: Use GetContext(err) to extract from any error.
Best Practices:
- Include quantitative data (durations, counts, sizes)
- Include identifiers (IDs, names, keys)
- Avoid sensitive data (passwords, tokens)
- Keep values JSON-serializable
- Use consistent key names across your application
- Limit context size to <100 keys per error
Purpose: Classify errors for automated routing and handling.
Why this matters:
- Automated Routing: AI agents can route errors to appropriate handlers without hardcoding error types
- HTTP Mapping: Automatically map errors to correct HTTP status codes in API responses
- Retry Decisions: Categories indicate whether errors are retryable (network=yes, validation=no)
- Logging Severity: Different categories map to different log levels (server=error, validation=warn)
- Alerting: Alert different teams based on category (server→ops, validation→product)
- Monitoring: Group errors by category in dashboards for better insights
Available Categories:
| Category | Use Case | HTTP Status | Retryable | Example |
|---|---|---|---|---|
CategoryClient |
User input errors | 400-499 | ❌ No | Invalid email format |
CategoryServer |
Internal failures | 500-599 | ✅ Maybe | Database connection failed |
CategoryNetwork |
Connectivity issues | 503, 504 | ✅ Yes | Connection timeout |
CategoryValidation |
Input validation | 400, 422 | ❌ No | Missing required field |
CategoryNotFound |
Resource missing | 404 | ❌ No | User ID not found |
CategoryUnauthorized |
Auth failures | 401, 403 | ❌ No | Invalid API key |
CategoryTimeout |
Timeout errors | 408, 504 | ✅ Yes | Request exceeded deadline |
Usage Pattern: Set category based on error type for automated handling.
When to use each category:
CategoryValidation - Use when:
- ✅ User provided invalid input (email, phone, format)
- ✅ Required field is missing
- ✅ Value doesn't match constraints (min/max, regex)
- ✅ Business rule violation (duplicate username)
CategoryClient - Use when:
- ✅ General client-side error not covered by other categories
- ✅ Malformed request
- ✅ Unsupported media type
- ✅ Request too large
CategoryUnauthorized - Use when:
- ✅ Missing or invalid authentication token
- ✅ Insufficient permissions
- ✅ Expired session
- ✅ API key revoked
CategoryNotFound - Use when:
- ✅ Resource doesn't exist (user, order, file)
- ✅ Endpoint doesn't exist (404)
- ✅ Record deleted
CategoryTimeout - Use when:
- ✅ Operation exceeded deadline
- ✅ Client timeout
- ✅ Gateway timeout
- ✅ Context deadline exceeded
CategoryNetwork - Use when:
- ✅ Connection refused
- ✅ DNS resolution failed
- ✅ Network unreachable
- ✅ Service temporarily unavailable
CategoryServer - Use when:
- ✅ Internal server error
- ✅ Database query failed
- ✅ File system error
- ✅ Panic recovered
Example - Basic Usage:
// Use Case: Categorize API timeout for automated handling
// Keywords: category, timeout, api-error, automation
err := ErrAPICall.WithCategory(CategoryTimeout)
// AI agent can route based on category
switch GetCategory(err) {
case CategoryNetwork:
// Retry immediately (network might recover)
time.Sleep(1 * time.Second)
return retry()
case CategoryTimeout:
// Retry with longer timeout
ctx, cancel := context.WithTimeout(ctx, 60*time.Second)
defer cancel()
return retryWithContext(ctx)
case CategoryValidation:
// Don't retry, return 400 to client
w.WriteHeader(400)
json.NewEncoder(w).Encode(err)
return nil
}Example - HTTP Status Mapping:
// Use Case: Automatically map category to HTTP status
// Keywords: http-mapping, api-error, status-code
err := ErrUserNotFound.New().WithCategory(CategoryNotFound)
// Get HTTP status based on category
status := GetHTTPStatus(err)
if status == 0 {
// No explicit status, use category default
switch GetCategory(err) {
case CategoryNotFound:
status = 404
case CategoryUnauthorized:
status = 401
case CategoryValidation:
status = 400
case CategoryTimeout:
status = 504
default:
status = 500
}
}
w.WriteHeader(status)Example - Logging Severity:
// Use Case: Set log level based on error category
// Keywords: logging, severity, monitoring
func logError(err error) {
var level string
switch GetCategory(err) {
case CategoryValidation, CategoryClient:
level = "warn" // User errors, not critical
case CategoryServer, CategoryNetwork:
level = "error" // System errors, needs attention
default:
level = "info"
}
logger.Log(level, err.Error(),
"category", GetCategory(err),
"code", GetCode(err))
}errific supports two equivalent API styles for creating errors:
When to use:
- Wrapping other errors:
ErrDatabase.New(sqlErr) - When caller information is critical for debugging
- When porting from existing code
Example:
err := ErrDatabase.New(sqlErr).
WithCode("DB_001").
WithHTTPStatus(500)When to use (recommended):
- Creating new errors without wrapping
- When code brevity is preferred
- New code and modern Go projects
Example:
err := ErrDatabase.
WithCode("DB_001").
WithHTTPStatus(500)Key Difference: The concise style automatically calls .New() on the first With() method. Both styles produce equivalent errors with the same metadata.
Performance: .New() is called exactly once regardless of how many methods are chained. Subsequent methods operate directly on the errific instance with zero overhead.
Purpose: Add structured debugging metadata.
Why this matters:
- Root Cause Analysis: Preserve exact state and parameters when error occurred
- Metrics Extraction: Pull duration, count, size from errors for monitoring dashboards
- AI Decisions: Agents read context to make intelligent decisions (retry if slow, alert if large)
- Audit Trails: Track user_id, transaction_id, request_id for compliance
- Performance Debugging: Identify slow operations by analyzing duration_ms in context
Parameters:
ctx Context- Map of key-value pairs with error context
Returns: errific error with context attached
Chaining: Can be chained with other methods
When to use:
- ✅ Database operations (include query, duration, table name)
- ✅ API calls (include endpoint, method, status code, duration)
- ✅ File operations (include path, operation, size, permissions)
- ✅ Business logic (include user_id, order_id, transaction_id)
- ✅ Any operation where state/parameters help debugging
When NOT to use:
- ❌ For sensitive data (use sanitized versions or omit)
- ❌ For large payloads (summarize instead)
- ❌ For obvious information (already in error message)
Example (both styles work):
// Use Case: Database query with performance tracking
// Keywords: database, context, performance, debugging
// Explicit style
err := ErrQuery.New().WithContext(Context{
"query": sql,
"duration_ms": elapsed.Milliseconds(),
"table": "users",
"rows_affected": count,
})
// Concise style (recommended)
err := ErrQuery.WithContext(Context{
"query": sql,
"duration_ms": elapsed.Milliseconds(),
})Example - API Call with Retry Context:
// Use Case: Track API call failures for retry decisions
// Keywords: api, http, retry-context, monitoring
err := ErrAPICall.New(httpErr).WithContext(Context{
"endpoint": "https://api.example.com/v1/payment",
"method": "POST",
"status_code": resp.StatusCode,
"duration_ms": elapsed.Milliseconds(),
"retry_attempt": attempt, // Track which retry this is
"timeout_ms": 30000,
"request_id": reqID,
})
// AI Agent Decision Based on Context:
if ctx := GetContext(err); ctx != nil {
if attempt, ok := ctx["retry_attempt"].(int); ok && attempt >= 3 {
// Already retried 3 times, don't retry again
return err
}
if duration, ok := ctx["duration_ms"].(int); ok && duration > 25000 {
// Slow response, might need longer timeout next time
return retryWithTimeout(60 * time.Second)
}
}Retrieval: GetContext(err) Context
Use Cases:
- Database queries (query, duration, table)
- API calls (endpoint, status, duration)
- File operations (path, size, permissions)
- Business logic (user_id, order_id, amount)
Purpose: Add machine-readable error code for routing and identification.
Why this matters:
- Error Tracking: Unique codes make errors searchable in Sentry, Rollbar, Datadog
- Automated Alerting: Route alerts to specific teams based on error code prefix (DB_, API_, etc.)
- Metric Aggregation: Group errors by code for dashboards ("API_TIMEOUT_001: 1,247 today")
- Documentation Links: Map error codes to documentation pages automatically
- Debugging: Filter logs by error code to find all occurrences of specific issue
- API Consistency: Return consistent error codes to API clients for reliable error handling
Parameters:
code string- Unique error code (e.g., "DB_CONN_001")- Empty string is ignored (no-op)
Returns: errific error with code attached
Naming Convention: DOMAIN_TYPE_NUMBER (e.g., "API_TIMEOUT_001")
When to use:
- ✅ Errors that need tracking/monitoring in error tracking systems
- ✅ Errors that trigger specific team alerts
- ✅ Errors that map to documentation pages
- ✅ Public API errors (consistent client experience)
- ✅ Errors used for metrics/dashboards
When NOT to use:
- ❌ One-off internal errors with no tracking
- ❌ Overly generic codes (e.g., "ERROR_001")
- ❌ Test-only errors
Example - Database Error with Code:
// Use Case: Track database connection pool exhaustion for alerts
// Keywords: error-code, database, tracking, alerting, monitoring
err := ErrDatabase.WithCode("DB_POOL_EXHAUSTED").
WithCategory(CategoryServer).
WithHTTPStatus(503)
// Monitoring system can alert based on code
if GetCode(err) == "DB_POOL_EXHAUSTED" {
alert.Send("dba-team", "Database pool exhausted", err)
}Common Code Prefixes:
DB_* → Database errors (DB_CONN_001, DB_QUERY_TIMEOUT_001)
API_* → External API errors (API_TIMEOUT_001, API_AUTH_FAILED_001)
VAL_* → Validation errors (VAL_EMAIL_INVALID_001, VAL_REQUIRED_FIELD_001)
AUTH_* → Authentication errors (AUTH_TOKEN_EXPIRED_001, AUTH_INVALID_CREDENTIALS_001)
Retrieval: GetCode(err) string
Purpose: Classify error for automated handling decisions.
Parameters:
category Category- One of the predefined categories
Returns: errific error with category
Decision Logic:
CategoryClient → Don't retry, return 4xx
CategoryServer → Retry with backoff, return 5xx
CategoryNetwork → Retry immediately, return 503
CategoryValidation → Don't retry, return 400
CategoryTimeout → Retry with increased timeout
Example:
err := ErrRateLimit.New().WithCategory(CategoryClient)Retrieval: GetCategory(err) Category
Purpose: Mark whether error should be retried.
Why this matters:
- Prevents Infinite Loops: Marking validation errors as non-retryable prevents retry storms
- Improves Resilience: Transient errors (network, timeout) marked retryable enable automatic recovery
- Saves Resources: Non-retryable errors fail fast instead of wasting retries
- AI Automation: Agents can implement retry logic without hardcoding error types
- Circuit Breaker Integration: Retryable flag helps circuit breakers decide when to open
Parameters:
retryable bool- true if error is transient and can be retried
Returns: errific error with retry flag
When to set true (retryable):
- ✅ Network timeouts (connection timeout, read timeout)
- ✅ Rate limits (HTTP 429, with retry-after)
- ✅ Temporary service unavailability (HTTP 503)
- ✅ Connection pool exhausted (will free up)
- ✅ Deadlock detected (might succeed on retry)
- ✅ Transient database errors (connection refused)
When to set false (non-retryable):
- ❌ Validation failures (won't fix themselves)
- ❌ Authentication errors (need new credentials)
- ❌ Authorization failures (permissions won't change)
- ❌ Resource not found (404)
- ❌ Malformed requests (400)
- ❌ Business logic violations
Example - Retryable Network Error:
// Use Case: Network timeout should be retried
// Keywords: retry, network, timeout, transient
err := ErrTimeout.New(netErr).
WithRetryable(true). // ✅ Network errors are transient
WithRetryAfter(5 * time.Second). // Wait 5s for network to recover
WithMaxRetries(3). // Try up to 3 times
WithCategory(CategoryNetwork)
// AI Agent Usage:
if IsRetryable(err) {
delay := GetRetryAfter(err)
if delay == 0 {
delay = time.Second // Default delay
}
time.Sleep(delay)
return retry()
}Example - Non-Retryable Validation Error:
// Use Case: Validation errors should NOT be retried
// Keywords: validation, non-retryable, user-error
err := ErrInvalidEmail.New().
WithRetryable(false). // ❌ User input won't fix itself
WithCategory(CategoryValidation).
WithHTTPStatus(400).
WithContext(Context{
"field": "email",
"value": "invalid-email", // Missing @ sign
"constraint": "must contain @",
})
// AI Agent Usage:
if !IsRetryable(err) {
// Don't retry, return error to user immediately
w.WriteHeader(GetHTTPStatus(err))
json.NewEncoder(w).Encode(err)
return
}Example - Rate Limit (Retryable with Delay):
// Use Case: Rate limit should be retried after delay
// Keywords: rate-limit, retry-after, backoff
err := ErrRateLimit.New().
WithRetryable(true). // ✅ Temporary, will reset
WithRetryAfter(30 * time.Second). // Wait for rate limit window
WithMaxRetries(1). // Only retry once (avoid ban)
WithHTTPStatus(429).
WithContext(Context{
"limit": "100/hour",
"reset_at": time.Now().Add(30*time.Minute).Unix(),
})Decision Tree:
Is error retryable?
├─ User input error? → NO (validation won't fix itself)
├─ Auth/permission error? → NO (credentials won't change)
├─ Not found error? → NO (resource won't appear)
├─ Network error? → YES (network might recover)
├─ Timeout error? → YES (might succeed with more time)
├─ Rate limit? → YES (will reset after window)
├─ Resource exhausted? → YES (resources will free up)
└─ Server error? → MAYBE (depends on cause)
Common Mistakes:
// ❌ DON'T: Retry validation errors (infinite loop)
err := ErrInvalidInput.New().WithRetryable(true) // WRONG!
// ✅ DO: Mark validation as non-retryable
err := ErrInvalidInput.New().WithRetryable(false)
// ❌ DON'T: Retry without delay (spam)
err := ErrNetwork.New().WithRetryable(true) // Missing RetryAfter!
// ✅ DO: Include retry delay
err := ErrNetwork.New().
WithRetryable(true).
WithRetryAfter(5 * time.Second)Retrieval: IsRetryable(err) bool
Purpose: Suggest delay before retry attempt.
Why this matters:
- Respects Rate Limits: Honor server's Retry-After header to avoid bans
- Prevents Retry Storms: Delays prevent overwhelming recovering services
- Improves Success Rate: Waiting gives transient issues time to resolve
- Circuit Breaker Integration: Delays help circuit breakers stay open long enough
- Resource Management: Prevents wasting resources on immediate re-failure
Parameters:
duration time.Duration- Time to wait before retry- Negative values are normalized to 0
Returns: errific error with retry delay
When to use:
- ✅ Rate limit errors (use Retry-After header value)
- ✅ Network timeouts (give network time to recover)
- ✅ Service unavailable (wait for service to restart)
- ✅ Resource exhaustion (wait for resources to free up)
- ✅ Any retryable error with WithRetryable(true)
When NOT to use:
- ❌ Non-retryable errors (validation, not found, auth)
- ❌ When using external backoff library (conflicts)
Common Values:
1 * time.Second- Fast retry for transient issues5 * time.Second- Standard retry delay30 * time.Second- Rate limit backoff5 * time.Minute- Long delay for maintenance
Example - Rate Limit with Retry-After Header:
// Use Case: Respect server's rate limit window
// Keywords: rate-limit, retry-after, http-429, backoff
// Parse Retry-After header from HTTP 429 response
retryAfterSec := resp.Header.Get("Retry-After")
retryAfter, _ := time.ParseDuration(retryAfterSec + "s")
err := ErrRateLimit.New().
WithRetryable(true).
WithRetryAfter(retryAfter). // Use server's suggested delay
WithMaxRetries(1). // Only retry once (avoid ban)
WithHTTPStatus(429)
// AI Agent automatically waits
if IsRetryable(err) {
delay := GetRetryAfter(err)
log.Info("Rate limited, waiting", "delay", delay)
time.Sleep(delay)
return retry()
}Example - Exponential Backoff:
// Use Case: Retry with increasing delays for transient failures
// Keywords: exponential-backoff, retry-strategy, resilience
func retryWithBackoff(operation func() error) error {
baseDelay := 2 * time.Second
for attempt := 0; attempt < 5; attempt++ {
err := operation()
if err == nil {
return nil
}
if !IsRetryable(err) {
return err
}
// Exponential backoff: 2s, 4s, 8s, 16s, 32s
delay := GetRetryAfter(err)
if delay == 0 {
delay = baseDelay * time.Duration(1<<attempt)
}
log.Info("Retrying with backoff",
"attempt", attempt+1,
"delay", delay)
time.Sleep(delay)
}
return err
}Example - Service Unavailable:
// Use Case: Wait for service to recover from deployment
// Keywords: service-unavailable, deployment, recovery
err := ErrServiceUnavailable.New().
WithRetryable(true).
WithRetryAfter(10 * time.Second). // Wait for service restart
WithMaxRetries(6). // Try for 1 minute total
WithHTTPStatus(503)
// Output: Will retry after 10s, up to 6 timesCommon Mistakes:
// ❌ DON'T: Set retryable without delay (immediate retry)
err := ErrNetwork.New().WithRetryable(true) // Missing RetryAfter!
// ✅ DO: Always include delay with retryable errors
err := ErrNetwork.New().
WithRetryable(true).
WithRetryAfter(5 * time.Second)
// ❌ DON'T: Use negative duration (confusing)
err := ErrTimeout.New().WithRetryAfter(-5 * time.Second) // Normalized to 0
// ✅ DO: Use positive duration
err := ErrTimeout.New().WithRetryAfter(5 * time.Second)Retrieval: GetRetryAfter(err) time.Duration
Purpose: Set maximum retry attempts to prevent infinite loops.
Why this matters:
- Prevents Infinite Loops: Cap retries so errors eventually fail instead of retry forever
- Resource Protection: Limit wasted compute/network resources on failing operations
- Fast Failure Detection: Fail after reasonable attempts instead of hanging indefinitely
- Cost Control: Prevent runaway API costs from excessive retry attempts
- User Experience: Bound retry time so users don't wait forever
Parameters:
max int- Maximum number of retry attempts (0 = no retries)- Negative values are normalized to 0
Returns: errific error with max retries
When to use:
- ✅ All retryable errors (always set a limit)
- ✅ Critical operations (higher limit = 5)
- ✅ Standard operations (3 retries is typical)
- ✅ Expensive operations (1 retry to limit cost)
- ✅ Background jobs (higher limit = 10)
When NOT to use:
- ❌ Non-retryable errors (set WithRetryable(false) instead)
- ❌ Health checks (use 0, need immediate result)
Recommended Values:
0- No retries (health checks, immediate failure needed)1- Single retry (expensive operations, non-idempotent)3- Standard retry limit (most operations)5- Aggressive retry (critical operations, idempotent)10- Background jobs (can wait, eventual consistency)
Example - Standard API Call:
// Use Case: Retry API call with reasonable limit
// Keywords: api-retry, retry-limit, standard-operation
err := ErrAPI.New(httpErr).
WithRetryable(true).
WithRetryAfter(5 * time.Second).
WithMaxRetries(3). // Try 3 times max
WithHTTPStatus(504)
// AI Agent implements retry loop with limit
for attempt := 0; attempt < GetMaxRetries(err); attempt++ {
err := callAPI()
if err == nil {
return nil // Success!
}
if !IsRetryable(err) || attempt >= GetMaxRetries(err)-1 {
return err // Failed or max retries reached
}
time.Sleep(GetRetryAfter(err))
}Example - Critical Operation with Higher Limit:
// Use Case: Payment processing must succeed if possible
// Keywords: critical-operation, payment, high-retry-limit
err := ErrPaymentGateway.New(gatewayErr).
WithRetryable(true).
WithRetryAfter(10 * time.Second).
WithMaxRetries(5). // Higher limit for critical operation
WithContext(Context{
"transaction_id": txID,
"amount": amount,
"idempotency_key": idempotencyKey, // Safe to retry
})
// Output: Will retry up to 5 times with 10s delayExample - Expensive Operation with Low Limit:
// Use Case: ML model inference is expensive, limit retries
// Keywords: expensive-operation, ml-inference, cost-control
err := ErrMLInference.New(inferenceErr).
WithRetryable(true).
WithRetryAfter(2 * time.Second).
WithMaxRetries(1). // Only retry once (expensive)
WithContext(Context{
"model": "gpt-4",
"tokens": 5000,
"cost_usd": 0.50,
})
// Output: Will only retry once to limit costsExample - Background Job with High Limit:
// Use Case: Background job can retry many times
// Keywords: background-job, eventual-consistency, high-retry
err := ErrEmailSend.New(smtpErr).
WithRetryable(true).
WithRetryAfter(30 * time.Second).
WithMaxRetries(10). // Background job, can wait
WithContext(Context{
"email": recipient,
"template": "welcome",
})
// Output: Will retry up to 10 times over 5 minutesCommon Mistakes:
// ❌ DON'T: Set retryable without max retries (unbounded)
err := ErrNetwork.New().WithRetryable(true) // Missing MaxRetries!
// ✅ DO: Always set a limit
err := ErrNetwork.New().
WithRetryable(true).
WithMaxRetries(3)
// ❌ DON'T: Set max retries too high for user-facing ops
err := ErrAPICall.New().WithMaxRetries(100) // User waits forever!
// ✅ DO: Use reasonable limits for user-facing operations
err := ErrAPICall.New().WithMaxRetries(2) // 2 retries = ~15s max
// ❌ DON'T: Set max retries on non-retryable errors (confusing)
err := ErrValidation.New().
WithRetryable(false).
WithMaxRetries(3) // Ignored, but confusing
// ✅ DO: Only set max retries on retryable errors
err := ErrValidation.New().WithRetryable(false)Retrieval: GetMaxRetries(err) int
Purpose: Map error to HTTP status code for API responses.
Why this matters:
- Automatic Response Mapping: Errors carry their own HTTP status for consistent API responses
- Client Understanding: Proper status codes help clients handle errors correctly
- API Compliance: Meet HTTP specification and RESTful API conventions
- Error Categorization: Status codes group errors (4xx = client, 5xx = server)
- Monitoring: Track API errors by status code in dashboards
- Retry Decisions: Clients know which errors to retry based on 5xx vs 4xx
Parameters:
status int- HTTP status code (100-599, 0 = not set)- Panics if code is outside valid range (except 0)
Returns: errific error with HTTP status
When to use:
- ✅ All errors in HTTP/REST APIs
- ✅ All errors in web services
- ✅ Any error that might be returned to HTTP client
- ✅ When building API middleware or handlers
When NOT to use:
- ❌ Internal errors never exposed via HTTP
- ❌ CLI applications (no HTTP involved)
- ❌ Background jobs not triggered by HTTP
Common Mappings:
4xx - Client Errors (don't retry):
400 - Validation errors, bad request
401 - Authentication required
403 - Permission denied, forbidden
404 - Resource not found
408 - Client request timeout
409 - Conflict (duplicate, version mismatch)
422 - Unprocessable entity (semantic validation)
429 - Rate limit exceeded (retry with delay)
5xx - Server Errors (retry possible):
500 - Internal server error
502 - Bad gateway (upstream failure)
503 - Service unavailable (temporary)
504 - Gateway timeout (upstream timeout)
Example - Validation Error:
// Use Case: Return 400 for invalid user input
// Keywords: validation, bad-request, api-error, http-400
err := ErrValidation.New().
WithHTTPStatus(400).
WithCategory(CategoryValidation).
WithCode("VAL_EMAIL_INVALID").
WithContext(Context{
"field": "email",
"value": "invalid-email",
})
// Automatic HTTP response
w.WriteHeader(GetHTTPStatus(err)) // 400
json.NewEncoder(w).Encode(err)Example - Not Found:
// Use Case: Return 404 when resource doesn't exist
// Keywords: not-found, http-404, resource-missing
err := ErrUserNotFound.New().
WithHTTPStatus(404).
WithCategory(CategoryNotFound).
WithContext(Context{
"user_id": userID,
})
// Client knows resource doesn't exist, won't retryExample - Rate Limit:
// Use Case: Return 429 with Retry-After header
// Keywords: rate-limit, http-429, retry-after
retryAfter := 30 * time.Second
err := ErrRateLimit.New().
WithHTTPStatus(429).
WithRetryable(true).
WithRetryAfter(retryAfter).
WithContext(Context{
"limit": "100/hour",
"reset_at": time.Now().Add(retryAfter).Unix(),
})
// Set both status and Retry-After header
w.Header().Set("Retry-After", fmt.Sprintf("%.0f", retryAfter.Seconds()))
w.WriteHeader(GetHTTPStatus(err)) // 429
json.NewEncoder(w).Encode(err)Example - Server Error with Retry:
// Use Case: Database failure, return 503 so clients retry
// Keywords: service-unavailable, http-503, server-error
err := ErrDatabaseDown.New(dbErr).
WithHTTPStatus(503).
WithCategory(CategoryServer).
WithRetryable(true).
WithRetryAfter(10 * time.Second).
WithCode("DB_UNAVAILABLE")
// Clients see 503 and know to retryExample - Automatic Status from Category:
// Use Case: Fallback to category-based status if not set explicitly
// Keywords: automatic-mapping, category-to-status
err := ErrValidation.New().
WithCategory(CategoryValidation)
// No WithHTTPStatus() call
// Middleware can use category as fallback
status := GetHTTPStatus(err)
if status == 0 {
switch GetCategory(err) {
case CategoryValidation:
status = 400
case CategoryNotFound:
status = 404
case CategoryUnauthorized:
status = 401
case CategoryServer:
status = 500
default:
status = 500
}
}
w.WriteHeader(status)Common Mistakes:
// ❌ DON'T: Use 200 for errors (confusing)
err := ErrFailed.New().WithHTTPStatus(200) // WRONG!
// ✅ DO: Use appropriate error status (4xx or 5xx)
err := ErrFailed.New().WithHTTPStatus(500)
// ❌ DON'T: Use invalid status codes
err := ErrTest.New().WithHTTPStatus(999) // Panics!
// ✅ DO: Use valid HTTP status codes (100-599)
err := ErrTest.New().WithHTTPStatus(400)
// ❌ DON'T: Use 5xx for client errors
err := ErrInvalidInput.New().WithHTTPStatus(500) // Wrong category!
// ✅ DO: Use 4xx for client errors, 5xx for server errors
err := ErrInvalidInput.New().WithHTTPStatus(400)Retrieval: GetHTTPStatus(err) int
Purpose: Set MCP (Model Context Protocol) error code for LLM tool servers.
Why this matters:
- LLM Integration: Standard error codes that LLMs understand and can act on
- JSON-RPC 2.0 Compliance: Maps to official JSON-RPC 2.0 error codes
- Tool Server Development: Build MCP servers that communicate clearly with Claude and other LLMs
- Error Classification: LLMs can distinguish between method not found vs invalid params vs execution error
- Automatic Recovery: LLMs use MCP codes to decide recovery actions (retry, fix params, abort)
- Debugging: Trace errors through LLM → Tool Server → Backend with consistent codes
Parameters:
code int- MCP error code (see MCP constants below)
Returns: errific error with MCP code attached
Chaining: Can be chained with other methods
MCP Error Codes (from JSON-RPC 2.0 spec):
// Standard JSON-RPC 2.0 codes
MCPParseError = -32700 // Invalid JSON
MCPInvalidRequest = -32600 // Invalid request structure
MCPMethodNotFound = -32601 // Method doesn't exist
MCPInvalidParams = -32602 // Invalid method parameters
MCPInternalError = -32603 // Internal server error
// MCP-specific codes (Server Error range: -32000 to -32099)
MCPToolError = -32000 // Tool execution failed
MCPResourceError = -32001 // Resource access failed
MCPTimeoutError = -32002 // Operation timeout
MCPAuthError = -32003 // Authentication failedWhen to use:
- ✅ Building MCP tool servers for Claude or other LLMs
- ✅ When returning errors in JSON-RPC 2.0 format
- ✅ When LLMs need to distinguish error types programmatically
- ✅ For tools that need automatic retry/recovery logic
When NOT to use:
- ❌ In standard REST APIs (use WithHTTPStatus instead)
- ❌ In internal services not exposed to LLMs
- ❌ When you're not following JSON-RPC 2.0 protocol
Example 1 - Method Not Found:
// Use Case: LLM requested a tool that doesn't exist
// Keywords: mcp, tool-not-found, json-rpc, llm-integration
err := ErrToolNotFound.New().
WithMCPCode(errific.MCPMethodNotFound). // -32601
WithCategory(CategoryNotFound).
WithHelp("The requested tool is not available in this server").
WithSuggestion("Use the 'list_tools' method to see available tools").
WithDocs("https://docs.example.com/mcp/available-tools").
WithContext(Context{
"requested_tool": "search_web",
"available_tools": []string{"search_db", "send_email"},
})
// Converts to JSON-RPC 2.0 format
mcpErr := errific.ToMCPError(err)
// {
// "code": -32601,
// "message": "tool not found",
// "data": {
// "help": "The requested tool is not available...",
// "suggestion": "Use the 'list_tools' method...",
// ...
// }
// }Example 2 - Invalid Parameters:
// Use Case: LLM provided parameters that don't match tool schema
// Keywords: mcp, invalid-params, validation, schema
err := ErrInvalidToolParams.New().
WithMCPCode(errific.MCPInvalidParams). // -32602
WithCategory(CategoryValidation).
WithHTTPStatus(400).
WithHelp("The 'query' parameter is required but was not provided").
WithSuggestion("Include a 'query' parameter with your search term").
WithDocs("https://docs.example.com/mcp/tools/search#parameters").
WithContext(Context{
"tool": "search",
"required_params": []string{"query", "limit"},
"provided_params": []string{"limit"}, // Missing 'query'
"schema_url": "https://example.com/schema/search.json",
})
// LLM reads this and fixes the request automaticallyExample 3 - Tool Execution Error with Retry:
// Use Case: Tool execution failed due to transient database issue
// Keywords: mcp, tool-error, retryable, database
err := ErrToolExecution.New(dbErr).
WithMCPCode(errific.MCPToolError). // -32000
WithCategory(CategoryServer).
WithRetryable(true).
WithRetryAfter(5 * time.Second).
WithMaxRetries(3).
WithHelp("Database connection pool is exhausted. This is temporary.").
WithSuggestion("Retry in 5 seconds when connections are released").
WithCorrelationID(traceID).
WithContext(Context{
"tool": "search_database",
"database": "users_db",
"pool_size": 100,
"active_connections": 100,
})
// AI Agent Decision Logic:
mcpErr := errific.ToMCPError(err)
if mcpErr.Data["retryable"] == true {
delay := mcpErr.Data["retry_after"].(string) // "5s"
// LLM waits 5s and retries automatically
}Example 4 - Authentication Error (Non-Retryable):
// Use Case: LLM provided invalid API key for tool
// Keywords: mcp, auth-error, non-retryable, security
err := ErrAuthFailed.New().
WithMCPCode(errific.MCPAuthError). // -32003
WithCategory(CategoryUnauthorized).
WithHTTPStatus(401).
WithRetryable(false). // Don't retry, credentials won't change
WithHelp("The API key provided is invalid or expired").
WithSuggestion("Check that you're using a valid API key from your account settings").
WithDocs("https://docs.example.com/authentication#api-keys").
WithContext(Context{
"auth_method": "api_key",
"key_prefix": "sk_test_...", // Partial key for debugging
})
// LLM knows not to retry and should prompt user for new credentialsCommon Mistakes:
// ❌ DON'T: Use HTTP status codes as MCP codes
err := ErrNotFound.New().WithMCPCode(404) // Wrong! Use MCPMethodNotFound
// ✅ DO: Use MCP constants
err := ErrNotFound.New().WithMCPCode(errific.MCPMethodNotFound)
// ❌ DON'T: Use MCP codes in REST APIs
func restHandler(w http.ResponseWriter, r *http.Request) {
err := ErrFailed.New().WithMCPCode(errific.MCPToolError) // Wrong context!
// Use WithHTTPStatus for REST
}
// ✅ DO: Use MCP codes only for JSON-RPC 2.0 / MCP servers
func mcpHandler(req *MCPRequest) *MCPResponse {
err := ErrFailed.New().WithMCPCode(errific.MCPToolError)
return &MCPResponse{Error: errific.ToMCPError(err)}
}
// ❌ DON'T: Forget to include help and suggestions with MCP errors
err := ErrTool.New().WithMCPCode(errific.MCPToolError) // LLM can't recover!
// ✅ DO: Include help, suggestion, and docs for LLM recovery
err := ErrTool.New().
WithMCPCode(errific.MCPToolError).
WithHelp("What went wrong").
WithSuggestion("How to fix it").
WithDocs("Where to learn more")Retrieval: GetMCPCode(err) int
See Also:
WithHelp()- Add human-readable help message for LLMsWithSuggestion()- Add actionable recovery suggestionWithDocs()- Add documentation URLToMCPError()- Convert errific error to MCP format
Purpose: Add correlation/trace ID for distributed tracing across microservices.
Why this matters:
- Distributed Tracing: Track a single request across multiple services (Gateway → Auth → Database → Cache)
- Log Aggregation: Group all logs from one user request using the correlation ID
- Root Cause Analysis: Trace error back through entire service chain to find origin
- OpenTelemetry/Datadog Integration: Correlation ID links errific errors to traces in APM tools
- Customer Support: Search all logs for specific customer interaction using their correlation ID
- Performance Analysis: Measure total latency of request across all services
Parameters:
id string- Correlation/trace ID (often from OpenTelemetry, request headers, or generated UUID)
Returns: errific error with correlation ID attached
Chaining: Can be chained with other methods
When to use:
- ✅ Microservices architecture (track requests across services)
- ✅ When using OpenTelemetry, Datadog, or other tracing systems
- ✅ Multi-step workflows (payment processing, order fulfillment)
- ✅ Debugging distributed systems
- ✅ Customer support investigations
When NOT to use:
- ❌ Single monolithic application with no tracing
- ❌ When you already use WithRequestID (don't duplicate)
- ❌ For internal function-level errors (too granular)
Example 1 - Microservices Request Chain:
// Use Case: Track error through Gateway → User Service → Database
// Keywords: microservices, distributed-tracing, correlation-id, opentelemetry
// Gateway receives request with trace ID
correlationID := r.Header.Get("X-Correlation-ID")
if correlationID == "" {
correlationID = uuid.New().String()
}
// Gateway calls User Service
userResp, err := userService.GetUser(ctx, userID)
if err != nil {
return ErrUserServiceFailed.New(err).
WithCorrelationID(correlationID). // Pass through chain
WithRequestID(r.Header.Get("X-Request-ID")).
WithHTTPStatus(503).
WithRetryable(true).
WithContext(Context{
"service": "user-service",
"user_id": userID,
"gateway_host": "api-gw-01",
})
}
// User Service propagates to Database
dbErr := ErrDatabaseQuery.New(sqlErr).
WithCorrelationID(correlationID). // Same ID through entire chain
WithContext(Context{
"query": sql,
"service": "database",
})
// Later: Search logs for correlationID to see full request path
// Gateway (200ms) → User Service (150ms) → Database ERRORExample 2 - OpenTelemetry Integration:
// Use Case: Link errific errors to OpenTelemetry traces
// Keywords: opentelemetry, tracing, spans, distributed-tracing
import (
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/trace"
)
func processOrder(ctx context.Context, orderID string) error {
// Extract trace ID from OpenTelemetry context
span := trace.SpanFromContext(ctx)
traceID := span.SpanContext().TraceID().String()
// Create payment
err := paymentService.Charge(ctx, orderID)
if err != nil {
return ErrPaymentFailed.New(err).
WithCorrelationID(traceID). // Link to OTel trace
WithContext(Context{
"order_id": orderID,
"span_id": span.SpanContext().SpanID().String(),
"trace_id": traceID,
})
}
// Now you can:
// 1. Find error in errific logs using correlation_id
// 2. Search Datadog/Jaeger for same trace_id
// 3. See full distributed trace with error context
return nil
}Example 3 - Customer Support Investigation:
// Use Case: Customer reports error, support needs to find all related logs
// Keywords: customer-support, debugging, log-aggregation
// Customer sees error ID: "corr-abc-123-def"
// Support engineer searches logs
err := ErrCheckoutFailed.New().
WithCorrelationID("corr-abc-123-def"). // Customer's session trace ID
WithUserID("user-789").
WithSessionID("sess-456").
WithContext(Context{
"cart_total": 299.99,
"payment_method": "credit_card",
"shipping_address": "CA, USA",
})
// Support can now:
// 1. Search logs for "corr-abc-123-def"
// 2. See all services involved (cart, payment, shipping, email)
// 3. Find exact failure point (payment gateway timeout at 14:23:45)
// 4. Trace request timeline across all microservicesExample 4 - Multi-Step Workflow:
// Use Case: Track long-running workflow (order processing)
// Keywords: workflow, saga-pattern, compensation, long-running
correlationID := uuid.New().String()
// Step 1: Reserve inventory
if err := inventoryService.Reserve(ctx, items); err != nil {
return ErrInventoryReservation.New(err).
WithCorrelationID(correlationID).
WithContext(Context{
"step": "reserve_inventory",
"workflow": "order_processing",
})
}
// Step 2: Process payment
if err := paymentService.Charge(ctx, total); err != nil {
// Compensate: unreserve inventory
inventoryService.Unreserve(ctx, items)
return ErrPaymentProcessing.New(err).
WithCorrelationID(correlationID). // Same ID for entire saga
WithContext(Context{
"step": "process_payment",
"workflow": "order_processing",
"compensation": "unreserved_inventory",
})
}
// Step 3: Schedule shipping
if err := shippingService.Schedule(ctx, address); err != nil {
// Compensate: refund payment, unreserve inventory
return ErrShippingSchedule.New(err).
WithCorrelationID(correlationID). // Track compensation actions
WithContext(Context{
"step": "schedule_shipping",
"workflow": "order_processing",
})
}
// All errors in this saga share correlationID for debuggingCommon Mistakes:
// ❌ DON'T: Generate new correlation ID at each service
err1 := ErrGateway.New().WithCorrelationID(uuid.New().String())
err2 := ErrService.New().WithCorrelationID(uuid.New().String()) // Can't trace!
// ✅ DO: Propagate same correlation ID through entire request chain
correlationID := extractOrGenerateTraceID(r)
err1 := ErrGateway.New().WithCorrelationID(correlationID)
err2 := ErrService.New().WithCorrelationID(correlationID) // Same ID
// ❌ DON'T: Use correlation ID for non-distributed systems
// (Simple monolith doesn't need correlation ID)
err := ErrSimple.New().WithCorrelationID(uuid.New().String()) // Overkill
// ✅ DO: Use correlation ID only for distributed/multi-service systems
// (In microservices)
err := ErrService.New().WithCorrelationID(traceID)Retrieval: GetCorrelationID(err) string
See Also:
WithRequestID()- For individual HTTP request trackingWithSessionID()- For user session trackingWithUserID()- For user identification
Purpose: Add unique request ID for tracking individual HTTP requests.
Why this matters:
- Request Tracing: Track single HTTP request from receipt to response
- API Debugging: Find all logs for specific API call using request ID
- Load Balancer Correlation: Match errors to load balancer logs using request ID
- Rate Limiting: Track request patterns and identify abusive clients
- Idempotency: Ensure duplicate requests with same ID are handled consistently
- API Gateway Integration: Request IDs from Kong, Nginx, AWS API Gateway automatically included
Parameters:
id string- Unique request identifier (often from X-Request-ID header or generated UUID)
Returns: errific error with request ID attached
Chaining: Can be chained with other methods
When to use:
- ✅ HTTP/REST API servers
- ✅ When using API gateways (Kong, Nginx, AWS ALB)
- ✅ For request-response debugging
- ✅ When implementing idempotency
- ✅ For rate limiting and abuse detection
When NOT to use:
- ❌ Background jobs (use job ID in context instead)
- ❌ WebSocket connections (use connection ID)
- ❌ Batch processing (use batch ID)
Example 1 - HTTP Request Tracking:
// Use Case: Track HTTP request through middleware chain
// Keywords: http, request-id, middleware, api-gateway
func requestIDMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Get or generate request ID
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
requestID = uuid.New().String()
}
// Add to response headers for client correlation
w.Header().Set("X-Request-ID", requestID)
// Store in context for downstream use
ctx := context.WithValue(r.Context(), "request_id", requestID)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
func handleAPI(w http.ResponseWriter, r *http.Request) {
requestID := r.Context().Value("request_id").(string)
// Business logic
user, err := getUserFromDB(r.Context(), userID)
if err != nil {
apiErr := ErrUserNotFound.New(err).
WithRequestID(requestID). // Track this specific request
WithHTTPStatus(404).
WithContext(Context{
"user_id": userID,
"endpoint": r.URL.Path,
"method": r.Method,
})
// Client can use request ID to report issues
http.Error(w, apiErr.Error(), 404)
return
}
}
// Client sees: "user not found [X-Request-ID: req-abc-123]"
// Support searches logs: grep "req-abc-123" and finds entire request pathExample 2 - Idempotent Payment Processing:
// Use Case: Prevent duplicate payment charges using request ID
// Keywords: idempotency, payment, duplicate-prevention, request-id
func processPayment(w http.ResponseWriter, r *http.Request) {
requestID := r.Header.Get("Idempotency-Key") // Client provides
if requestID == "" {
http.Error(w, "Idempotency-Key required", 400)
return
}
// Check if this request was already processed
if result, found := idempotencyCache.Get(requestID); found {
// Return same result (already charged)
w.Write(result)
return
}
// Process payment
err := paymentGateway.Charge(amount)
if err != nil {
paymentErr := ErrPaymentFailed.New(err).
WithRequestID(requestID). // Track idempotency key
WithRetryable(false). // Don't auto-retry (already have idempotency)
WithContext(Context{
"amount": amount,
"currency": "USD",
"idempotency_key": requestID,
"duplicate_check": "passed",
})
// Store error in cache to return same error if client retries
idempotencyCache.Set(requestID, paymentErr)
http.Error(w, paymentErr.Error(), 500)
return
}
// Cache successful result
idempotencyCache.Set(requestID, result)
}Example 3 - Load Balancer Log Correlation:
// Use Case: Correlate application errors with load balancer logs
// Keywords: load-balancer, aws-alb, nginx, logging
// AWS ALB adds X-Amzn-Trace-Id header
// Nginx adds X-Request-ID header
func handleRequest(w http.ResponseWriter, r *http.Request) {
// Extract request ID from various sources
requestID := r.Header.Get("X-Request-ID") // Nginx
if requestID == "" {
requestID = r.Header.Get("X-Amzn-Trace-Id") // AWS ALB
}
if requestID == "" {
requestID = uuid.New().String() // Generate
}
err := processBusinessLogic(r.Context())
if err != nil {
appErr := ErrProcessing.New(err).
WithRequestID(requestID). // Match load balancer logs
WithHTTPStatus(500).
WithContext(Context{
"client_ip": r.RemoteAddr,
"user_agent": r.UserAgent(),
"load_balancer": "alb-prod-01",
})
// Operations team can:
// 1. Find error in app logs: grep "requestID"
// 2. Find in ALB logs: grep "X-Amzn-Trace-Id"
// 3. See client IP, timing, routing info from ALB
// 4. Correlate with WAF logs if security issue
http.Error(w, appErr.Error(), 500)
return
}
}Common Mistakes:
// ❌ DON'T: Use request ID for background jobs
func backgroundJob() {
err := ErrJob.New().WithRequestID(uuid.New().String()) // Wrong context!
}
// ✅ DO: Use context for background jobs
func backgroundJob() {
jobID := uuid.New().String()
err := ErrJob.New().WithContext(Context{"job_id": jobID})
}
// ❌ DON'T: Generate new request ID instead of using existing one
requestID := uuid.New().String() // Ignores X-Request-ID from gateway!
// ✅ DO: Extract from headers, generate only if missing
requestID := r.Header.Get("X-Request-ID")
if requestID == "" {
requestID = uuid.New().String()
}Retrieval: GetRequestID(err) string
See Also:
WithCorrelationID()- For distributed tracing across servicesWithSessionID()- For user session tracking
Purpose: Add user ID to track which user encountered the error.
Why this matters:
- User-Specific Debugging: Find all errors for a specific user when they report issues
- Support Tickets: Quickly search logs by user ID when customer contacts support
- Abuse Detection: Identify users with abnormal error rates (bots, attackers)
- Feature Rollout: Track errors during A/B testing or gradual feature rollouts
- Compliance: Required for audit trails (GDPR, HIPAA, SOC 2)
- User Impact Analysis: Determine how many users are affected by a bug
Parameters:
id string- User identifier (user ID, email, username, or external ID)
Returns: errific error with user ID attached
Chaining: Can be chained with other methods
When to use:
- ✅ Any error in user-facing features
- ✅ Authentication/authorization errors
- ✅ When user reports a bug
- ✅ For audit logging
- ✅ During A/B testing or feature flags
When NOT to use:
- ❌ System/background errors not tied to a user
- ❌ When user is not authenticated (use session ID instead)
- ❌ For sensitive PII (hash or pseudonymize first)
Example 1 - User Support Investigation:
// Use Case: User reports checkout error, support needs to debug
// Keywords: user-support, debugging, customer-service
func processCheckout(w http.ResponseWriter, r *http.Request) {
userID := GetAuthenticatedUserID(r) // From JWT/session
err := paymentService.Charge(total)
if err != nil {
checkoutErr := ErrCheckoutFailed.New(err).
WithUserID(userID). // Support can search logs by user_id
WithRequestID(requestID).
WithContext(Context{
"cart_total": total,
"payment_method": paymentMethod,
"cart_items": len(items),
})
// User reports: "Checkout failed at 2pm"
// Support searches: grep "user_id: user-123" logs.json
// Finds: All checkout attempts, cart data, payment gateway responses
http.Error(w, checkoutErr.Error(), 500)
return
}
}Example 2 - Abuse Detection:
// Use Case: Detect users generating excessive errors (bots, scrapers)
// Keywords: abuse-detection, rate-limiting, security, bot-detection
func handleAPIRequest(w http.ResponseWriter, r *http.Request) {
userID := GetUserIDFromAPIKey(r)
// Track error rate per user
err := processRequest(r)
if err != nil {
apiErr := ErrAPIRequest.New(err).
WithUserID(userID).
WithHTTPStatus(429).
WithContext(Context{
"endpoint": r.URL.Path,
"api_key_prefix": apiKey[:8],
})
// Monitor: Count errors per user_id in last 5 minutes
// If user_id "bot-456" has >1000 errors → Block API key
// If user_id "user-789" has 2 errors → Normal usage
errorRateTracker.Record(userID, apiErr)
if errorRateTracker.IsAbusing(userID) {
// Block user
return
}
http.Error(w, apiErr.Error(), 429)
return
}
}Example 3 - A/B Testing Impact Analysis:
// Use Case: New feature causes errors for some users, track impact
// Keywords: ab-testing, feature-flags, gradual-rollout, impact-analysis
func handleFeatureRequest(w http.ResponseWriter, r *http.Request) {
userID := GetUserID(r)
// Feature flag: 10% of users get new algorithm
if featureFlags.IsEnabled("new-search-v2", userID) {
err := newSearchAlgorithmV2(query)
if err != nil {
return ErrSearch.New(err).
WithUserID(userID). // Track which users hit errors
WithContext(Context{
"feature_flag": "new-search-v2",
"rollout_percentage": 10,
"algorithm_version": "v2",
"query": query,
})
// Analysis: Search for user_ids in "new-search-v2" cohort
// Result: 50 unique user_ids affected = 50 users hit bug
// Decision: Roll back feature (too many errors)
}
}
}Example 4 - GDPR Compliance Audit Trail:
// Use Case: Track data access and errors for compliance audit
// Keywords: gdpr, compliance, audit-trail, data-privacy
func exportUserData(w http.ResponseWriter, r *http.Request) {
userID := GetUserID(r)
adminID := GetAdminID(r) // Who initiated export
// User data export (GDPR Right to Data Portability)
data, err := database.ExportUserData(userID)
if err != nil {
exportErr := ErrDataExport.New(err).
WithUserID(userID). // Which user's data
WithContext(Context{
"admin_id": adminID, // Who tried to export
"export_type": "gdpr_request",
"data_size_mb": 0, // Failed, no data
"ip_address": r.RemoteAddr, // Where request came from
"timestamp": time.Now().Unix(),
})
// Audit log must track:
// - Which user's data was accessed/failed (user_id)
// - Who accessed it (admin_id)
// - When (timestamp)
// - Result (error or success)
auditLog.Record(exportErr)
http.Error(w, "Export failed", 500)
return
}
}Common Mistakes:
// ❌ DON'T: Store email addresses directly (PII risk)
err := ErrFailed.New().WithUserID("user@example.com") // PII!
// ✅ DO: Use internal user ID or hash
err := ErrFailed.New().WithUserID("user-abc-123")
// ❌ DON'T: Add user ID to system errors
err := ErrDatabaseMigration.New().WithUserID(userID) // System error, no user
// ✅ DO: Add user ID only to user-initiated operations
err := ErrUserProfile.New().WithUserID(userID)
// ❌ DON'T: Use user ID when user is not authenticated
err := ErrPublicAPI.New().WithUserID("unknown") // Meaningless
// ✅ DO: Use session ID for unauthenticated users
err := ErrPublicAPI.New().WithSessionID(sessionID)Retrieval: GetUserID(err) string
See Also:
WithSessionID()- For unauthenticated user trackingWithCorrelationID()- For tracing requests across services
Purpose: Add session ID for tracking unauthenticated users and user sessions.
Why this matters:
- Anonymous User Tracking: Track errors for users who aren't logged in
- Session Debugging: Debug issues within a specific browsing session
- Conversion Funnel Analysis: Track errors through signup/checkout flows
- Session Replay: Link errors to session replay tools (FullStory, LogRocket)
- Bot Detection: Identify bot sessions vs human sessions
- Multi-Tab Issues: Debug errors when user has multiple tabs open
Parameters:
id string- Session identifier (from cookie, JWT, or generated)
Returns: errific error with session ID attached
Chaining: Can be chained with other methods
When to use:
- ✅ Unauthenticated users (before login)
- ✅ Signup/registration flows
- ✅ Guest checkout processes
- ✅ Public-facing pages with errors
- ✅ When integrating with session replay tools
When NOT to use:
- ❌ When you already have user ID (use WithUserID instead)
- ❌ API-only services without sessions
- ❌ Background jobs
Example 1 - Guest Checkout Debugging:
// Use Case: Track checkout errors for users who aren't logged in
// Keywords: guest-checkout, unauthenticated, e-commerce, conversion-funnel
func guestCheckout(w http.ResponseWriter, r *http.Request) {
sessionID := GetSessionID(r) // From cookie
// Guest user (not logged in) tries to checkout
err := processGuestCheckout(cart, shippingInfo)
if err != nil {
checkoutErr := ErrGuestCheckout.New(err).
WithSessionID(sessionID). // Track anonymous user's session
WithContext(Context{
"cart_total": cart.Total,
"cart_items": len(cart.Items),
"shipping_country": shippingInfo.Country,
"user_type": "guest",
})
// Support can:
// 1. Search logs by session_id
// 2. See full checkout flow: cart → shipping → payment → ERROR
// 3. Identify if error affects multiple sessions (systemic issue)
http.Error(w, checkoutErr.Error(), 500)
return
}
}Example 2 - Signup Flow Analysis:
// Use Case: Track errors during user registration process
// Keywords: signup-flow, registration, user-onboarding, conversion
func handleSignup(w http.ResponseWriter, r *http.Request) {
sessionID := GetSessionID(r) // Session started when user visited landing page
// Validate email
if !isValidEmail(email) {
return ErrInvalidEmail.New().
WithSessionID(sessionID). // Track which signup session
WithHTTPStatus(400).
WithRetryable(false).
WithContext(Context{
"step": "email_validation",
"signup_source": "google_ads", // How they found us
"referrer": r.Referer(),
})
}
// Create account
err := createUserAccount(email, password)
if err != nil {
return ErrSignupFailed.New(err).
WithSessionID(sessionID). // Same session through whole flow
WithContext(Context{
"step": "account_creation",
"email_domain": getDomain(email),
})
}
// Analysis: Count errors by session_id in signup funnel
// Landing page → Email form → Password form → ERROR
// Find: 30% of sessions with .edu emails fail at password step
// Fix: Improve password requirements UI
}Example 3 - Session Replay Integration:
// Use Case: Link errors to FullStory/LogRocket session recordings
// Keywords: session-replay, fullstory, logrocket, user-experience
func handleAction(w http.ResponseWriter, r *http.Request) {
sessionID := GetSessionID(r)
replayURL := GetSessionReplayURL(sessionID) // From FullStory SDK
err := performAction(r)
if err != nil {
actionErr := ErrActionFailed.New(err).
WithSessionID(sessionID).
WithContext(Context{
"session_replay_url": replayURL, // Link to video
"user_agent": r.UserAgent(),
"viewport_width": r.Header.Get("X-Viewport-Width"),
})
// Support engineer:
// 1. Sees error in logs with session_id
// 2. Clicks session_replay_url
// 3. Watches video of user's exact actions leading to error
// 4. Sees UI state, network requests, console errors
http.Error(w, actionErr.Error(), 500)
return
}
}Example 4 - Multi-Tab Session Debugging:
// Use Case: User has multiple tabs open, causing conflicts
// Keywords: multi-tab, session-conflicts, race-conditions
func updateCart(w http.ResponseWriter, r *http.Request) {
sessionID := GetSessionID(r)
tabID := r.Header.Get("X-Tab-ID") // Track which browser tab
// User opened 2 tabs, both trying to modify cart simultaneously
err := cartService.UpdateItem(sessionID, itemID, quantity)
if err != nil {
return ErrCartConflict.New(err).
WithSessionID(sessionID). // Same session
WithContext(Context{
"tab_id": tabID, // Different tabs!
"conflict_type": "concurrent_modification",
"item_id": itemID,
})
// Debug logs show:
// session_id: sess-123, tab_id: tab-A → Updated item X to qty 2
// session_id: sess-123, tab_id: tab-B → Updated item X to qty 5 (CONFLICT!)
// Fix: Add optimistic locking or merge tab changes
}
}Common Mistakes:
// ❌ DON'T: Use session ID when you have user ID
func authenticatedAction(r *http.Request) error {
sessionID := GetSessionID(r)
userID := GetUserID(r) // User is logged in!
err := ErrAction.New().WithSessionID(sessionID) // Should use userID!
}
// ✅ DO: Use user ID for authenticated users
func authenticatedAction(r *http.Request) error {
userID := GetUserID(r)
err := ErrAction.New().WithUserID(userID)
}
// ❌ DON'T: Generate new session ID on every request
sessionID := uuid.New().String() // Can't track across requests!
// ✅ DO: Use persistent session ID from cookie/JWT
sessionID := GetSessionIDFromCookie(r)
// ✅ ALTERNATIVE: Include both if useful
err := ErrAction.New().
WithUserID(userID). // Who the user is
WithSessionID(sessionID) // Which login sessionRetrieval: GetSessionID(err) string
See Also:
WithUserID()- For authenticated user trackingWithRequestID()- For individual request tracking
Purpose: Add human-readable explanation of what went wrong and why.
Why this matters:
- LLM Understanding: AI agents read help text to understand error context without parsing code
- User Communication: Clear explanation improves user experience (avoid cryptic errors)
- Automated Recovery: LLMs use help text to decide recovery strategy
- Reduced Support Tickets: Good help text answers user questions before they contact support
- Faster Debugging: Developers understand issue immediately without checking code
- Self-Service: Users can often fix issues themselves with good help text
Parameters:
message string- Clear, user-friendly explanation of what went wrong
Returns: errific error with help message attached
Chaining: Can be chained with other methods
When to use:
- ✅ MCP tool servers (LLMs need context)
- ✅ User-facing errors (improve UX)
- ✅ Complex failure scenarios (explain non-obvious causes)
- ✅ Resource exhaustion (explain why limit was hit)
- ✅ Configuration errors (explain what's misconfigured)
When NOT to use:
- ❌ When error message is already clear
- ❌ For internal errors users won't see
- ❌ When it duplicates the error message
Example 1 - Database Connection Pool Exhausted:
// Use Case: Explain resource exhaustion to LLM/user
// Keywords: database, connection-pool, resource-exhaustion, help
err := ErrDatabaseConnection.New(dbErr).
WithHelp("The database connection pool is full because too many requests are running simultaneously. All 100 connections are in use.").
WithSuggestion("Wait a few seconds for connections to be released, or reduce the number of concurrent requests.").
WithRetryable(true).
WithRetryAfter(10 * time.Second).
WithContext(Context{
"pool_size": 100,
"active_connections": 100,
"waiting_requests": 50,
})
// LLM reads help and understands:
// - Problem: Pool is full (not a network issue, not credentials)
// - Cause: Too many concurrent requests
// - Action: Wait for connections to free up (from suggestion)Example 2 - API Rate Limit for LLM:
// Use Case: Explain rate limit to LLM tool caller
// Keywords: rate-limit, api-quota, llm-integration, mcp
err := ErrAPIRateLimit.New().
WithMCPCode(errific.MCPResourceError).
WithHelp("You've exceeded the API rate limit of 1000 requests per hour. The limit resets at the top of each hour.").
WithSuggestion("Wait 45 minutes until the rate limit resets at 3:00 PM, or upgrade to a higher tier plan for more requests.").
WithDocs("https://docs.example.com/api/rate-limits").
WithRetryable(true).
WithRetryAfter(45 * time.Minute).
WithContext(Context{
"rate_limit": 1000,
"requests_made": 1000,
"reset_time": "2024-01-15T15:00:00Z",
"current_tier": "free",
})
// LLM Decision:
// - Reads help: "Rate limit exceeded, resets at 3:00 PM"
// - Reads suggestion: "Wait 45 minutes OR upgrade plan"
// - Decides: Inform user and wait (don't spam retries)Example 3 - Configuration Error:
// Use Case: Explain misconfiguration clearly
// Keywords: configuration, setup-error, validation
err := ErrInvalidConfig.New().
WithHelp("The S3 bucket name 'my bucket' contains spaces, which is not allowed by AWS. Bucket names must only contain lowercase letters, numbers, and hyphens.").
WithSuggestion("Change the bucket name to 'my-bucket' in your configuration file (config.yaml, line 23).").
WithDocs("https://docs.aws.amazon.com/s3/bucket-naming-rules").
WithRetryable(false).
WithContext(Context{
"config_file": "config.yaml",
"config_line": 23,
"invalid_bucket_name": "my bucket",
"suggested_bucket_name": "my-bucket",
})
// Developer reads help and immediately knows:
// - What's wrong: Spaces in bucket name
// - Why it's wrong: AWS doesn't allow it
// - How to fix: Replace with hyphens (from suggestion)
// - Where to fix: config.yaml line 23 (from context)Example 4 - Authentication Failure:
// Use Case: Explain auth failure without exposing security details
// Keywords: authentication, security, user-error
err := ErrAuthFailed.New().
WithHelp("Your API key is invalid or has expired. API keys are valid for 90 days from creation.").
WithSuggestion("Generate a new API key from your account dashboard at https://example.com/dashboard/api-keys").
WithDocs("https://docs.example.com/authentication").
WithHTTPStatus(401).
WithRetryable(false).
WithContext(Context{
"auth_method": "api_key",
"key_age_days": 95, // Expired
})
// User sees helpful error without security risk:
// - Clear problem: Key expired (not "invalid credentials")
// - Clear action: Generate new key
// - No sensitive info: Doesn't reveal valid key format or DB detailsBest Practices:
// ✅ DO: Explain the problem clearly
WithHelp("The payment gateway timed out after 30 seconds waiting for a response")
// ❌ DON'T: Repeat the error message
WithHelp("Payment timeout") // Already in error message!
// ✅ DO: Provide context about WHY it failed
WithHelp("The file upload failed because the file size (50MB) exceeds the 10MB limit")
// ❌ DON'T: Be vague or generic
WithHelp("An error occurred") // Useless!
// ✅ DO: Include relevant numbers/thresholds
WithHelp("Database connection pool is exhausted (100/100 connections active)")
// ❌ DON'T: Include technical jargon for user-facing errors
WithHelp("ECONNREFUSED on socket descriptor 42") // Too technical for users
// ✅ DO: Explain in plain language
WithHelp("Unable to connect to the database server. The server may be down or unreachable.")Retrieval: GetHelp(err) string
See Also:
WithSuggestion()- Add actionable recovery stepsWithDocs()- Add link to documentationWithMCPCode()- For LLM tool integration
Purpose: Add actionable suggestion for how to fix or recover from the error.
Why this matters:
- Automated Recovery: LLMs read suggestions and take action automatically
- Reduced Downtime: Clear recovery steps enable faster resolution
- Self-Service: Users fix issues themselves without contacting support
- Developer Productivity: Immediate guidance saves debugging time
- Reduced Support Load: Good suggestions prevent support tickets
- AI Decision-Making: LLMs use suggestions to choose between retry, fix params, or abort
Parameters:
message string- Specific, actionable steps to resolve the error
Returns: errific error with suggestion attached
Chaining: Can be chained with other methods
When to use:
- ✅ When there's a clear recovery action
- ✅ For user-correctable errors (invalid input, config issues)
- ✅ MCP tool servers (LLMs need actionable guidance)
- ✅ When you want to guide users to self-resolution
- ✅ For common problems with known solutions
When NOT to use:
- ❌ When there's no recovery action (permanent failures)
- ❌ For internal errors users can't fix
- ❌ When the action isn't clear or specific
Example 1 - Invalid API Parameters (LLM Can Fix):
// Use Case: LLM provided wrong parameter type, guide it to fix
// Keywords: mcp, parameter-validation, llm-recovery, automated-fix
err := ErrInvalidParams.New().
WithMCPCode(errific.MCPInvalidParams).
WithHelp("The 'limit' parameter must be a number between 1 and 100, but you provided 'unlimited'.").
WithSuggestion("Change the 'limit' parameter to a number like 10, 50, or 100.").
WithDocs("https://docs.example.com/api/parameters#limit").
WithContext(Context{
"parameter": "limit",
"provided_value": "unlimited",
"expected_type": "integer",
"valid_range": "1-100",
})
// LLM reads suggestion and:
// 1. Understands it needs to change "unlimited" to a number
// 2. Picks a reasonable default (e.g., 50)
// 3. Retries request with {"limit": 50}
// 4. Succeeds automatically without human interventionExample 2 - Rate Limit with Specific Action:
// Use Case: Tell LLM exactly when to retry
// Keywords: rate-limit, retry-timing, specific-action
err := ErrRateLimit.New().
WithHelp("You've made 1000 API requests in the last hour, exceeding your limit.").
WithSuggestion("Wait 15 minutes until 3:00 PM when your rate limit resets, then retry this request.").
WithRetryable(true).
WithRetryAfter(15 * time.Minute).
WithContext(Context{
"requests_made": 1000,
"rate_limit": 1000,
"reset_time": time.Now().Add(15 * time.Minute).Format(time.RFC3339),
})
// LLM reads suggestion and:
// 1. Knows to wait (not retry immediately)
// 2. Knows exact time to retry (3:00 PM)
// 3. Can inform user: "I'll retry in 15 minutes when limit resets"Example 3 - Missing Required Field:
// Use Case: Guide user to provide missing data
// Keywords: validation, missing-field, user-input
err := ErrMissingField.New().
WithHelp("The 'email' field is required but was not provided.").
WithSuggestion("Please provide your email address in the 'email' field.").
WithHTTPStatus(400).
WithRetryable(false). // Can't retry without fixing input
WithContext(Context{
"missing_field": "email",
"required_fields": []string{"email", "password", "name"},
"provided_fields": []string{"password", "name"},
})
// User sees:
// - Help: "email field is required"
// - Suggestion: "provide your email address"
// - Context: Shows they provided password and name but not email
// → User adds email and retries successfullyExample 4 - File Too Large:
// Use Case: Guide user to compress or split file
// Keywords: file-upload, size-limit, compression
err := ErrFileTooLarge.New().
WithHelp("The file you're uploading is 50MB, which exceeds the 10MB limit.").
WithSuggestion("Compress the file to reduce its size below 10MB, or split it into smaller files.").
WithHTTPStatus(413).
WithRetryable(false).
WithContext(Context{
"file_size_mb": 50,
"max_size_mb": 10,
"file_name": "presentation.pptx",
"suggested_action": "compress",
})
// User reads suggestion and has options:
// 1. Compress the PowerPoint file
// 2. Split into multiple files
// 3. Knows exact limit (10MB) for referenceExample 5 - Configuration Fix Location:
// Use Case: Tell developer exactly where and how to fix config
// Keywords: configuration, developer-guidance, specific-fix
err := ErrInvalidConfig.New().
WithHelp("The database connection string is missing the port number.").
WithSuggestion("Add the port number to the connection string in config.yaml line 15. Example: 'postgres://localhost:5432/mydb'").
WithDocs("https://docs.example.com/configuration/database").
WithContext(Context{
"config_file": "config.yaml",
"config_line": 15,
"current_value": "postgres://localhost/mydb",
"expected_format": "postgres://localhost:5432/mydb",
})
// Developer knows:
// - What to fix: Add port number
// - Where to fix: config.yaml line 15
// - How to fix: Example provided ("localhost:5432")
// - Can copy/paste the example formatBest Practices:
// ✅ DO: Be specific and actionable
WithSuggestion("Increase the 'max_connections' setting to 200 in postgresql.conf")
// ❌ DON'T: Be vague
WithSuggestion("Fix the database configuration") // HOW?
// ✅ DO: Provide multiple options when applicable
WithSuggestion("Either compress the file below 10MB, split it into smaller files, or upgrade to Pro plan for 100MB uploads")
// ❌ DON'T: Suggest impossible actions
WithSuggestion("Contact the administrator") // User may not have admin access
// ✅ DO: Include examples when helpful
WithSuggestion("Use ISO 8601 format for dates. Example: '2024-01-15T10:30:00Z'")
// ❌ DON'T: Repeat the help message
WithHelp("Rate limit exceeded")
WithSuggestion("You exceeded the rate limit") // Duplicate!
// ✅ DO: Tell WHEN to retry if relevant
WithSuggestion("Retry in 5 seconds after connections are released")
// ✅ DO: Provide links to self-service actions
WithSuggestion("Generate a new API key at https://example.com/dashboard/api-keys")Retrieval: GetSuggestion(err) string
See Also:
WithHelp()- Explain what went wrongWithDocs()- Link to documentationWithRetryable()- Indicate if retry is appropriate
Purpose: Add link to documentation for detailed information about the error.
Why this matters:
- Self-Service Support: Users can read docs instead of contacting support
- LLM Context Expansion: AI agents can fetch and read docs to understand complex issues
- Onboarding: New developers learn about features through error-linked docs
- Comprehensive Guidance: Docs provide more detail than error messages can include
- Updated Information: Docs can be updated without code changes
- SEO and Discoverability: Error docs help users find solutions via search engines
Parameters:
url string- URL to documentation page related to this error
Returns: errific error with documentation URL attached
Chaining: Can be chained with other methods
When to use:
- ✅ Complex features that need detailed explanation
- ✅ MCP tool servers (LLMs can fetch docs)
- ✅ API errors (link to API reference)
- ✅ Configuration errors (link to config guide)
- ✅ Rate limits and quotas (link to pricing/limits page)
When NOT to use:
- ❌ When no relevant documentation exists
- ❌ For internal errors (docs won't help)
- ❌ When linking to generic homepage (be specific)
Example 1 - API Authentication Documentation:
// Use Case: Link to auth docs when API key is invalid
// Keywords: authentication, api-docs, self-service
err := ErrInvalidAPIKey.New().
WithHelp("Your API key is invalid or has expired.").
WithSuggestion("Generate a new API key from your dashboard at https://example.com/dashboard").
WithDocs("https://docs.example.com/authentication/api-keys"). // Detailed auth guide
WithHTTPStatus(401).
WithRetryable(false).
WithContext(Context{
"auth_method": "api_key",
"key_format": "sk_live_...",
})
// User clicks docs link and finds:
// - How API keys work
// - How to generate new keys
// - Key rotation best practices
// - Security recommendations
// - Troubleshooting common issuesExample 2 - MCP Tool Error with Docs:
// Use Case: LLM can fetch docs to understand tool usage
// Keywords: mcp, llm-integration, tool-documentation
err := ErrToolNotFound.New().
WithMCPCode(errific.MCPMethodNotFound).
WithHelp("The tool 'search_web' is not available in this server.").
WithSuggestion("Use the 'list_tools' method to see all available tools.").
WithDocs("https://docs.example.com/mcp/tools"). // List of all tools
WithContext(Context{
"requested_tool": "search_web",
"available_tools": []string{"search_db", "send_email", "create_calendar_event"},
})
// LLM can:
// 1. Read the help message (tool not found)
// 2. Fetch docs URL to see all available tools
// 3. Find "search_db" tool as alternative
// 4. Use search_db instead of search_web
// 5. Succeed automaticallyExample 3 - Rate Limit Documentation:
// Use Case: Link to rate limits and pricing page
// Keywords: rate-limit, pricing, quota, upgrade
err := ErrRateLimit.New().
WithHelp("You've exceeded the free tier limit of 1000 requests per day.").
WithSuggestion("Upgrade to Pro plan for 100,000 requests per day at https://example.com/pricing").
WithDocs("https://docs.example.com/api/rate-limits"). // Detailed rate limit guide
WithHTTPStatus(429).
WithRetryable(true).
WithRetryAfter(24 * time.Hour).
WithContext(Context{
"current_tier": "free",
"daily_limit": 1000,
"requests_today": 1000,
"upgrade_url": "https://example.com/pricing",
})
// Docs page explains:
// - Rate limits for each tier (Free, Pro, Enterprise)
// - How limits are calculated (per day, per hour, per endpoint)
// - What happens when you exceed limits
// - How to upgrade to higher tier
// - Best practices for staying within limitsExample 4 - Configuration Error with Schema Docs:
// Use Case: Link to configuration schema documentation
// Keywords: configuration, schema, yaml, validation
err := ErrInvalidConfig.New().
WithHelp("The configuration file has invalid YAML syntax at line 23.").
WithSuggestion("Check the YAML syntax and ensure all quotes are closed.").
WithDocs("https://docs.example.com/configuration/schema"). // Config schema reference
WithContext(Context{
"config_file": "config.yaml",
"error_line": 23,
"error_column": 15,
"syntax_error": "unclosed string",
})
// Docs page provides:
// - Complete configuration schema
// - Example config files
// - Description of each field
// - Validation rules
// - Common configuration mistakesExample 5 - Feature-Specific Error:
// Use Case: Link to feature documentation for complex feature
// Keywords: feature-docs, onboarding, learning
err := ErrWebhookValidation.New().
WithHelp("The webhook signature validation failed. The signature in the X-Webhook-Signature header doesn't match the computed signature.").
WithSuggestion("Ensure you're using the correct webhook secret from your dashboard and following the signature algorithm described in the docs.").
WithDocs("https://docs.example.com/webhooks/signature-validation"). // Detailed webhook guide
WithContext(Context{
"webhook_id": "wh_123",
"signature_header": "X-Webhook-Signature",
"algorithm": "HMAC-SHA256",
})
// Docs explain:
// - How webhook signatures work
// - Step-by-step signature computation
// - Code examples in multiple languages
// - Common signature validation mistakes
// - How to test webhooks locallyBest Practices:
// ✅ DO: Link to specific, relevant docs
WithDocs("https://docs.example.com/api/authentication/api-keys#rotation")
// ❌ DON'T: Link to generic homepage
WithDocs("https://example.com") // Not helpful!
// ✅ DO: Use deep links to exact section
WithDocs("https://docs.example.com/errors#rate-limit-exceeded")
// ❌ DON'T: Link to 404 pages
WithDocs("https://docs.example.com/old-page") // Check links!
// ✅ DO: Include anchor links for long pages
WithDocs("https://docs.example.com/configuration#database-connection-pool")
// ✅ DO: Version docs links if API is versioned
WithDocs("https://docs.example.com/v2/api/errors") // Not v1
// ✅ DO: Use stable URLs that won't break
WithDocs("https://docs.example.com/permanent/api-keys") // Permanent path
// ❌ DON'T: Link to docs behind authentication
WithDocs("https://internal.example.com/docs") // Users can't access!Retrieval: GetDocs(err) string
See Also:
WithHelp()- Explain the problemWithSuggestion()- Provide actionable stepsWithMCPCode()- For LLM tool integration
Purpose: Add semantic tags for error categorization and RAG system retrieval.
Why this matters:
- RAG Optimization: Tags improve error searchability in RAG/AI systems
- Log Filtering: Filter logs by tags (e.g., "show all 'payment' errors")
- Metric Aggregation: Group errors by tags for dashboards (count by "database" tag)
- Alert Routing: Route specific tagged errors to specialized teams
- Semantic Search: Find related errors using semantic similarity of tags
- Multi-Dimensional Categorization: Tags provide flexible categorization beyond single category field
Parameters:
tags ...string- Variable number of semantic tags (e.g., "payment", "critical", "external-api")
Returns: errific error with tags attached
Chaining: Can be chained with other methods
When to use:
- ✅ For multi-dimensional error classification
- ✅ When building RAG/AI systems that search errors
- ✅ For flexible log filtering and aggregation
- ✅ When routing errors to different teams/systems
- ✅ For semantic search across errors
When NOT to use:
- ❌ For single-dimension classification (use WithCategory instead)
- ❌ For structured data (use WithContext instead)
- ❌ When you need exact key-value pairs (use WithLabels instead)
Example 1 - Multi-Dimensional Classification:
// Use Case: Tag error with multiple relevant dimensions
// Keywords: tagging, classification, filtering, rag
err := ErrPaymentFailed.New(gatewayErr).
WithTags("payment", "external-api", "critical", "retryable", "user-facing").
WithCategory(CategoryServer). // Primary category
WithHTTPStatus(503).
WithRetryable(true).
WithContext(Context{
"gateway": "stripe",
"amount": 99.99,
})
// Now searchable by:
// - "payment" tag → Finds all payment errors
// - "external-api" tag → Finds all third-party API errors
// - "critical" tag → Finds high-priority errors
// - "retryable" tag → Finds errors that can be retried
// - Multiple tags → "payment AND critical" → Payment errors that are criticalExample 2 - Team Routing:
// Use Case: Route errors to appropriate teams based on tags
// Keywords: alert-routing, team-ownership, on-call
err := ErrDatabaseQuery.New(sqlErr).
WithTags("database", "postgresql", "performance", "backend-team").
WithContext(Context{
"query": sql,
"duration_ms": 5000, // Slow query
"table": "orders",
})
// Alert router reads tags:
if hasTag(err, "database") && hasTag(err, "backend-team") {
alertSystem.NotifyTeam("backend-oncall", err)
}
// Alternative: Frontend error
err2 := ErrUIRender.New().
WithTags("frontend", "react", "rendering", "frontend-team").
WithContext(Context{"component": "CheckoutForm"})
if hasTag(err2, "frontend-team") {
alertSystem.NotifyTeam("frontend-oncall", err2)
}Example 3 - RAG System Semantic Search:
// Use Case: Enable semantic search across errors for AI agents
// Keywords: rag, semantic-search, ai-retrieval, embeddings
// Error 1: Payment gateway timeout
err1 := ErrPaymentTimeout.New().
WithTags("payment", "timeout", "stripe", "network", "checkout").
WithHelp("Payment gateway did not respond within 30 seconds")
// Error 2: Database connection timeout
err2 := ErrDatabaseTimeout.New().
WithTags("database", "timeout", "postgres", "network", "connection-pool").
WithHelp("Database connection timed out after 5 seconds")
// RAG System Query: "Find all timeout errors"
// Returns: Both err1 and err2 (both have "timeout" tag)
// RAG System Query: "Find payment-related errors"
// Returns: err1 only (has "payment" tag)
// RAG System Query: "Find network issues"
// Returns: Both err1 and err2 (both have "network" tag)Example 4 - Dashboard Metrics:
// Use Case: Aggregate error counts by tags for monitoring dashboard
// Keywords: metrics, monitoring, dashboard, aggregation
// Various errors with tags
err1 := ErrAPICall.New().WithTags("api", "external", "retryable")
err2 := ErrValidation.New().WithTags("validation", "user-input", "non-retryable")
err3 := ErrDatabase.New().WithTags("database", "internal", "retryable")
// Dashboard queries:
// COUNT(errors WHERE tag = "retryable") = 2
// COUNT(errors WHERE tag = "external") = 1
// COUNT(errors WHERE tag = "user-input") = 1
// Advanced dashboard:
// - "Retryable errors by hour" (filter by "retryable" tag)
// - "External API errors" (filter by "external" tag)
// - "User-facing errors" (filter by "user-input" tag)Example 5 - MCP Tool Server Tagging:
// Use Case: Tag MCP tool errors for LLM categorization
// Keywords: mcp, llm, tool-server, semantic-tagging
err := ErrToolExecution.New(execErr).
WithMCPCode(errific.MCPToolError).
WithTags("mcp", "tool-execution", "database", "search", "transient").
WithHelp("Database search tool failed due to connection timeout").
WithSuggestion("Retry the search in a few seconds").
WithRetryable(true).
WithContext(Context{
"tool_name": "search_database",
"search_query": "find users",
})
// LLM reads tags and understands:
// - "mcp" → This is an MCP tool error
// - "tool-execution" → Failed during execution (not parameter validation)
// - "database" → Related to database operations
// - "search" → Specifically a search operation
// - "transient" → Temporary issue (can retry)Best Practices:
// ✅ DO: Use lowercase, hyphenated tags
WithTags("user-input", "rate-limit", "external-api")
// ❌ DON'T: Use mixed case or spaces
WithTags("User Input", "RateLimit") // Inconsistent!
// ✅ DO: Use multiple specific tags
WithTags("payment", "stripe", "timeout", "checkout")
// ❌ DON'T: Use single vague tag
WithTags("error") // Useless!
// ✅ DO: Include team/ownership tags
WithTags("database", "backend-team", "postgres")
// ✅ DO: Include severity/priority tags when relevant
WithTags("critical", "high-priority", "user-facing")
// ❌ DON'T: Duplicate information from other fields
WithTags("retryable") // Already have WithRetryable(true)!
// Better: Use tags for additional dimensions
// ✅ DO: Use consistent tag vocabulary across codebase
// Define standard tags: "payment", "database", "network", "validation"Retrieval: GetTags(err) []string
See Also:
WithLabel()/WithLabels()- For key-value pairsWithCategory()- For primary error classificationWithContext()- For structured metadata
Purpose: Add single key-value label for structured error classification.
Why this matters:
- Structured Queries: Query errors by exact key-value pairs (e.g., "severity=high")
- Prometheus/OpenTelemetry: Labels map directly to metric labels
- Datadog/APM Integration: Labels become tags in monitoring systems
- Faceted Search: Filter errors by multiple label dimensions
- Cardinality Control: Labels are better than tags for high-cardinality data
- Type Safety: Key-value structure enforces consistent labeling
Parameters:
key string- Label key (e.g., "severity", "team", "region")value string- Label value (e.g., "high", "backend", "us-east-1")
Returns: errific error with label attached
Chaining: Can be chained with other methods (call multiple times for multiple labels)
When to use:
- ✅ For key-value metadata (severity, region, team, version)
- ✅ When integrating with Prometheus, OpenTelemetry, Datadog
- ✅ For structured filtering and aggregation
- ✅ When you need exact key-value matching
When NOT to use:
- ❌ For freeform tags (use WithTags instead)
- ❌ For complex structured data (use WithContext instead)
- ❌ For temporary debugging info (use WithContext instead)
Example 1 - Error Severity Labeling:
// Use Case: Label errors by severity for alerting
// Keywords: severity, priority, alerting, filtering
// Critical error
err1 := ErrPaymentGatewayDown.New().
WithLabel("severity", "critical").
WithLabel("team", "payments").
WithLabel("region", "us-east-1").
WithRetryable(false)
// Warning error
err2 := ErrSlowQuery.New().
WithLabel("severity", "warning").
WithLabel("team", "backend").
WithLabel("query_type", "analytics")
// Alert system:
if GetLabelValue(err, "severity") == "critical" {
pagerduty.Alert(GetLabelValue(err, "team"))
}Example 2 - Prometheus Metrics Integration:
// Use Case: Export error metrics to Prometheus with labels
// Keywords: prometheus, metrics, observability, monitoring
err := ErrAPICall.New(httpErr).
WithLabel("endpoint", "/api/users").
WithLabel("method", "GET").
WithLabel("status", "500").
WithLabel("region", "us-west-2").
WithContext(Context{
"duration_ms": 1500,
"user_id": "user-123",
})
// Prometheus metric:
// api_errors_total{endpoint="/api/users",method="GET",status="500",region="us-west-2"} 1
// Prometheus query examples:
// rate(api_errors_total{status="500"}[5m]) → 500 errors per second
// sum by (endpoint) (api_errors_total) → Errors grouped by endpoint
// api_errors_total{region="us-west-2"} → Errors in specific regionExample 3 - Multi-Tenant Error Tracking:
// Use Case: Track errors per tenant/customer
// Keywords: multi-tenant, saas, customer-tracking
err := ErrQuotaExceeded.New().
WithLabel("tenant_id", "tenant-abc-123").
WithLabel("plan", "free").
WithLabel("resource", "api_requests").
WithContext(Context{
"quota_limit": 1000,
"current_usage": 1000,
})
// Support query: Find all errors for tenant "tenant-abc-123"
// Billing query: Count errors by "plan" label
// Resource query: Find quota errors for "api_requests" resourceExample 4 - Deployment Version Tracking:
// Use Case: Track errors by deployment version for rollback decisions
// Keywords: deployment, version-tracking, rollback, canary
err := ErrFeatureExecution.New(featureErr).
WithLabel("version", "v2.5.0").
WithLabel("deployment", "canary").
WithLabel("feature_flag", "new-checkout-flow").
WithContext(Context{
"deployed_at": deployTime,
"commit_sha": "abc123",
})
// Deployment dashboard:
// - Count errors in v2.5.0 vs v2.4.0
// - Compare canary deployment vs production
// - Decide: Too many errors in v2.5.0 → Rollback!Common Mistakes:
// ❌ DON'T: Use labels for high-cardinality data
err := ErrAPI.New().WithLabel("user_id", "user-123-456-789") // Millions of users!
// ✅ DO: Use labels for low-cardinality dimensions
err := ErrAPI.New().WithLabel("region", "us-east-1") // Only ~20 regions
// ❌ DON'T: Duplicate context data in labels
err := ErrDB.New().
WithContext(Context{"query": sql}).
WithLabel("query", sql) // Duplicate!
// ✅ DO: Use labels for classification, context for details
err := ErrDB.New().
WithContext(Context{"query": sql}). // Detailed query
WithLabel("query_type", "analytics") // Classification
// ❌ DON'T: Use inconsistent label names
err1 := ErrAPI.New().WithLabel("severity", "high")
err2 := ErrAPI.New().WithLabel("priority", "high") // Different key!
// ✅ DO: Use consistent label keys across codebase
err1 := ErrAPI.New().WithLabel("severity", "high")
err2 := ErrDB.New().WithLabel("severity", "critical") // Same keyRetrieval: GetLabelValue(err, key string) string, GetLabels(err) map[string]string
See Also:
WithLabels()- Add multiple labels at onceWithTags()- For freeform semantic tagsWithContext()- For detailed structured data
Purpose: Add multiple key-value labels at once (batch version of WithLabel).
Why this matters:
- Convenience: Set multiple labels in one call
- Consistency: Ensures all related labels are set together
- Prometheus/APM: Matches metric label pattern (map of key-values)
- Template Reuse: Define label sets once and reuse across errors
- Structured Queries: Enable complex multi-dimensional filtering
Parameters:
labels map[string]string- Map of label key-value pairs
Returns: errific error with all labels attached
Chaining: Can be chained with other methods
Example 1 - Standard Label Template:
// Use Case: Reuse common label sets across application
// Keywords: template, consistency, reusability
// Define standard label templates
var standardLabels = map[string]string{
"service": "api-gateway",
"environment": "production",
"region": "us-east-1",
"version": "v2.1.0",
}
// Apply to all errors
err := ErrAPICall.New(httpErr).
WithLabels(standardLabels). // Batch apply all labels
WithLabel("endpoint", "/users"). // Add specific label
WithHTTPStatus(500)
// All errors now have consistent base labelsExample 2 - OpenTelemetry Span Labels:
// Use Case: Match OpenTelemetry span attributes in errors
// Keywords: opentelemetry, tracing, observability
func handleRequest(ctx context.Context) error {
span := trace.SpanFromContext(ctx)
// Extract span attributes as labels
spanLabels := map[string]string{
"trace_id": span.SpanContext().TraceID().String(),
"span_id": span.SpanContext().SpanID().String(),
"service_name": "user-service",
"operation": "get_user",
}
err := database.GetUser(userID)
if err != nil {
return ErrDatabaseQuery.New(err).
WithLabels(spanLabels). // Link error to trace
WithContext(Context{
"user_id": userID,
"query": sql,
})
}
}Example 3 - Multi-Dimensional Monitoring:
// Use Case: Complex filtering across multiple dimensions
// Keywords: monitoring, filtering, dimensions, metrics
err := ErrCheckout.New(checkoutErr).
WithLabels(map[string]string{
"severity": "high",
"team": "payments",
"component": "checkout",
"customer_tier": "enterprise",
"payment_method": "credit_card",
"region": "eu-west-1",
}).
WithContext(Context{
"order_id": orderID,
"amount": 9999.99,
})
// Query examples:
// - severity=high AND team=payments
// - component=checkout AND region=eu-west-1
// - customer_tier=enterprise (prioritize enterprise customer issues)Retrieval: GetLabels(err) map[string]string
See Also:
WithLabel()- Add single labelWithTags()- For freeform tagsWithContext()- For detailed structured data
Purpose: Add explicit timestamp for when the error occurred.
Why this matters:
- Async Processing: Track when error occurred vs when it was logged (batch jobs, queues)
- Time-Series Analysis: Accurate error timing for performance analysis
- Distributed Systems: Consistent timestamps across services with clock skew
- Replay Scenarios: Preserve original error time when replaying events
- Audit Trails: Exact error occurrence time for compliance
- Latency Tracking: Measure time between error occurrence and detection
Parameters:
t time.Time- Timestamp when error occurred
Returns: errific error with timestamp attached
Chaining: Can be chained with other methods
When to use:
- ✅ Batch/async processing (error time ≠ log time)
- ✅ Queue processing with delays
- ✅ Event replay systems
- ✅ Distributed systems with clock skew
- ✅ When you need precise error timing
When NOT to use:
- ❌ Synchronous request-response (error timestamp = now)
- ❌ When clock sync is not important
Example 1 - Batch Processing:
// Use Case: Process batch of events, preserve original error times
// Keywords: batch-processing, async, queue, timing
func processBatch(events []Event) {
for _, event := range events {
err := processEvent(event)
if err != nil {
// Error occurred now, but event is from 10 minutes ago
batchErr := ErrEventProcessing.New(err).
WithTimestamp(event.CreatedAt). // Original event time
WithContext(Context{
"event_id": event.ID,
"batch_id": batchID,
"processing_delay_seconds": time.Since(event.CreatedAt).Seconds(),
"processed_at": time.Now(),
})
// Log shows:
// - Event created: 2:00 PM (from timestamp)
// - Processing failed: 2:10 PM (from processed_at context)
// - Delay: 10 minutes
}
}
}Example 2 - Message Queue Processing:
// Use Case: SQS/RabbitMQ message processing with delays
// Keywords: message-queue, sqs, rabbitmq, delay
func processMessage(msg *sqs.Message) error {
// Message was queued 5 minutes ago
enqueuedAt := time.Unix(msg.Attributes["SentTimestamp"], 0)
err := processMessageContent(msg.Body)
if err != nil {
return ErrMessageProcessing.New(err).
WithTimestamp(enqueuedAt). // When message was created
WithContext(Context{
"message_id": msg.MessageId,
"queue_time_seconds": time.Since(enqueuedAt).Seconds(),
"attempt_count": msg.Attributes["ApproximateReceiveCount"],
})
}
}Example 3 - Distributed System Clock Skew:
// Use Case: Consistent timestamps across services with clock differences
// Keywords: distributed-systems, clock-skew, ntp, timing
// Service A (clock is 2 minutes fast)
errTime := time.Now() // 3:02 PM (local clock)
err := ErrServiceA.New().
WithTimestamp(errTime).
WithCorrelationID(traceID)
// Service B (clock is accurate)
// Receives error from Service A
// Can see exact time error occurred on Service A (3:02 PM)
// Even though Service B's clock shows 3:00 PM
// Analysis: Can properly order events across services despite clock skewExample 4 - Event Replay:
// Use Case: Replay historical events while preserving original timestamps
// Keywords: event-sourcing, replay, time-travel, debugging
func replayEvents(events []HistoricalEvent) {
for _, event := range events {
// Replaying event from last week
err := reprocessEvent(event)
if err != nil {
replayErr := ErrEventReplay.New(err).
WithTimestamp(event.OriginalTimestamp). // Last week
WithContext(Context{
"event_id": event.ID,
"original_time": event.OriginalTimestamp,
"replay_time": time.Now(),
"time_difference_days": time.Since(event.OriginalTimestamp).Hours() / 24,
})
// Logs show both:
// - When error originally happened (last week)
// - When replay happened (today)
}
}
}Retrieval: GetTimestamp(err) time.Time
See Also:
WithDuration()- Track operation durationWithContext()- For multiple temporal fields
Purpose: Track how long an operation ran before failing.
Why this matters:
- Performance Analysis: Identify slow operations that fail
- Timeout Debugging: See if errors are timeout-related
- SLA Monitoring: Track operations exceeding SLA thresholds
- Optimization Targets: Find slow operations that need optimization
- Latency Distribution: Understand error latency patterns
- Alerting: Alert when operations consistently take too long before failing
Parameters:
d time.Duration- How long the operation ran before failing
Returns: errific error with duration attached
Chaining: Can be chained with other methods
When to use:
- ✅ Database queries (track slow queries)
- ✅ API calls (measure latency)
- ✅ File operations (track I/O time)
- ✅ Any operation with time limits/SLAs
- ✅ When debugging timeouts
When NOT to use:
- ❌ Instant validation errors (duration is meaningless)
- ❌ When operation time is irrelevant
Example 1 - Slow Database Query:
// Use Case: Track database query duration to find slow queries
// Keywords: database, performance, slow-query, optimization
func queryUsers(ctx context.Context, query string) error {
start := time.Now()
rows, err := db.QueryContext(ctx, query)
duration := time.Since(start)
if err != nil {
return ErrDatabaseQuery.New(err).
WithDuration(duration). // Track how long it took to fail
WithContext(Context{
"query": query,
"duration_ms": duration.Milliseconds(),
"threshold_ms": 1000, // Expected max 1 second
})
}
// Analysis: If duration > 1s, query is slow and needs optimization
// Even if query succeeds, log slow queries for monitoring
if duration > time.Second {
log.Warn("Slow query detected", "duration", duration, "query", query)
}
return nil
}Example 2 - API Timeout Analysis:
// Use Case: Determine if errors are timeout-related
// Keywords: api, timeout, latency, performance
func callExternalAPI(ctx context.Context, endpoint string) error {
client := &http.Client{Timeout: 30 * time.Second}
start := time.Now()
resp, err := client.Get(endpoint)
duration := time.Since(start)
if err != nil {
isTimeout := errors.Is(err, context.DeadlineExceeded)
return ErrAPICall.New(err).
WithDuration(duration). // 30+ seconds if timeout
WithRetryable(isTimeout).
WithContext(Context{
"endpoint": endpoint,
"duration_ms": duration.Milliseconds(),
"timeout_ms": 30000,
"is_timeout": isTimeout,
"duration_vs_timeout": duration.Seconds() / 30.0, // 100% = hit timeout
})
}
// Success but slow (warning)
if duration > 10*time.Second {
log.Warn("Slow API call", "duration", duration, "endpoint", endpoint)
}
return nil
}Example 3 - SLA Monitoring:
// Use Case: Track SLA violations for errors
// Keywords: sla, monitoring, performance, alerting
func processOrder(ctx context.Context, order Order) error {
start := time.Now()
slaThreshold := 5 * time.Second // Orders must complete in 5s
err := orderService.Process(ctx, order)
duration := time.Since(start)
if err != nil {
slaViolation := duration > slaThreshold
orderErr := ErrOrderProcessing.New(err).
WithDuration(duration).
WithContext(Context{
"order_id": order.ID,
"duration_ms": duration.Milliseconds(),
"sla_threshold_ms": slaThreshold.Milliseconds(),
"sla_violation": slaViolation,
"sla_percentage": (duration.Seconds() / slaThreshold.Seconds()) * 100,
})
// Alert on SLA violations
if slaViolation {
alerting.NotifySLA("Order processing exceeded 5s SLA", orderErr)
}
return orderErr
}
return nil
}Example 4 - Performance Comparison:
// Use Case: Compare operation duration across different implementations
// Keywords: performance, comparison, ab-testing, optimization
func searchUsers(query string, useNewAlgorithm bool) error {
start := time.Now()
var err error
if useNewAlgorithm {
err = searchUsersV2(query) // New optimized algorithm
} else {
err = searchUsersV1(query) // Old algorithm
}
duration := time.Since(start)
if err != nil {
return ErrSearch.New(err).
WithDuration(duration).
WithContext(Context{
"query": query,
"algorithm": map[bool]string{true: "v2", false: "v1"}[useNewAlgorithm],
"duration_ms": duration.Milliseconds(),
})
}
// Metrics: Compare v1 vs v2 duration
// Result: v2 is 3x faster (500ms vs 1500ms average)
metrics.RecordDuration("search_duration", duration, map[string]string{
"algorithm": map[bool]string{true: "v2", false: "v1"}[useNewAlgorithm],
})
return nil
}Best Practices:
// ✅ DO: Include duration in context as milliseconds for easy querying
err := ErrSlow.New().
WithDuration(duration).
WithContext(Context{
"duration_ms": duration.Milliseconds(), // Easy to query/graph
})
// ✅ DO: Compare duration to thresholds
WithContext(Context{
"duration_ms": 1500,
"threshold_ms": 1000,
"exceeded_by_ms": 500,
"exceeded_by_percent": 50, // 50% over threshold
})
// ✅ DO: Track both success and error durations for comparison
// Success: 200ms average
// Error: 5000ms average → Errors take 25x longer! (timeout?)
// ❌ DON'T: Use duration for instant errors
err := ErrInvalidEmail.New().
WithDuration(50 * time.Nanosecond) // Meaningless!
// ✅ DO: Use duration only for operations that take measurable time
err := ErrDatabaseQuery.New().
WithDuration(1500 * time.Millisecond) // Meaningful!Retrieval: GetDuration(err) time.Duration
See Also:
WithTimestamp()- Track when error occurredWithRetryAfter()- Specify retry delayWithContext()- For multiple performance metrics
Purpose: Extract structured context from any error.
Parameters: err error - Any error (errific or stdlib)
Returns: Context map or nil if no context
Example:
ctx := GetContext(err)
if ctx != nil {
log.Printf("Query: %s, Duration: %d ms",
ctx["query"], ctx["duration_ms"])
}Purpose: Extract error code from any error.
Parameters: err error - Any error
Returns: Error code string or "" if not set
Example:
if GetCode(err) == "DB_CONN_POOL_EXHAUSTED" {
// Scale up database connections
}Purpose: Extract error category for routing decisions.
Parameters: err error - Any error
Returns: Category or "" if not categorized
Example:
switch GetCategory(err) {
case CategoryNetwork:
return http.StatusServiceUnavailable
case CategoryValidation:
return http.StatusBadRequest
default:
return http.StatusInternalServerError
}Purpose: Check if error should be retried.
Parameters: err error - Any error
Returns: true if retryable, false otherwise
Example:
for attempt := 0; attempt < 3; attempt++ {
err := doWork()
if err == nil || !IsRetryable(err) {
return err
}
time.Sleep(GetRetryAfter(err))
}Purpose: Get suggested retry delay.
Parameters: err error - Any error
Returns: Duration to wait, or 0 if not set
Example:
if IsRetryable(err) {
delay := GetRetryAfter(err)
if delay == 0 {
delay = time.Second // default
}
time.Sleep(delay)
}Purpose: Get maximum retry count.
Parameters: err error - Any error
Returns: Max retries or 0 if not set
Purpose: Get HTTP status code for error.
Parameters: err error - Any error
Returns: HTTP status or 0 if not set
Purpose: Serialize errors to JSON for logging, APIs, and monitoring.
Method: Implement json.Marshaler interface
Output Format:
{
"error": "error message",
"code": "ERR_001",
"category": "server",
"caller": "file.go:42.FunctionName",
"context": {"key": "value"},
"retryable": true,
"retry_after": "5s",
"max_retries": 3,
"http_status": 503,
"stack": ["frame1", "frame2"],
"wrapped": ["wrapped error 1", "wrapped error 2"]
}Example:
err := ErrDB.New().
WithCode("DB_001").
WithContext(Context{"query": sql})
jsonBytes, _ := json.Marshal(err)
logger.Error(string(jsonBytes))Purpose: Set global error formatting options.
Thread-Safety: Safe for concurrent calls (mutex protected).
Options:
Suffix(default) - Caller info at end:error [file.go:42.Func]Prefix- Caller info at start:[file.go:42.Func] errorDisabled- No caller info:errorNewline(default) - Stack on newlinesInline- Stack inline with ↩ separatorWithStack- Include full stack traceTrimPrefixes(prefixes...)- Remove path prefixesTrimCWD- Trim current working directory
Example:
Configure(Suffix, Newline) // default
Configure(Prefix, WithStack)
Configure(TrimCWD)1. Check IsRetryable(err)
└─ false → Don't retry, handle or return
└─ true → Continue to step 2
2. Check GetCategory(err)
├─ CategoryValidation → Don't retry (input error)
├─ CategoryUnauthorized → Don't retry (auth error)
├─ CategoryNetwork → Retry immediately
├─ CategoryTimeout → Retry with increased timeout
└─ CategoryServer → Retry with exponential backoff
3. Get retry parameters
├─ delay := GetRetryAfter(err)
├─ max := GetMaxRetries(err)
└─ Implement retry loop with these values
4. Check context for more details
└─ ctx := GetContext(err)
└─ Make decisions based on context values
1. Check GetHTTPStatus(err)
└─ status != 0 → Return that status
└─ status == 0 → Continue to step 2
2. Check GetCategory(err)
├─ CategoryClient → 400 Bad Request
├─ CategoryValidation → 400 Bad Request
├─ CategoryUnauthorized → 401/403
├─ CategoryNotFound → 404 Not Found
├─ CategoryTimeout → 408/504
├─ CategoryNetwork → 503 Service Unavailable
└─ CategoryServer → 500 Internal Server Error
3. Serialize to JSON for response body
└─ json.Marshal(err) → Include in response
1. Get severity from category
├─ CategoryValidation → INFO/WARN
├─ CategoryClient → WARN
├─ CategoryTimeout → WARN
├─ CategoryNetwork → ERROR
└─ CategoryServer → ERROR/CRITICAL
2. Serialize to JSON
└─ json.Marshal(err) → Structured log entry
3. Extract context for additional fields
└─ GetContext(err) → Add to log metadata
4. Use code for grouping/alerts
└─ GetCode(err) → Alert routing key
var ErrDatabaseQuery Err = "database query failed"
func QueryUsers(db *sql.DB) error {
start := time.Now()
rows, err := db.Query("SELECT * FROM users")
if err != nil {
return ErrDatabaseQuery.New(err).
WithCode("DB_QUERY_001").
WithCategory(CategoryServer).
WithContext(Context{
"query": "SELECT * FROM users",
"duration_ms": time.Since(start).Milliseconds(),
"table": "users",
}).
WithRetryable(true).
WithRetryAfter(5 * time.Second).
WithMaxRetries(3)
}
defer rows.Close()
// ...
}var ErrAPICall Err = "external API call failed"
func CallPaymentAPI(req Request) error {
start := time.Now()
resp, err := http.Post(url, "application/json", body)
if err != nil {
return ErrAPICall.New(err).
WithCode("API_PAYMENT_TIMEOUT").
WithCategory(CategoryTimeout).
WithContext(Context{
"endpoint": url,
"method": "POST",
"duration_ms": time.Since(start).Milliseconds(),
"retry_count": req.RetryCount,
}).
WithRetryable(true).
WithRetryAfter(10 * time.Second).
WithMaxRetries(3).
WithHTTPStatus(504)
}
// ...
}var ErrValidation Err = "validation failed"
func ValidateEmail(email string) error {
if !strings.Contains(email, "@") {
return ErrValidation.New().
WithCode("VAL_EMAIL_FORMAT").
WithCategory(CategoryValidation).
WithContext(Context{
"field": "email",
"value": email, // Be careful with PII
"constraint": "must contain @",
}).
WithRetryable(false).
WithHTTPStatus(400)
}
return nil
}A: Context is only available on errific errors. Check if you're wrapping with stdlib fmt.Errorf:
// ❌ This loses context
return fmt.Errorf("wrapper: %w", errWithContext)
// ✅ Use errific methods
return ErrWrapper.New(errWithContext)A: Replace pkg/errors calls with errific equivalents:
// pkg/errors
errors.Wrap(err, "message")
// errific
ErrType.New(err)
// pkg/errors
errors.Wrapf(err, "format %s", arg)
// errific
ErrType.Wrapf("format %s: %w", arg, err)A: Yes, use helper functions which work with any error:
standardErr := errors.New("standard")
GetCode(standardErr) // Returns ""
IsRetryable(standardErr) // Returns falsefunc TestError(t *testing.T) {
err := ErrDB.New().WithCode("DB_001")
// Test error type
assert.True(t, errors.Is(err, ErrDB))
// Test metadata
assert.Equal(t, "DB_001", GetCode(err))
}Scenario: Building a REST API with consistent error responses
package main
import (
"encoding/json"
"net/http"
"time"
"github.com/leefernandes/errific"
)
// Define application errors
var (
ErrInvalidInput errific.Err = "invalid input"
ErrUnauthorized errific.Err = "unauthorized"
ErrDBQuery errific.Err = "database query failed"
ErrNotFound errific.Err = "resource not found"
)
type User struct {
ID string `json:"id"`
Email string `json:"email"`
}
// API Handler
func GetUserHandler(w http.ResponseWriter, r *http.Request) {
userID := r.URL.Query().Get("id")
user, err := getUser(userID)
if err != nil {
respondError(w, err)
return
}
json.NewEncoder(w).Encode(user)
}
// Business Logic with errific
func getUser(id string) (*User, error) {
// Validation
if id == "" {
return nil, ErrInvalidInput.New().
WithCode("VAL_USER_ID_REQUIRED").
WithCategory(errific.CategoryValidation).
WithHTTPStatus(400).
WithContext(errific.Context{
"field": "id",
"message": "user ID is required",
})
}
// Authorization
if !hasPermission(id, "users:read") {
return nil, ErrUnauthorized.New().
WithCode("AUTH_USER_ACCESS_DENIED").
WithCategory(errific.CategoryUnauthorized).
WithHTTPStatus(403).
WithContext(errific.Context{
"user_id": id,
"required_permission": "users:read",
})
}
// Database Query
user, err := db.QueryUser(id)
if err == sql.ErrNoRows {
return nil, ErrNotFound.New().
WithCode("USER_NOT_FOUND").
WithCategory(errific.CategoryNotFound).
WithHTTPStatus(404).
WithContext(errific.Context{"user_id": id})
}
if err != nil {
return nil, ErrDBQuery.New(err).
WithCode("DB_QUERY_USER_FAILED").
WithCategory(errific.CategoryServer).
WithHTTPStatus(500).
WithContext(errific.Context{
"query": "SELECT * FROM users WHERE id = ?",
"user_id": id,
}).
WithRetryable(true).
WithRetryAfter(5 * time.Second)
}
return user, nil
}
// Error Response Handler
func respondError(w http.ResponseWriter, err error) {
status := errific.GetHTTPStatus(err)
if status == 0 {
status = 500 // Default
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
json.NewEncoder(w).Encode(map[string]interface{}{
"error": err, // errific implements json.Marshaler
})
}Example Responses:
// 400 Bad Request
{
"error": {
"error": "invalid input",
"code": "VAL_USER_ID_REQUIRED",
"category": "validation",
"caller": "api/users.go:45.getUser",
"context": {
"field": "id",
"message": "user ID is required"
},
"http_status": 400
}
}
// 500 Internal Server Error
{
"error": {
"error": "database query failed: connection timeout",
"code": "DB_QUERY_USER_FAILED",
"category": "server",
"caller": "api/users.go:78.getUser",
"context": {
"query": "SELECT * FROM users WHERE id = ?",
"user_id": "user-123"
},
"retryable": true,
"retry_after": "5s",
"http_status": 500
}
}Scenario: Building an MCP server that LLMs can interact with
package main
import (
"encoding/json"
"fmt"
"github.com/leefernandes/errific"
)
// MCP Tool Errors
var (
ErrToolNotFound errific.Err = "tool not found"
ErrInvalidParams errific.Err = "invalid tool parameters"
ErrToolExecution errific.Err = "tool execution failed"
)
type MCPRequest struct {
JSONRPC string `json:"jsonrpc"`
ID string `json:"id"`
Method string `json:"method"`
Params map[string]interface{} `json:"params"`
CorrelationID string `json:"correlation_id,omitempty"`
}
type MCPResponse struct {
JSONRPC string `json:"jsonrpc"`
ID string `json:"id"`
Result interface{} `json:"result,omitempty"`
Error *errific.MCPError `json:"error,omitempty"`
}
// MCP Server Handler
func HandleMCPRequest(r *MCPRequest) *MCPResponse {
// Validate method exists
if !toolRegistry.Has(r.Method) {
err := ErrToolNotFound.New().
WithMCPCode(errific.MCPMethodNotFound).
WithHelp(fmt.Sprintf("Tool '%s' is not available", r.Method)).
WithSuggestion("Use the 'list_tools' method to see available tools").
WithDocs("https://docs.example.com/mcp/tools").
WithTags("mcp", "tool-not-found", "validation")
mcpErr := errific.ToMCPError(err)
return &MCPResponse{
JSONRPC: "2.0",
ID: r.ID,
Error: &mcpErr,
}
}
// Validate parameters
if err := validateToolParams(r.Method, r.Params); err != nil {
toolErr := ErrInvalidParams.New(err).
WithMCPCode(errific.MCPInvalidParams).
WithHelp("The parameters provided do not match the tool schema").
WithSuggestion("Check the tool documentation for required parameters").
WithDocs(fmt.Sprintf("https://docs.example.com/mcp/tools/%s", r.Method)).
WithTags("mcp", "invalid-params", "validation").
WithContext(errific.Context{
"tool": r.Method,
"provided_params": r.Params,
})
mcpErr := errific.ToMCPError(toolErr)
return &MCPResponse{
JSONRPC: "2.0",
ID: r.ID,
Error: &mcpErr,
}
}
// Execute tool
result, err := toolRegistry.Execute(r.Method, r.Params)
if err != nil {
// Tool execution failed - create rich error
execErr := ErrToolExecution.New(err).
WithMCPCode(errific.MCPToolError).
WithCorrelationID(r.CorrelationID).
WithRequestID(r.ID).
WithHelp(getToolHelp(r.Method, err)).
WithSuggestion(getToolSuggestion(r.Method, err)).
WithDocs(getToolDocs(r.Method)).
WithTags("mcp", "tool-error", getToolCategory(r.Method)).
WithLabels(map[string]string{
"tool_name": r.Method,
"error_type": classifyError(err),
"severity": calculateSeverity(err),
}).
WithRetryable(isRetryable(err)).
WithRetryAfter(getRetryDelay(err))
mcpErr := errific.ToMCPError(execErr)
return &MCPResponse{
JSONRPC: "2.0",
ID: r.ID,
Error: &mcpErr,
}
}
// Success
return &MCPResponse{
JSONRPC: "2.0",
ID: r.ID,
Result: result,
}
}
// Helper functions
func getToolHelp(toolName string, err error) string {
// Return context-specific help based on error
switch {
case isDatabaseError(err):
return "Database connection pool exhausted. The database is under heavy load."
case isNetworkError(err):
return "Network connectivity issue. Unable to reach external service."
default:
return fmt.Sprintf("Tool '%s' encountered an unexpected error", toolName)
}
}
func getToolSuggestion(toolName string, err error) string {
switch {
case isDatabaseError(err):
return "Retry in 10 seconds when connections are released, or simplify your query."
case isNetworkError(err):
return "Check network connectivity and retry in 30 seconds."
default:
return "Contact support if the issue persists."
}
}LLM Receives (for database error):
{
"jsonrpc": "2.0",
"id": "req-789",
"error": {
"code": -32000,
"message": "tool execution failed: database connection failed",
"data": {
"error": "tool execution failed",
"code": "TOOL_DB_CONN_FAILED",
"correlation_id": "trace-abc-123",
"request_id": "req-789",
"help": "Database connection pool exhausted. The database is under heavy load.",
"suggestion": "Retry in 10 seconds when connections are released, or simplify your query.",
"docs": "https://docs.example.com/mcp/tools/search_database#errors",
"tags": ["mcp", "tool-error", "database"],
"labels": {
"tool_name": "search_database",
"error_type": "connection",
"severity": "high"
},
"retryable": true,
"retry_after": "10s",
"caller": "tools/search.go:45.Execute"
}
}
}LLM Decision Making:
- Read
help→ Explain to user: "The database is too busy" - Read
suggestion→ Take action: Wait 10s and retry - Check
retryable→ Decide: Yes, safe to retry - Read
docs→ Provide link to user - Check
labels.severity→ Know: This is high priority
Scenario: Tracing errors across multiple microservices
package main
import (
"context"
"github.com/google/uuid"
"github.com/leefernandes/errific"
)
// Service-specific errors
var (
// Gateway errors
ErrGatewayAuth errific.Err = "gateway authentication failed"
// User service errors
ErrUserNotFound errific.Err = "user not found"
ErrUserQuery errific.Err = "user query failed"
// Database service errors
ErrDBConnection errific.Err = "database connection failed"
ErrDBQuery errific.Err = "database query failed"
)
// ============================================================
// Service A: API Gateway
// ============================================================
func (gw *Gateway) HandleRequest(w http.ResponseWriter, r *http.Request) {
// Create correlation ID for entire request chain
correlationID := uuid.New().String()
requestID := uuid.New().String()
ctx := context.WithValue(r.Context(), "correlation_id", correlationID)
ctx = context.WithValue(ctx, "request_id", requestID)
userID := r.Header.Get("X-User-ID")
// Call user service
user, err := gw.userService.GetUser(ctx, userID)
if err != nil {
// Log with correlation tracking
log.Error("request failed",
"correlation_id", errific.GetCorrelationID(err),
"request_id", errific.GetRequestID(err),
"service_chain", "gateway → user-service → db-service",
"error", err)
respondError(w, err)
return
}
json.NewEncoder(w).Encode(user)
}
// ============================================================
// Service B: User Service
// ============================================================
func (us *UserService) GetUser(ctx context.Context, userID string) (*User, error) {
correlationID := ctx.Value("correlation_id").(string)
requestID := ctx.Value("request_id").(string)
// Query database service
user, err := us.dbService.QueryUser(ctx, userID)
if err != nil {
// Wrap error with service context
return nil, ErrUserQuery.New(err).
WithCorrelationID(correlationID).
WithRequestID(requestID).
WithLabel("service", "user-service").
WithLabel("operation", "get_user").
WithContext(errific.Context{
"user_id": userID,
})
}
return user, nil
}
// ============================================================
// Service C: Database Service
// ============================================================
func (db *DatabaseService) QueryUser(ctx context.Context, userID string) (*User, error) {
correlationID := ctx.Value("correlation_id").(string)
requestID := ctx.Value("request_id").(string)
query := "SELECT id, email, name FROM users WHERE id = ?"
var user User
err := db.conn.QueryRowContext(ctx, query, userID).Scan(&user.ID, &user.Email, &user.Name)
if err == sql.ErrNoRows {
return nil, ErrUserNotFound.New().
WithCorrelationID(correlationID).
WithRequestID(requestID).
WithLabel("service", "db-service").
WithLabel("operation", "query_user").
WithHTTPStatus(404).
WithContext(errific.Context{
"query": query,
"user_id": userID,
})
}
if err != nil {
return nil, ErrDBQuery.New(err).
WithCorrelationID(correlationID). // Same correlation ID!
WithRequestID(requestID).
WithLabel("service", "db-service").
WithLabel("operation", "query_user").
WithHTTPStatus(500).
WithRetryable(true).
WithRetryAfter(5 * time.Second).
WithContext(errific.Context{
"query": query,
"user_id": userID,
})
}
return &user, nil
}Log Output (with correlation tracking):
// All logs from the same request have the same correlation_id
{
"level": "error",
"service": "gateway",
"correlation_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"request_id": "req-abc-123",
"message": "request failed",
"error": {
"error": "user query failed: database query failed: connection timeout",
"correlation_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"request_id": "req-abc-123",
"labels": {
"service": "user-service",
"operation": "get_user"
}
}
}Benefits:
- ✅ Single correlation ID traces through all services
- ✅ Each service adds its own context
- ✅ Easy to find all logs for a single request in log aggregation
- ✅ Service labels enable filtering by service in monitoring
Benchmarks (Go 1.23, Apple M1, 14 cores):
BenchmarkErrNew 2,245,603 ~523 ns/op 680 B/op 9 allocs/op
BenchmarkErrError 10,321,795 ~115 ns/op 192 B/op 6 allocs/op
BenchmarkWithContext 1,953,436 ~616 ns/op 696 B/op 9 allocs/op
BenchmarkJSONMarshal 1,391,380 ~862 ns/op 1105 B/op 7 allocs/op
BenchmarkWithCorrelationID 2,121,558 ~567 ns/op 688 B/op 8 allocs/op
BenchmarkWithTags 2,084,160 ~575 ns/op 720 B/op 9 allocs/op
BenchmarkToMCPError 1,596,895 ~743 ns/op 1730 B/op 6 allocs/op
BenchmarkCompleteErrorChain 1,758,771 ~680 ns/op 720 B/op 9 allocs/op
Overhead: Sub-microsecond for most operations, negligible for error handling.
Memory: ~680-720 bytes per error with metadata, ~1KB with MCP conversion.
Thread Safety: All operations are thread-safe with minimal lock contention.
- Go Version: 1.20+
- Dependencies: None (stdlib only)
- Thread-Safety: Full (mutex-protected configuration)
- Breaking Changes: None (fully backward compatible)