Validation service that tracks the accuracy of Token API data by comparing responses against reference providers (Etherscan, Blockscout).
Runs on a schedule, stores results in ClickHouse, and exposes Prometheus metrics for Grafana dashboards.
> **Note:** See `docs/methodology.md` for detailed documentation on what is compared, tolerance thresholds, accuracy/coverage metrics, and known limitations.
```sh
# Install dependencies
bun install

# Copy and configure environment
cp .env.example .env

# Generate reference token list (requires COINGECKO_API_KEY)
bun run fetch-tokens

# Create ClickHouse tables and views
bun run init-db

# Start the service
bun run dev
```

| Route | Method | Description |
|---|---|---|
| `/health` | GET | Liveness check |
| `/trigger` | POST | Trigger a manual validation run |
| `/status` | GET | Current run status and progress |
| `/report` | GET | Latest run report (metrics, regressions, mismatches) |
| `/metrics` | GET | Prometheus metrics |
`GET /report` returns a JSON object with the latest validation run and per-domain results:

- `run` — Run summary: ID, timestamps, trigger type (scheduled/manual), status, and aggregate totals across all domains.
- `metadata` / `balance` — Per-domain results, each containing:
  - `metrics` — Accuracy, adjusted accuracy (fresh data only), coverage, and the underlying counts. See the methodology doc for definitions.
  - `regressions` — Active regressions: comparisons in a sustained mismatch state.
  - `mismatches` — Current-run mismatches (non-regression): comparable fields that didn't match.

Returns 404 if no completed runs exist.
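The ratio fields can be reproduced from the raw counts. A minimal sketch, assuming `accuracy = matches / comparable` and `coverage = comparable / total_comparisons` per the definitions above (`computeMetrics` is a hypothetical helper, not part of the service; `adjusted_accuracy` is omitted because it needs the fresh-only counts, which the report does not expose):

```typescript
// Hypothetical helper reproducing the ratio fields from the raw counts:
//   comparable        = matches + mismatches   (both sides had data)
//   total_comparisons = comparable + nulls
//   accuracy          = matches / comparable
//   coverage          = comparable / total_comparisons
interface DomainCounts {
  matches: number;
  mismatches: number;
  nulls: number;
}

function computeMetrics({ matches, mismatches, nulls }: DomainCounts) {
  const comparable = matches + mismatches;
  const total = comparable + nulls;
  return {
    comparable,
    total_comparisons: total,
    accuracy: matches / comparable,
    coverage: comparable / total,
  };
}

// The metadata counts from the example response below:
const m = computeMetrics({ matches: 3412, mismatches: 123, nulls: 125 });
// m.accuracy ≈ 0.9652, m.coverage ≈ 0.9658
```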
**Example response:**

```json
{
  "run": {
    "run_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "started_at": "2026-03-11 12:00:00",
    "completed_at": "2026-03-11 12:05:32",
    "trigger": "scheduled",
    "tokens_checked": 743,
    "comparisons": 148260,
    "matches": 140012,
    "mismatches": 5123,
    "nulls": 3125,
    "errors": 0,
    "status": "success",
    "error_detail": null
  },
  "metadata": {
    "metrics": {
      "run_at": "2026-03-11 12:00:00",
      "matches": 3412,
      "mismatches": 123,
      "nulls": 125,
      "comparable": 3535,
      "accuracy": 0.9652,
      "adjusted_accuracy": 0.9891,
      "coverage": 0.9658,
      "total_comparisons": 3660
    },
    "regressions": [
      {
        "network": "mainnet",
        "contract": "0xdac17f958d2ee523a2206206994597c13d831ec7",
        "symbol": "USDT",
        "field": "total_supply",
        "entity": "",
        "provider": "blockscout",
        "our_value": "96119620139.51",
        "reference_value": "96118349783.47",
        "relative_diff": 0.0000132,
        "tolerance": 0.01,
        "our_url": "https://token-api.thegraph.com/...",
        "reference_url": "https://eth.blockscout.com/api/..."
      }
    ],
    "mismatches": []
  },
  "balance": {
    "metrics": {
      "run_at": "2026-03-11 12:00:00",
      "matches": 136600,
      "mismatches": 5000,
      "nulls": 3000,
      "comparable": 141600,
      "accuracy": 0.9647,
      "adjusted_accuracy": 0.9812,
      "coverage": 0.9793,
      "total_comparisons": 144600
    },
    "regressions": [],
    "mismatches": []
  }
}
```

| Variable | Required | Default | Description |
|---|---|---|---|
| `CLICKHOUSE_URL` | Yes | — | ClickHouse HTTP endpoint |
| `CLICKHOUSE_USERNAME` | Yes | — | ClickHouse user |
| `CLICKHOUSE_PASSWORD` | Yes | — | ClickHouse password |
| `CLICKHOUSE_DATABASE` | No | `validation` | ClickHouse database name |
| `TOKEN_API_BASE_URL` | Yes | — | Token API base URL |
| `TOKEN_API_JWT` | Yes | — | Bearer JWT for Token API authentication (quick start) |
| `ETHERSCAN_API_KEY` | No | — | Etherscan V2 paid API key (single key, works across all chains) |
| `COINGECKO_API_KEY` | No | — | CoinGecko API key (only used by the fetch-tokens script, not at runtime) |
| `CRON_SCHEDULE` | No | `0 */6 * * *` | Validation run cron schedule |
| `RATE_LIMIT_MS` | No | `500` | Delay between provider requests within a network (ms) |
| `RETRY_MAX_ATTEMPTS` | No | `3` | Max retry attempts for failed requests |
| `RETRY_BASE_DELAY_MS` | No | `1000` | Base delay for exponential backoff (ms) |
| `PORT` | No | `3000` | HTTP server port |
| `VERBOSE` | No | `false` | Enable verbose logging |
| `PRETTY_LOGGING` | No | `false` | Pretty-print log output |
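The retry variables combine in the usual way. A sketch of the delay schedule they imply, assuming plain exponential doubling from `RETRY_BASE_DELAY_MS` (the actual implementation may add jitter or cap the delay; this helper is illustrative only):

```typescript
// Delay before retry `attempt` (1-based), doubling from the base delay.
// With RETRY_BASE_DELAY_MS=1000 and RETRY_MAX_ATTEMPTS=3, a failing request
// waits 1000 ms before the second attempt and 2000 ms before the third,
// then gives up.
function backoffDelayMs(attempt: number, baseDelayMs = 1000): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

backoffDelayMs(1); // 1000
backoffDelayMs(2); // 2000
backoffDelayMs(3); // 4000
```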
| Metric | Type | Labels | Description |
|---|---|---|---|
| `validator_runs_total` | Counter | `trigger`, `status` | Validation runs completed |
| `validator_run_duration_seconds` | Histogram | — | Run wall-clock duration |
| `validator_tokens_checked_total` | Counter | `network` | Tokens checked across runs |
| `validator_provider_requests_total` | Counter | `provider`, `network`, `endpoint`, `status` | Provider API requests |
| `validator_provider_request_duration_seconds` | Histogram | `provider`, `endpoint` | Provider request duration |
| `validator_provider_batch_requests_total` | Counter | `provider`, `network`, `status` | Batch API requests |
| `validator_provider_batch_fallbacks_total` | Counter | `provider`, `network` | Batch requests that fell back to individual fetches |
| `validator_provider_batch_size` | Histogram | `provider`, `network` | Number of items per batch request |
| `validator_clickhouse_writes_total` | Counter | `status` | ClickHouse write operations |
Default process metrics (memory, CPU, event loop lag) are also exported.
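For reference, a labelled counter such as `validator_runs_total` appears in the scrape output as one sample line per label combination. A hand-rolled sketch of that text exposition format (purely illustrative; the service presumably uses a metrics library rather than formatting lines itself):

```typescript
// Render a counter in Prometheus text exposition format, one sample line
// per label set, e.g.:
//   validator_runs_total{trigger="scheduled",status="success"} 12
function renderCounter(
  name: string,
  help: string,
  samples: Array<{ labels: Record<string, string>; value: number }>,
): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const { labels, value } of samples) {
    const labelStr = Object.entries(labels)
      .map(([k, v]) => `${k}="${v}"`)
      .join(",");
    lines.push(`${name}{${labelStr}} ${value}`);
  }
  return lines.join("\n");
}

console.log(
  renderCounter("validator_runs_total", "Validation runs completed", [
    { labels: { trigger: "scheduled", status: "success" }, value: 12 },
  ]),
);
```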
The schema is defined in `schema/*.sql` and applied with `bun run init-db` (idempotent; safe to run repeatedly).
```mermaid
erDiagram
    runs {
        String run_id PK
        DateTime started_at
        DateTime completed_at
        Enum trigger "scheduled | manual"
        UInt32 tokens_checked
        UInt32 comparisons
        UInt32 matches
        UInt32 mismatches
        UInt32 nulls
        UInt32 errors
        Enum status "success | partial | failed"
        String error_detail
    }
    comparisons {
        String run_id FK
        DateTime run_at
        String domain "metadata | balance"
        String network
        String contract
        String symbol
        String field
        String entity "domain-specific subject key"
        String our_value
        String reference_value
        String provider
        Float64 relative_diff
        Bool is_match
        Float64 tolerance
        DateTime our_fetched_at
        DateTime reference_fetched_at
        DateTime our_block_timestamp
        String our_url
        String reference_url
        String our_null_reason
        String reference_null_reason
    }
    comparison_enriched {
        Bool is_comparable "both sides have data"
        Bool is_fresh "indexed within 5 min"
    }
    run_metrics {
        DateTime run_at
        String domain
        Float64 accuracy
        Float64 adjusted_accuracy
        Float64 coverage
    }
    regression_status {
        String domain
        Bool is_regression "exact mismatch or sustained relative"
    }
    runs ||--o{ comparisons : "run_id"
    comparisons ||--|| comparison_enriched : "adds computed booleans"
    comparison_enriched ||--o{ run_metrics : "GROUP BY run_at, domain"
    comparison_enriched ||--|| regression_status : "PARTITION BY domain, ..."
```
Also includes `accuracy_by_field` and `accuracy_by_network` views (same pattern as `run_metrics`, grouped by domain + field / domain + network). See `schema/` for full definitions.
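The `is_regression` flag ("exact mismatch or sustained relative") can be read as: exact-match fields regress on any mismatch, while tolerance-based fields must mismatch across several consecutive runs. A sketch of that rule under stated assumptions; the consecutive-run threshold of 3 is an assumption, not taken from the schema:

```typescript
// Classify a comparison as a regression. Exact fields (tolerance = 0) regress
// on a single mismatch; relative fields (tolerance > 0) only after a sustained
// streak of mismatches. The streak length (3) is an assumed threshold.
const SUSTAINED_RUNS = 3;

function isRegression(tolerance: number, recentMismatches: boolean[]): boolean {
  const latestMismatch = recentMismatches.at(-1) ?? false;
  if (!latestMismatch) return false;
  if (tolerance === 0) return true; // exact field: any mismatch regresses
  // relative field: require the last SUSTAINED_RUNS runs to all mismatch
  return (
    recentMismatches.length >= SUSTAINED_RUNS &&
    recentMismatches.slice(-SUSTAINED_RUNS).every(Boolean)
  );
}
```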
- `bun run fetch-tokens` — Refresh `tokens.json` from CoinGecko (top tokens by market cap)
- `bun run init-db` — Create ClickHouse tables and views (idempotent)
Blockscout URLs and chain IDs are resolved via The Graph Network Registry, synced at startup and before each run. Etherscan uses the V2 unified API (`api.etherscan.io/v2/api?chainid=...`) — a single API key works across all supported chains.
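With the unified endpoint, a provider URL varies only by chain ID and query parameters. A sketch of how such a request URL might be assembled (the `module`/`action` parameters follow Etherscan's usual convention; treat the exact parameter set and helper name as illustrative, not the service's actual code):

```typescript
// Build an Etherscan V2 unified API URL: one host for all chains, with the
// target chain selected via the `chainid` query parameter.
function etherscanV2Url(
  chainId: number,
  params: Record<string, string>,
  apiKey?: string,
): string {
  const url = new URL("https://api.etherscan.io/v2/api");
  url.searchParams.set("chainid", String(chainId));
  for (const [k, v] of Object.entries(params)) url.searchParams.set(k, v);
  if (apiKey) url.searchParams.set("apikey", apiKey);
  return url.toString();
}

// e.g. token total supply on mainnet (chain ID 1):
etherscanV2Url(1, {
  module: "stats",
  action: "tokensupply",
  contractaddress: "0xdac17f958d2ee523a2206206994597c13d831ec7",
});
```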