- Copy the environment template and fill in values:
cp .env.example .env
- Ensure Docker is running.
- Start the stack:
make up
- Copy the environment template and fill in values:
cp .env.example .env
For production defaults (retention + backups enabled), start from:
cp .env.prod.example .env
- Set production flags in `.env`:
  - `REQUIRE_API_KEY=true`
  - `API_KEY=...`
  - `DEBUG=false` (debug router is not mounted)
  - `RATE_LIMIT_BACKEND=external` (enforce limits at your ingress)
  - `NEXT_PUBLIC_API_BASE_URL=https://<your-domain>/api`
  - `NEXT_PUBLIC_API_KEY=...` (if using the web UI and you want it preconfigured)
- Place TLS certs in `ops/nginx/certs/`:
  - `ops/nginx/certs/fullchain.pem`
  - `ops/nginx/certs/privkey.pem`
  Provision via Let's Encrypt (certbot) or your certificate authority.
- Build and start the production stack:
make build-prod && make up-prod
Recommended: terminate TLS and enforce rate limiting at your gateway/ingress (nginx, ALB,
Cloudflare, etc). Configure metrics and tracing explicitly with METRICS_* and OTEL_*
flags when deploying to production.
Notes:
- Nginx terminates TLS and proxies `/` to the web app and `/api` to the API. The API and web services are not published directly in production compose.
- The production image uses a wheel-based install and requires `constraints.lock` for deterministic builds. Regenerate it when dependency versions change.
- Update the `set_real_ip_from` entries in `ops/nginx/nginx.conf` to match your ingress/LB networks so rate limiting uses correct client IPs.
- Run migrations as a one-off job before scaling API replicas:
make migrate-prod
- Postgres/Redis ports are not published in production compose. For local debugging, add `ports:` entries or create an override file.
- If you change `POSTGRES_USER`/`POSTGRES_PASSWORD`/`POSTGRES_DB`, update `DATABASE_URL` to match.
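For the `set_real_ip_from` note above, the relevant nginx directives look roughly like this (the CIDR is illustrative; use your actual LB network):

```nginx
# Trust the load balancer network so rate limiting sees real client IPs.
set_real_ip_from 10.0.0.0/8;
real_ip_header X-Forwarded-For;
real_ip_recursive on;
```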
- `OPENAI_API_KEY`
- `OPENAI_MODEL` (default: `gpt-4o-mini`)
- `OPENAI_EMBED_MODEL` (default: `text-embedding-3-small`)
- `AI_PROVIDER` (`openai` or `fake`, default: `openai`)
- `DEBUG` (default: `false`; keep `false` in production)
- `DATABASE_URL`
- `REDIS_URL`
- `MAX_CHUNKS_PER_QUERY` (default: `8`)
- `RERANK_ENABLED` (default: `true`)
- `RERANK_CANDIDATES` (default: `30`)
- `RERANK_SNIPPET_CHARS` (default: `900`)
- `CHUNK_CHAR_TARGET` (default: `5000`)
- `CHUNK_CHAR_OVERLAP` (default: `800`)
- `API_KEY` (if set, send `X-API-Key` on requests; required in production if `REQUIRE_API_KEY=true`)
- `REQUIRE_API_KEY` (default: `false`; set `true` in production to require `API_KEY` at startup)
- `RATE_LIMIT_BACKEND` (`memory` or `external`, default: `memory`; use `external` in production)
- `NEXT_PUBLIC_API_KEY` (preconfigures the web UI API key)
- `RATE_LIMIT_RPS` (default: `0`, disabled when `0`; use only with `RATE_LIMIT_BACKEND=memory`)
- `RATE_LIMIT_BURST` (default: `0`; use only with `RATE_LIMIT_BACKEND=memory`)
- `MMR_ENABLED` (default: `true`)
- `MMR_LAMBDA` (default: `0.7`)
- `MMR_CANDIDATES` (default: `30`)
- `MAX_PDF_BYTES` (default: `25000000`)
- `MAX_PDF_PAGES` (default: `300`)
- `MAX_URL_BYTES` (default: `2000000`)
- `MAX_TEXT_BYTES` (default: `2000000`)
- `EMBED_BATCH_SIZE` (default: `64`)
- `EMBED_DIM` (default: `1536`; must match the pgvector column size)
- `OPENAI_TIMEOUT_SECONDS` (default: `30`)
- `OPENAI_MAX_RETRIES` (default: `3`)
- `POSTGRES_USER` (default: `postgres`; compose only)
- `POSTGRES_PASSWORD` (default: `postgres`; compose only)
- `POSTGRES_DB` (default: `lfcie`; compose only)
- `DB_POOL_SIZE` (default: `5`)
- `DB_MAX_OVERFLOW` (default: `10`)
- `DB_POOL_TIMEOUT` (default: `30`)
- `DB_POOL_RECYCLE` (default: `1800`)
- `DB_CONNECT_TIMEOUT` (default: `10`)
- `LOG_LEVEL` (default: `INFO`)
- `METRICS_ENABLED` (default: `true`)
- `METRICS_PATH` (default: `/metrics`)
- `OTEL_ENABLED` (default: `false`)
- `OTEL_SERVICE_NAME` (default: `long-form-content-intelligence-api`)
- `OTEL_EXPORTER_OTLP_ENDPOINT` (default: unset; uses OpenTelemetry defaults)
- `URL_ALLOWLIST` (default: empty; comma-separated hostnames allowed for URL ingest; use `*.example.com` or `.example.com` for subdomains)
- `STORAGE_ROOT` (default: `storage`; relative paths are resolved from the repo root)
- `WORKER_CONCURRENCY` (default: `2`)
- `WORKER_PREFETCH_MULTIPLIER` (default: `1`)
- `WORKER_MAX_TASKS_PER_CHILD` (default: `100`)
- `WORKER_VISIBILITY_TIMEOUT` (default: `3600`)
- `WORKER_TASK_TIME_LIMIT` (default: `0`, disabled when `0`)
- `WORKER_TASK_SOFT_TIME_LIMIT` (default: `0`, disabled when `0`)
- `RETENTION_ENABLED` (default: `false`)
- `RETENTION_DAYS_SOURCES` (default: `0`, disabled when `0`)
- `RETENTION_DAYS_QUERIES` (default: `0`, disabled when `0`)
- `RETENTION_DAYS_ANSWERS` (default: `0`, disabled when `0`)
- `RETENTION_BATCH_SIZE` (default: `200`)
- `RETENTION_INTERVAL_SECONDS` (default: `86400`)
- `BACKUP_INTERVAL_SECONDS` (default: `86400`)
- `BACKUP_RETENTION_DAYS` (default: `7`)
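The chunking knobs above mean: split text into windows of roughly `CHUNK_CHAR_TARGET` characters, with consecutive windows sharing `CHUNK_CHAR_OVERLAP` characters. A minimal sketch of that semantics (illustrative only; the actual chunker may also respect paragraph or sentence boundaries):

```python
def chunk_text(text: str, target: int = 5000, overlap: int = 800) -> list[str]:
    """Naive fixed-window chunker: each chunk holds up to `target` chars,
    and each new chunk starts `target - overlap` chars after the previous one."""
    if target <= overlap:
        raise ValueError("target must exceed overlap")
    step = target - overlap
    return [text[i:i + target] for i in range(0, max(len(text) - overlap, 1), step)]

chunks = chunk_text("x" * 12000, target=5000, overlap=800)
# chunk starts at 0, 4200, 8400; each consecutive pair shares 800 chars
```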
Production rate limiting: set RATE_LIMIT_BACKEND=external and enforce limits at your
gateway/ingress (nginx, Cloudflare, ALB, etc). The in-app limiter is in-memory and
intended for dev or single-worker use only.
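For reference, the semantics implied by `RATE_LIMIT_RPS` and `RATE_LIMIT_BURST` for the in-memory backend can be pictured as a token bucket. This is an illustrative sketch, not the engine's actual limiter:

```python
import time

class TokenBucket:
    """Refill `rps` tokens per second up to `burst`; each request spends one."""

    def __init__(self, rps: float, burst: int, now=time.monotonic):
        self.rps, self.burst, self.now = rps, burst, now
        self.tokens = float(burst)
        self.last = now()

    def allow(self) -> bool:
        t = self.now()
        self.tokens = min(self.burst, self.tokens + (t - self.last) * self.rps)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Because state like this lives per process, it cannot coordinate across workers or replicas, which is why production deployments should enforce limits at the gateway instead.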
- `REQUIRE_API_KEY=true`
- `API_KEY` set to a non-empty value
- `DEBUG=false` (debug routes not mounted)
- `RATE_LIMIT_BACKEND=external` and gateway/ingress rate limiting configured
- `RETENTION_ENABLED=true` with retention windows set for `RETENTION_DAYS_*`
- `make migrate-prod` run before scaling API replicas
- `NEXT_PUBLIC_API_KEY` set (or configured via the web UI settings drawer)
- Backups enabled via the compose `backup` profile (see below)
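Put together, a minimal production `.env` might look like this (illustrative values; substitute your own secrets, retention windows, and domain):

```
REQUIRE_API_KEY=true
API_KEY=change-me
DEBUG=false
RATE_LIMIT_BACKEND=external
RETENTION_ENABLED=true
RETENTION_DAYS_SOURCES=90
RETENTION_DAYS_QUERIES=30
RETENTION_DAYS_ANSWERS=30
NEXT_PUBLIC_API_BASE_URL=https://your-domain.example/api
NEXT_PUBLIC_API_KEY=change-me
```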
Retention runs in the maintenance service. Enable it by setting RETENTION_ENABLED=true
and choose retention windows (days) for sources/queries/answers. The service runs at
RETENTION_INTERVAL_SECONDS and deletes old rows plus source files on disk.
Backups are provided by the optional backup compose profile (Postgres pg_dump).
Enable it with:
docker compose -f docker-compose.prod.yml --profile backup up -d

Backups are written to the backups_data volume using BACKUP_INTERVAL_SECONDS and
pruned after BACKUP_RETENTION_DAYS.
- Ensure the backup profile is running:
docker compose -f docker-compose.prod.yml --profile backup up -d
- List available dumps:
docker compose -f docker-compose.prod.yml exec backup ls -1 /backups
- Stop write-heavy services:
docker compose -f docker-compose.prod.yml stop api worker maintenance
- Restore the chosen dump (destructive, replaces current DB):
docker compose -f docker-compose.prod.yml exec backup sh -c \
  'pg_restore -h postgres -U $POSTGRES_USER -d $POSTGRES_DB --clean --if-exists /backups/<dump-file>.dump'
- Start services again:
docker compose -f docker-compose.prod.yml start api worker maintenance
Handy commands:
- `make retention-prod` (run retention once)
- `make retention-prod-dry-run` (preview retention deletes)
- `make backup-prod` (start the backup profile)
make up

The API will be available at http://localhost:8000.
The UI will be available at http://localhost:3000.
Option A (Docker UI, 2 terminals):
Terminal 1 (stack + UI):
cd /Users/ignaziodesantis/Desktop/Development/Long-Form-Content-Intelligence-Engine
AI_PROVIDER=fake DEBUG=true REQUIRE_API_KEY=false docker compose up --build

Terminal 2 (smoke test):
cd /Users/ignaziodesantis/Desktop/Development/Long-Form-Content-Intelligence-Engine
source .venv/bin/activate
make smoke

Option B (Local UI, 3 terminals):
Terminal 1 (stack only, no UI):
cd /Users/ignaziodesantis/Desktop/Development/Long-Form-Content-Intelligence-Engine
AI_PROVIDER=fake DEBUG=true REQUIRE_API_KEY=false docker compose up --build postgres redis api worker

Terminal 2 (smoke test):
cd /Users/ignaziodesantis/Desktop/Development/Long-Form-Content-Intelligence-Engine
source .venv/bin/activate
make smoke

Terminal 3 (Next.js UI):
cd /Users/ignaziodesantis/Desktop/Development/Long-Form-Content-Intelligence-Engine/apps/web
npm install
cp .env.local.example .env.local
npm run dev

For production, set REQUIRE_API_KEY=true and define a non-empty API_KEY.
Debug endpoints under /debug/* are only mounted when DEBUG=true and are excluded from
OpenAPI when DEBUG=false (recommended for production).
Metrics are exposed via Prometheus text format on METRICS_PATH when METRICS_ENABLED=true.
Route labels use FastAPI route templates to avoid high-cardinality raw paths.
Example:
curl http://localhost:8000/metrics

Tracing is disabled by default. To enable OpenTelemetry exporting:
OTEL_ENABLED=true \
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318/v1/traces \
make up

With your venv active, run:
python3 -m pip install -U pip
python3 -m pip install -e ".[dev]"

pip install -e ".[dev]" provides pytest, ruff, and mypy for make test and make lint.
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install -e ".[dev]"
make test
make lint

Health check:
curl http://localhost:8000/health

Dependency health:
curl http://localhost:8000/health/deps

Upload a PDF:
curl -F "file=@/path/to/document.pdf" -F "title=My Doc" \
http://localhost:8000/sources/upload

Ingest raw text:
curl -X POST http://localhost:8000/sources/ingest \
-H "Content-Type: application/json" \
-d '{"text": "Paste long-form text here.", "title": "My Text"}'

Ingest a URL:
curl -X POST http://localhost:8000/sources/ingest \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com", "title": "Example"}'

List sources:
curl http://localhost:8000/sources

List sources with pagination and filtering:
curl "http://localhost:8000/sources?limit=25&offset=0&status=READY&source_type=pdf"

Query with RAG:
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"question": "What is the main thesis?", "source_ids": ["YOUR_SOURCE_UUID"]}'

Optional idempotency (replays return the same answer):
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-H "Idempotency-Key: 5f84a3a1-6e8d-4c8f-9b2f-3d1c40f5f6b1" \
-d '{"question": "What is the main thesis?", "source_ids": ["YOUR_SOURCE_UUID"]}'

Query with verification:
curl -X POST http://localhost:8000/query/verified \
-H "Content-Type: application/json" \
-d '{"question": "What is the main thesis?", "source_ids": ["YOUR_SOURCE_UUID"]}'

Persisted-answer read endpoints (API key required if API_KEY is set):
- GET /answers/{answer_id}
- GET /answers/{answer_id}/highlights
- GET /answers/{answer_id}/grouped
- GET /answers/{answer_id}/grouped/highlights

List answers with pagination:
curl "http://localhost:8000/answers?limit=25&offset=0"

Filter answers by query:
curl "http://localhost:8000/answers?query_id=YOUR_QUERY_UUID"

Legacy-tolerant hydration:
- Missing `verification_summary`/`answer_style` is derived and normalized to satisfy strict contracts.
- Summary counts and `answer_style` are repaired to stay consistent with claims/verdicts.
- `raw_citations` may be missing or malformed (non-dict) and is treated as `{}`.
- `citations_count` uses `len(raw_citations["ids"])` if it is a list; otherwise it falls back to `len(citations)`.
- Summary input selection prefers raw claims only if coerced claims are non-empty; otherwise it uses raw highlights if they are a list.
- Non-fatal consistency logging emits `verification_summary_inconsistent` when repaired payloads still mismatch.

Citations behavior:
- If `raw_citations.citations` or `raw_citations.citation_groups` are persisted, they are returned.
- If only legacy ids exist, citations may be empty; normalization still uses the ids length as `citations_count`.
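The citations_count fallback described above amounts to the following (a sketch, not the actual implementation):

```python
def citations_count(raw_citations, citations) -> int:
    """Prefer len(raw_citations['ids']) when it is a list; otherwise fall
    back to len(citations). Malformed (non-dict) raw_citations acts as {}."""
    if not isinstance(raw_citations, dict):
        raw_citations = {}
    ids = raw_citations.get("ids")
    if isinstance(ids, list):
        return len(ids)
    return len(citations or [])
```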
Delete a source:
curl -X DELETE http://localhost:8000/sources/YOUR_SOURCE_UUID

Run tests:
make test

Lint and type-check:
make lint

AI_PROVIDER=fake DEBUG=true docker compose up --build

In another terminal (with the venv active):
source .venv/bin/activate
make smoke

Recommended env: AI_PROVIDER=fake and DEBUG=true (needed for citation validation).
Eval runners also respect:
- `EVAL_READY_TIMEOUT_SECONDS` (default: 60) for source ingest readiness.
- `EVAL_HTTP_TIMEOUT_SECONDS` (default: 30) for HTTP client timeouts.
- Start the stack:
AI_PROVIDER=fake DEBUG=true docker compose up --build
- Run the eval harness:
make eval
- Run the verified eval harness:
make eval-verified
- Run the verified conflicts eval harness:
make eval-verified-conflicts
- Run the multisource eval harness:
make eval-multisource
- Run the OpenAI highlights smoke harness:
make eval-openai-smoke
Requires `AI_PROVIDER=openai`, `DEBUG=true`, and `OPENAI_API_KEY` (exported or set in `.env`). Example:
AI_PROVIDER=openai DEBUG=true OPENAI_API_KEY=... docker compose up --build -d
make eval-openai-smoke
- Run the OpenAI verified smoke harness:
make eval-openai-verified-smoke
Requires `AI_PROVIDER=openai`, `DEBUG=true`, and `OPENAI_API_KEY` (exported or set in `.env`). Example:
AI_PROVIDER=openai DEBUG=true OPENAI_API_KEY=... docker compose up --build -d
make eval-openai-verified-smoke
- Run the OpenAI verified contradictions smoke harness:
make eval-openai-verified-contradictions-smoke
Requires `AI_PROVIDER=openai`, `DEBUG=true`, and `OPENAI_API_KEY` (exported or set in `.env`). Example:
AI_PROVIDER=openai DEBUG=true OPENAI_API_KEY=... docker compose up --build -d
make eval-openai-verified-contradictions-smoke
- Run the evidence integrity harness:
make eval-evidence-integrity
Requires `AI_PROVIDER=openai`, `DEBUG=true`, and `OPENAI_API_KEY` (exported or set in `.env`). Example:
AI_PROVIDER=openai DEBUG=true OPENAI_API_KEY=... docker compose up --build -d
make eval-evidence-integrity
Thresholds live in scripts/eval/thresholds.json. Override them explicitly:
python3 scripts/eval/run_eval.py --thresholds scripts/eval/thresholds.json
python3 scripts/eval/run_eval_verified.py --thresholds scripts/eval/thresholds.json

CI enforces the quality gates in scripts/eval/thresholds.json for all eval runs.
The conflicts dataset uses scripts/eval/golden_verified_conflicts.json and fixture
scripts/fixtures/conflicts.pdf with profile conflicts. Its thresholds live under
eval_verified_conflicts in scripts/eval/thresholds.json.
Conflict thresholds are selected when the dataset profile is conflicts or
eval_verified_conflicts, when the fixture is conflicts.pdf, or when the dataset
filename includes conflicts. If a conflicts profile is set without a fixture, the
runner defaults to conflicts.pdf.
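The selection rule above can be summarized as a predicate (a sketch; the parameter names are illustrative, not the runner's actual API):

```python
def uses_conflict_thresholds(profile, fixture, dataset_name: str) -> bool:
    """Conflict thresholds apply when the profile, fixture, or dataset
    filename signals the conflicts scenario."""
    return (
        profile in ("conflicts", "eval_verified_conflicts")
        or fixture == "conflicts.pdf"
        or "conflicts" in dataset_name
    )
```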
The multisource dataset uses scripts/eval/golden_multisource.json with fixtures
scripts/fixtures/sample.pdf and scripts/fixtures/second.pdf. Its thresholds live
under eval_multisource in scripts/eval/thresholds.json.
The OpenAI highlights smoke dataset uses tests/eval/golden_openai_smoke.json and
scripts/fixtures/sample.pdf to validate highlight span invariants against stored
chunk text (OpenAI spans are validated against the truncated prefix but slices are
checked against the full chunk text).
The OpenAI verified smoke dataset uses tests/eval/golden_openai_verified_smoke.json and
scripts/fixtures/sample.pdf to validate verification_summary and answer_style invariants
for /query/verified and /query/verified/highlights (no exact answer matching).
The OpenAI verified contradictions smoke dataset uses
tests/eval/golden_openai_verified_contradictions_smoke.json with
scripts/fixtures/contradictions_smoke.pdf to validate conflict rewriting and prefix
behavior for verified endpoints.
Outputs are written to:
- scripts/eval/out/eval_results.json
- scripts/eval/out/eval_report.md
- scripts/eval/out/eval_verified_results.json
- scripts/eval/out/eval_verified_report.md
- scripts/eval/out/eval_multisource_results.json
- scripts/eval/out/eval_multisource_report.md
- tests/eval/out/eval_openai_smoke_results.json
- tests/eval/out/eval_openai_smoke_report.md
- tests/eval/out/eval_openai_verified_smoke_results.json
- tests/eval/out/eval_openai_verified_smoke_report.md
- tests/eval/out/eval_evidence_integrity_results.json
- tests/eval/out/eval_evidence_integrity_report.md
OpenAI smoke outputs use fixed filenames and overwrite prior runs in tests/eval/out.
- Retrieval uses a lightweight reranker after hybrid search to boost relevance before selecting the final chunks for RAG. It is enabled by default and can be disabled by setting `RERANK_ENABLED=false`. The fake provider runs deterministically for eval/smoke.
- Grouped query endpoints apply source-aware retrieval when `PER_SOURCE_RETRIEVAL_LIMIT` is set and `source_ids` are provided.
- Ingestion happens asynchronously via Celery. Source status transitions: `UPLOADED` → `PROCESSING` → `READY` or `FAILED`.
- If a query cannot be answered with retrieved context, the API returns `insufficient evidence` with suggested follow-ups.
- `/query` returns answers with citations only; `/query/verified` adds claim-level verdicts and evidence snippets.
- `/query/verified/highlights` adds evidence highlight spans (start/end offsets and highlight_text) per evidence item.
- `/query/verified/grouped/highlights` combines highlights with grouped citations.
- Verified responses include `verification_summary` with verdict counts and an `overall_verdict`, plus `answer_style`. The `answer_style` field is also mirrored inside `verification_summary` and always matches the top-level value. If contradictions are detected, answers are prefixed with: "Contradictions detected in the source material." The body is rewritten into support/conflict/unsupported sections driven by claim verdicts.
- Citation and evidence snippets include snippet_start/snippet_end offsets relative to the full chunk text. When chunk char offsets are available, absolute_start/absolute_end provide offsets within the original source text; fields may be null if spans are unavailable.
- Verification runs deterministically when `AI_PROVIDER=fake`.
- Highlight spans are best-effort and refer to indices in the full chunk text stored for the source. Highlight spans remain claim-specific and are independent from snippet offsets.
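For intuition on the MMR knobs (`MMR_LAMBDA` default 0.7), diversified selection trades relevance against redundancy roughly as follows. This is a minimal greedy sketch over precomputed similarity scores, not the engine's actual implementation:

```python
def mmr(relevance: list[float], pairwise: list[list[float]], k: int, lam: float = 0.7) -> list[int]:
    """Greedy Maximal Marginal Relevance: at each step pick the candidate
    maximizing lam * relevance - (1 - lam) * max similarity to selected."""
    selected: list[int] = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        best = max(
            candidates,
            key=lambda i: lam * relevance[i]
            - (1 - lam) * max((pairwise[i][j] for j in selected), default=0.0),
        )
        selected.append(best)
        candidates.remove(best)
    return selected
```

With a high lambda the ranking stays close to pure relevance; lowering it penalizes near-duplicate chunks more aggressively.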