
Scifig - Scientific Figure with AI

Thin MCP server + CLI for generating academic paper figures via Gemini/OpenAI/other models and matplotlib. Works with Claude Code, Claude Desktop, and any other AI tool that can invoke a CLI.

This implementation is inspired by PaperBanana from Google, but I found that repo difficult to use, so I adapted the framework into a more accessible pipeline in which an LLM can generate figures quickly within a single conversation via MCP or the CLI.

Example: NLP pipeline method diagram generated in one prompt via scifig render-diagram (Gemini 3.1 Flash Image).

Same prompt, four models

Comparison: the same NLP pipeline diagram prompt sent to four providers (Gemini, OpenAI, Grok, Flux).

What it does

Five primitives, each available as an MCP tool and a CLI subcommand:

Tool                        What it does
render_diagram_image        Generate a diagram from a text prompt (Gemini gemini-3.1-flash-image-preview)
edit_diagram_image          Refine an existing diagram with a follow-up instruction
render_plot_from_code       Execute matplotlib code in a subprocess and return the figure
generate_diagram_variants   Generate N variants from one prompt in parallel
compose_images              Composite multiple images onto a canvas (PIL)

Plus two MCP resources with prompt-writing guidelines that Claude reads automatically.

Quick start

git clone https://github.com/scriptwonder/scifig.git
cd scifig
cp .env.example .env        # add your Google API key (and other provider keys if needed)
pip install -e ".[dev]"

Get a free API key at https://aistudio.google.com/apikey and paste it into .env:

GOOGLE_API_KEY="AIza..."

Use with Claude Code

Add to your project's .mcp.json:

{
  "mcpServers": {
    "scifig": {
      "command": "scifig-mcp"
    }
  }
}

Restart Claude Code. Then ask Claude:

"Generate a method diagram for a three-stage NLP pipeline using scifig"

Use from the command line

scifig render-diagram \
  --prompt "A three-stage ML pipeline: Input, Model, Output. Left-to-right, academic style." \
  --out pipeline.png

scifig edit-diagram --in pipeline.png --prompt "make the boxes blue" --out pipeline_v2.png

scifig render-plot \
  --code "import matplotlib.pyplot as plt; plt.bar(['A','B','C'], [1,2,3])" \
  --out chart.png

scifig variants --prompt "framework overview diagram" --n 4 --out-dir ./candidates/

scifig compose \
  --canvas 1600 800 \
  --layer pipeline.png 0.0 0.0 0.5 \
  --layer chart.png 0.5 0.0 0.5 \
  --out combined.png
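The --layer arguments above place each image by three numbers: X, Y, and SCALE. A plausible reading, shown in the sketch below, is that X/Y are fractions of the canvas and SCALE is a fraction of the canvas width; this interpretation is my assumption, not documented by the CLI itself, and the function name is illustrative.

```python
def layer_box(canvas_w: int, canvas_h: int,
              x: float, y: float, scale: float) -> tuple[int, int, int]:
    """Map fractional (x, y, scale) to a pixel offset and target width.

    Assumes x/y are fractions of the canvas size and scale is a fraction
    of the canvas width -- a guess at compose's layer semantics.
    """
    left = round(x * canvas_w)
    top = round(y * canvas_h)
    target_w = round(scale * canvas_w)
    return left, top, target_w

# For --canvas 1600 800 --layer chart.png 0.5 0.0 0.5:
print(layer_box(1600, 800, 0.5, 0.0, 0.5))  # (800, 0, 800)
```

Under this reading, the two layers in the example each fill half of the 1600x800 canvas, side by side.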

Configuration

API key lookup order

  1. GOOGLE_API_KEY or GEMINI_API_KEY environment variable
  2. .env file in the project directory (loaded at import time)
  3. Platform config file at ~/.config/scifig/config.toml (Linux), ~/Library/Application Support/scifig/config.toml (macOS), or %APPDATA%\scifig\config.toml (Windows)

The .env approach is recommended for local development. For CI or production, use environment variables.

Security

Never put your API key in MCP client config files (claude_desktop_config.json, .mcp.json). Those files get synced, backed up, and logged. Set the key in your shell or .env instead.

Environment variables

Variable                  Description                          Default
GOOGLE_API_KEY            Google AI Studio / Gemini API key    (required)
GEMINI_API_KEY            Alias for GOOGLE_API_KEY
SCIFIG_OUTPUT_DIR         Where rendered images are saved      scifig/output/
SCIFIG_INLINE_MAX_BYTES   Max size for inline MCP responses    768000 (750 KB)
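SCIFIG_INLINE_MAX_BYTES presumably gates whether an image is embedded in the MCP response or returned as a file path. A hedged sketch of that decision (the function is mine, not from the server code):

```python
import os

DEFAULT_INLINE_MAX = 768_000  # 750 KB, the documented default

def should_inline(payload: bytes) -> bool:
    """Return True if the image is small enough to embed in an MCP response."""
    limit = int(os.environ.get("SCIFIG_INLINE_MAX_BYTES", DEFAULT_INLINE_MAX))
    return len(payload) <= limit
```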

CLI reference

Shared image options

Option           Default   Choices
--aspect-ratio   16:9      1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
--image-size     2K        1K, 2K, 4K

Subcommands

scifig render-diagram --prompt "..." [--prompt-file PATH] [--aspect-ratio] [--image-size] --out FILE
scifig edit-diagram   --in FILE --prompt "..." [--aspect-ratio] [--image-size] [--out FILE]
scifig render-plot    --code "..." [--code-file PATH] [--timeout 30] --out FILE
scifig variants       --prompt "..." --n N [--aspect-ratio] [--image-size] --out-dir DIR
scifig compose        --canvas W H --layer PATH X Y SCALE [...] [--background white] --out FILE

Pass - to --prompt-file or --code-file to read from stdin.

Architecture

src/scifig/
├── core/           # all business logic lives here
│   ├── diagram.py      # render / edit / variants via Gemini
│   ├── plot.py         # matplotlib subprocess execution
│   ├── compose.py      # PIL canvas compositing
│   ├── genai_client.py # lazy Gemini client + retry
│   └── output.py       # uniform RenderOutput type
├── mcp_server/     # FastMCP v3 — thin shell over core/
│   ├── server.py       # 5 tools + 2 resources
│   └── resources/      # prompt-writing guidelines (markdown)
├── cli/            # argparse — thin shell over core/
│   └── main.py         # 5 subcommands
└── config.py       # tiered key + path lookup

The MCP server and CLI share 100% of their logic through core/; neither layer contains anything beyond argument parsing and dispatch.

Development

pip install -e ".[dev]"
pytest -v              # 58 tests, ~5 seconds

All Gemini API calls are mocked in tests. Plot and compose tests use real matplotlib and PIL (no API cost). No network access needed to run the test suite.
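Mocking the Gemini calls can be as simple as patching the call site with unittest.mock; the function below is a stand-in, not the project's real API surface:

```python
from unittest import mock

def render_diagram(prompt: str) -> bytes:
    """Stand-in for a core function that would hit the Gemini API."""
    raise RuntimeError("network call -- should be mocked in tests")

# Patch in the current module so no network request is ever made.
with mock.patch(f"{__name__}.render_diagram", return_value=b"\x89PNG") as fake:
    data = render_diagram("three-stage NLP pipeline diagram")

assert data.startswith(b"\x89PNG")
assert fake.call_count == 1
```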

Tech stack

  • Python 3.11+ (stdlib tomllib, modern type syntax)
  • google-genai — Gemini image generation
  • FastMCP 3.2+ — MCP server framework
  • matplotlib — plot rendering in subprocess
  • Pillow — image composition
  • platformdirs — cross-platform config/cache paths

Inspired by

papervizagent (Apache-2.0) — Google's multi-agent paper figure pipeline. scifig takes a different approach: instead of bundling planner/critic/stylist agents inside the server, it delegates all reasoning to Claude and keeps the server thin.

The matplotlib subprocess worker (core/_plot_worker.py) is derived from papervizagent. See NOTICE for attribution.

License

Apache-2.0 — see LICENSE and NOTICE.
