A thin MCP server + CLI for generating academic paper figures via Gemini, OpenAI, and other models, plus matplotlib. It works with Claude Code, Claude Desktop, and any other AI agent that can drive a CLI.
This implementation is inspired by Google's PaperBanana, but I found that repo difficult to use, so I adapted the framework into a more accessible pipeline in which an LLM can generate figures quickly within a single conversation via MCP or the CLI.
Method diagram generated in one prompt via `scifig render-diagram` (Gemini 3.1 Flash Image).
Same NLP pipeline prompt sent to 4 providers.
Five primitives, each available as an MCP tool and a CLI subcommand:
| Tool | What it does |
|---|---|
| `render_diagram_image` | Generate a diagram from a text prompt (Gemini `gemini-3.1-flash-image-preview`) |
| `edit_diagram_image` | Refine an existing diagram with a follow-up instruction |
| `render_plot_from_code` | Execute matplotlib code in a subprocess and return the figure |
| `generate_diagram_variants` | Generate N variants from one prompt in parallel |
| `compose_images` | Composite multiple images onto a canvas (PIL) |
Plus two MCP resources with prompt-writing guidelines that Claude reads automatically.
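To make the last primitive concrete, here is a minimal sketch of PIL compositing with normalized layer coordinates. This is an illustration, not the actual `compose.py`; the `(path, x, y, scale)` tuple simply mirrors the CLI's `--layer PATH X Y SCALE` form.

```python
# Minimal sketch of canvas compositing in the spirit of compose_images.
# The real compose.py may differ in details such as alpha handling.
from PIL import Image

def compose(canvas_size: tuple[int, int], layers, background: str = "white") -> Image.Image:
    """layers: iterable of (image_path, x_frac, y_frac, width_frac).
    x/y are fractions of the canvas; width_frac scales each layer to a
    fraction of the canvas width, preserving its aspect ratio."""
    canvas = Image.new("RGB", canvas_size, background)
    cw, ch = canvas_size
    for path, x, y, scale in layers:
        img = Image.open(path)
        w = max(1, int(cw * scale))
        h = max(1, int(img.height * w / img.width))  # keep aspect ratio
        canvas.paste(img.resize((w, h)), (int(cw * x), int(ch * y)))
    return canvas
```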
```shell
git clone https://github.com/scriptwonder/scifig.git
cd scifig
cp .env.example .env   # add your Google API key (and any other keys)
pip install -e ".[dev]"
```

Get a free API key at https://aistudio.google.com/apikey and paste it into `.env`:

```
GOOGLE_API_KEY="AIza..."
```
Add to your project's `.mcp.json`:

```json
{
  "mcpServers": {
    "scifig": {
      "command": "scifig-mcp"
    }
  }
}
```

Restart Claude Code, then ask Claude:

> "Generate a method diagram for a three-stage NLP pipeline using scifig"
```shell
scifig render-diagram \
  --prompt "A three-stage ML pipeline: Input, Model, Output. Left-to-right, academic style." \
  --out pipeline.png

scifig edit-diagram --in pipeline.png --prompt "make the boxes blue" --out pipeline_v2.png

scifig render-plot \
  --code "import matplotlib.pyplot as plt; plt.bar(['A','B','C'], [1,2,3])" \
  --out chart.png
```
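For anything longer than a one-liner, write the matplotlib code in a file and pass it with `--code-file`. Below is a sketch of such a script; how the subprocess worker captures the final figure (current figure vs. an explicit save) is an internal detail of `_plot_worker.py`, so check that before relying on this exact shape.

```python
# example_plot.py — the kind of script you might pass via --code-file.
# Uses the headless Agg backend, since the code runs in a subprocess.
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

labels = ["baseline", "ours", "ablation"]
scores = [71.2, 78.9, 75.4]   # illustrative numbers, not real results

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar(labels, scores, color="#4C72B0")
ax.set_ylabel("Accuracy (%)")
ax.set_title("Main results")
fig.tight_layout()
```

Then render it with something like `scifig render-plot --code-file example_plot.py --out results.png`.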
```shell
scifig variants --prompt "framework overview diagram" --n 4 --out-dir ./candidates/

scifig compose \
  --canvas 1600 800 \
  --layer pipeline.png 0.0 0.0 0.5 \
  --layer chart.png 0.5 0.0 0.5 \
  --out combined.png
```

The API key is looked up in this order:

- `GOOGLE_API_KEY` or `GEMINI_API_KEY` environment variable
- `.env` file in the project directory (loaded at import time)
- Platform config file at `~/.config/scifig/config.toml` (Linux), `~/Library/Application Support/scifig/config.toml` (macOS), or `%APPDATA%\scifig\config.toml` (Windows)
The .env approach is recommended for local development. For CI or production, use environment variables.
Never put your API key in MCP client config files (claude_desktop_config.json, .mcp.json). Those files get synced, backed up, and logged. Set the key in your shell or .env instead.
| Variable | Description | Default |
|---|---|---|
| `GOOGLE_API_KEY` | Google AI Studio / Gemini API key | (required) |
| `GEMINI_API_KEY` | Alias for `GOOGLE_API_KEY` | — |
| `SCIFIG_OUTPUT_DIR` | Where rendered images are saved | `scifig/output/` |
| `SCIFIG_INLINE_MAX_BYTES` | Max size for inline MCP responses | `768000` (750 KB) |
| Option | Default | Choices |
|---|---|---|
| `--aspect-ratio` | `16:9` | `1:1` `2:3` `3:2` `3:4` `4:3` `4:5` `5:4` `9:16` `16:9` `21:9` |
| `--image-size` | `2K` | `1K` `2K` `4K` |
```
scifig render-diagram --prompt "..." [--prompt-file PATH] [--aspect-ratio] [--image-size] --out FILE
scifig edit-diagram   --in FILE --prompt "..." [--aspect-ratio] [--image-size] [--out FILE]
scifig render-plot    --code "..." [--code-file PATH] [--timeout 30] --out FILE
scifig variants       --prompt "..." --n N [--aspect-ratio] [--image-size] --out-dir DIR
scifig compose        --canvas W H --layer PATH X Y SCALE [...] [--background white] --out FILE
```

`--prompt-file -` and `--code-file -` read from stdin.
```
src/scifig/
├── core/              # all business logic lives here
│   ├── diagram.py         # render / edit / variants via Gemini
│   ├── plot.py            # matplotlib subprocess execution
│   ├── compose.py         # PIL canvas compositing
│   ├── genai_client.py    # lazy Gemini client + retry
│   └── output.py          # uniform RenderOutput type
├── mcp_server/        # FastMCP v3 — thin shell over core/
│   ├── server.py          # 5 tools + 2 resources
│   └── resources/         # prompt-writing guidelines (markdown)
├── cli/               # argparse — thin shell over core/
│   └── main.py            # 5 subcommands
└── config.py          # tiered key + path lookup
```
The MCP server and CLI share 100% of their logic through core/. Neither has branching beyond argument parsing.
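Every primitive in `core/` returns the uniform `RenderOutput` type mentioned in the tree above. As a rough sketch of what such a type might look like (field and method names here are assumptions for illustration, not the actual API):

```python
# Sketch of a uniform result type like core/output.py's RenderOutput.
# Field and method names are assumed, not taken from the actual code.
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class RenderOutput:
    path: Path        # where the rendered image was written
    mime_type: str    # e.g. "image/png"
    size_bytes: int   # drives inline-vs-file decisions in the MCP layer

    def inline_ok(self, max_bytes: int = 768_000) -> bool:
        """True if the payload fits under a SCIFIG_INLINE_MAX_BYTES-style limit."""
        return self.size_bytes <= max_bytes
```

A shared result type like this is what lets the MCP server and CLI stay thin shells: both just format the same object for their transport.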
```shell
pip install -e ".[dev]"
pytest -v   # 58 tests, ~5 seconds
```

All Gemini API calls are mocked in tests. Plot and compose tests use real matplotlib and PIL (no API cost), so no network access is needed to run the test suite.
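The mocking pattern is the standard `unittest.mock` one. The snippet below is an illustrative sketch of that pattern, not one of the project's actual tests; the `render_diagram` function and `generate_image` method are stand-in names.

```python
# Illustrative pattern: stub a Gemini-like client so figure logic
# runs offline. Names here are stand-ins, not scifig's real API.
from unittest import mock

def render_diagram(client, prompt: str) -> bytes:
    # Stand-in for core logic that delegates image generation to a client.
    return client.generate_image(prompt)

def test_render_diagram_offline():
    fake = mock.Mock()
    fake.generate_image.return_value = b"\x89PNG\r\n"
    data = render_diagram(fake, "a pipeline diagram")
    assert data.startswith(b"\x89PNG")
    fake.generate_image.assert_called_once_with("a pipeline diagram")
```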
- Python 3.11+ (stdlib `tomllib`, modern type syntax)
- google-genai — Gemini image generation
- FastMCP 3.2+ — MCP server framework
- matplotlib — plot rendering in subprocess
- Pillow — image composition
- platformdirs — cross-platform config/cache paths
papervizagent (Apache-2.0) — Google's multi-agent paper figure pipeline. scifig takes a different approach: instead of bundling planner/critic/stylist agents inside the server, it delegates all reasoning to Claude and keeps the server thin.
The matplotlib subprocess worker (core/_plot_worker.py) is derived from papervizagent. See NOTICE for attribution.