Intelligent arXiv Literature Tracking System
Language: 中文文档 (Chinese documentation)
arXiv Pulse is a Python package for automated crawling, summarizing, and tracking of the latest research papers from arXiv. It supports all arXiv categories and provides a modern web interface for a professional literature management experience.
- Web Interface: Modern FastAPI + Vue 3 + Element Plus interface with real-time SSE streaming
- One-Command Start: Simply run `pulse serve` to start the service
- Web Configuration: First-time setup wizard, all settings stored in database
- AI Auto-Processing: Automatic translation, AI summarization, and figure extraction
- AI Chat Assistant: Ask questions about papers with context-aware AI assistant
- Smart Search: Natural language queries with AI-powered keyword parsing
- Paper Collections: Create, edit, and delete collections to organize important papers
- Paper Basket: Select multiple papers for batch operations
- Secure by Default: Localhost-only binding, explicit confirmation for remote access
- Multilingual Support: UI in Chinese/English, translation to multiple languages
- Enhanced UI Components: Redesigned buttons, switches, selects, dialogs with refined shadows and transitions
- Paper Index Numbers: Visual index numbers on paper cards for easy reference
- Back-to-Top Button: Quick navigation with scroll-aware floating button
- Tooltips for Floating Buttons: Helpful labels on hover for all floating action buttons
- Recent Papers AI Search: Search within recent papers using natural language
- Sync Page Improvements: Better spacing, help icons with tooltips
- SQLite WAL Mode: Concurrent read/write operations for better performance
- Bug Fixes: Form submission, pagination visibility, index preservation during search
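To illustrate the SQLite WAL item above, here is a minimal sketch that switches a throwaway database to write-ahead logging with Python's standard `sqlite3` module. The helper name `enable_wal` and the file name `demo.db` are made up for this example; how arXiv Pulse configures its own database internally is not shown here.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path: str) -> str:
    """Switch the database at db_path to WAL and return the active journal mode."""
    conn = sqlite3.connect(db_path)
    try:
        # The PRAGMA answers with a single row holding the resulting mode.
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    finally:
        conn.close()
    return mode


if __name__ == "__main__":
    # Use a temporary file-backed database (hypothetical name, for illustration only).
    with tempfile.TemporaryDirectory() as tmp:
        print(enable_wal(os.path.join(tmp, "demo.db")))
```

In WAL mode, readers do not block the writer and the writer does not block readers, which suits a web service that reads papers while a sync is writing. The setting is stored in the database file, so it only needs to be applied once per file.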
pip install arxiv-pulse
# Create data directory
mkdir my_papers && cd my_papers
# Start web service (background mode by default)
pulse serve .
# Or specify port
pulse serve . --port 3000
# Foreground mode (see logs in terminal)
pulse serve . -f
Then visit http://localhost:8000
pulse status . # Check service status
pulse stop . # Stop service
pulse restart . # Restart service
pulse stop . --force # Force stop (SIGKILL)
By default, the service only accepts localhost connections for security. For remote access, use an SSH tunnel:
# On server
pulse serve .
# On your computer
ssh -L 8000:localhost:8000 user@server
# Then visit http://localhost:8000
This provides an encrypted connection without exposing your API keys.
- Visit http://localhost:8000
- Follow the setup wizard:
  - Step 1: Configure AI API (OpenAI/DeepSeek key, model, endpoint)
  - Step 2: Select research fields
  - Step 3: Set sync parameters
  - Step 4: Start initial sync
arXiv Pulse is designed with security in mind:
- Localhost-only by default: Service binds to 127.0.0.1, inaccessible from external networks
- No plaintext credentials: API keys stored in local SQLite database, never transmitted
- Explicit remote access: Opening to non-localhost requires a flag with security warning
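The policy in the list above (loopback by default, explicit opt-in otherwise) can be sketched as a small check. This is an illustration under stated assumptions, not the project's actual code: the function names `is_loopback_host` and `check_bind_allowed` and the `allow_remote` parameter are hypothetical.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Return True if host refers only to the local machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): treat it as non-local.
        return False


def check_bind_allowed(host: str, allow_remote: bool = False) -> None:
    """Raise PermissionError unless host is loopback or remote access was confirmed."""
    if not is_loopback_host(host) and not allow_remote:
        raise PermissionError(
            f"Refusing to bind to {host!r} without explicit confirmation"
        )


if __name__ == "__main__":
    check_bind_allowed("127.0.0.1")                   # passes silently
    check_bind_allowed("0.0.0.0", allow_remote=True)  # passes because of the opt-in
```

Treating unknown host names as non-local errs on the side of requiring confirmation.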
For remote access, we recommend:
- SSH Tunnel (easiest): `ssh -L 8000:localhost:8000 user@server`
- VPN: WireGuard, OpenVPN, or Tailscale
- Reverse Proxy: Nginx/Caddy with HTTPS
# If you must open to network (not recommended)
pulse serve . --host 0.0.0.0 --allow-non-localhost-access-with-plaintext-transmission-risk
| Page | Description |
|---|---|
| Home | Statistics overview, search by natural language |
| Recent | Papers from last N days, filter by field |
| Sync | Sync status, field management, manual sync |
| Collections | Organize important papers into collections |
- Search: Use natural language like "DFT calculations for battery materials"
- Filter: Click "Filter Fields" to select research areas
- AI Chat: Click the chat icon (bottom-right) to ask questions
- Paper Basket: Click basket icon on cards to collect papers for batch operations
- Settings: Click gear icon to modify API key, language, and sync options
arxiv_pulse/
├── core/          # Core infrastructure (Config, Database, Lock)
├── models/        # SQLAlchemy ORM models
├── services/      # Business logic (AI, translation, papers)
├── crawler/       # ArXiv API crawler
├── ai/            # Paper summarizer, report generator
├── search/        # AI-powered search engine
├── cli/           # Command-line interface
├── web/           # FastAPI web application
│   ├── app.py     # FastAPI app
│   ├── api/       # API endpoints
│   └── static/    # Vue 3 frontend (components, stores, i18n)
└── i18n/          # Backend translations
Data Directory/
├── data/arxiv_papers.db   # SQLite database
└── web.log                # Service log
For detailed architecture, see DEV.md.
| Endpoint | Method | Description |
|---|---|---|
| `/api/config` | GET/PUT | Get/update configuration |
| `/api/config/status` | GET | Get initialization status |
| `/api/papers/search/stream` | GET (SSE) | AI-powered search |
| `/api/papers/recent/update` | POST (SSE) | Update recent papers |
| `/api/collections` | GET/POST | List/create collections |
| `/api/stats` | GET | Database statistics |
| `/api/chat/sessions/{id}/send` | POST (SSE) | Send message to AI |
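Several of the endpoints above stream their results as server-sent events (SSE). The sketch below shows how a client might pull the `data:` payloads out of such a stream, following the general SSE wire format. The helper name `iter_sse_data` and the sample lines are invented for illustration; the request parameters and payload shapes used by arXiv Pulse are not documented in this section and are not assumed here.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete server-sent event in lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format drops one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
        elif line == "" and buffer:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield "\n".join(buffer)
            buffer = []
        # Comment lines (starting with ":") and other fields are ignored here.


if __name__ == "__main__":
    # Made-up sample stream, not real arXiv Pulse output.
    sample = ["data: first\n", "\n", ": keep-alive\n", "data: second\n", "\n"]
    for payload in iter_sse_data(sample):
        print(payload)
```

The parser takes any iterable of text lines, so it can be fed from whichever HTTP client is used to read the streaming response line by line.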
arXiv Pulse supports all arXiv categories. Simply select your fields of interest in the Settings page. Pre-configured options include:
| Category | Example Fields |
|---|---|
| Physics | Condensed Matter, Quantum Physics, High Energy, Nuclear, Astrophysics |
| Computation | DFT, First-Principles, MD, Force Fields, Computational Physics |
| AI/ML | Machine Learning, Artificial Intelligence, Computer Vision, NLP |
| Chemistry | Quantum Chemistry, Chemical Physics |
| Math | Mathematical Physics, Numerical Analysis, Statistics |
| Others | Quantitative Biology, Electrical Engineering, Economics |
You can also add custom search queries for any topic on arXiv.
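For orientation, the public arXiv API accepts search expressions that combine field prefixes such as `cat:` (category) and `all:` (all fields) with Boolean operators. The sketch below builds such a request URL. The helper `build_arxiv_query_url` is hypothetical, and whether arXiv Pulse composes its custom queries this way is an assumption rather than something stated in this document.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(search_query: str, max_results: int = 10) -> str:
    """Build an arXiv API URL returning the newest submissions that match search_query."""
    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    # urlencode percent-escapes the colons and turns spaces into "+".
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    # Example expression: materials-science papers mentioning "battery".
    print(build_arxiv_query_url("cat:cond-mat.mtrl-sci AND all:battery", max_results=5))
```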
Q: Port already in use?
pulse serve . --port 3000
Q: Service shows "not running" but port is occupied?
pulse stop . --force
# Or remove stale lock
rm .pulse.lock
Q: How to reinitialize?
rm data/arxiv_papers.db
pulse serve .
Q: AI not responding?
- Check API key in Settings
- Check console for errors (F12 → Console)
- Try foreground mode to see logs: `pulse serve . -f`
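For the port-related questions above, it can help to confirm whether anything is actually listening before stopping the service or removing the lock file. The following is a small standalone sketch using only the Python standard library; `port_in_use` is a hypothetical helper and not part of the `pulse` CLI.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8000 is the default port mentioned in this README.
    print(port_in_use(8000))
```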
GPL-3.0 - see LICENSE for details.
This project was developed by OpenCode, an AI coding agent.
- Yang Li - For 500+ iterations of requirements discussions, design decisions, and testing feedback. This project would not exist without your patience and vision.
- GLM-5 - For providing the core intelligence that powers OpenCode. ~200 million tokens consumed in bringing this project to life.
- arXiv.org - For the open API
- Computational materials science community - For inspiration and use cases
arXiv Pulse - Making arXiv literature tracking simple and efficient!
