kYangLi/arXiv-Pulse

arXiv Pulse

Intelligent arXiv Literature Tracking System

🌐 Language: δΈ­ζ–‡ζ–‡ζ‘£

arXiv Pulse is a Python package for automated crawling, summarizing, and tracking of the latest research papers from arXiv. It supports all arXiv categories and provides a modern web interface for a professional literature management experience.

📸 Screenshots

English Interface

✨ Key Features

  • 🌐 Web Interface: Modern FastAPI + Vue 3 + Element Plus interface with real-time SSE streaming
  • 🚀 One-Command Start: Simply run pulse serve to start the service
  • 📝 Web Configuration: First-time setup wizard, all settings stored in the database
  • 🤖 AI Auto-Processing: Automatic translation, AI summarization, and figure extraction
  • 💬 AI Chat Assistant: Ask questions about papers with a context-aware AI assistant
  • 🔍 Smart Search: Natural language queries with AI-powered keyword parsing
  • 📁 Paper Collections: Create, edit, and delete collections to organize important papers
  • 🛒 Paper Basket: Select multiple papers for batch operations
  • 🔒 Secure by Default: Localhost-only binding, explicit confirmation for remote access
  • 🌍 Multilingual Support: UI in Chinese/English, translation to multiple languages

🆕 What's New in 1.2.0

  • Enhanced UI Components: Redesigned buttons, switches, selects, dialogs with refined shadows and transitions
  • Paper Index Numbers: Visual index numbers on paper cards for easy reference
  • Back-to-Top Button: Quick navigation with scroll-aware floating button
  • Tooltips for Floating Buttons: Helpful labels on hover for all floating action buttons
  • Recent Papers AI Search: Search within recent papers using natural language
  • Sync Page Improvements: Better spacing, help icons with tooltips
  • SQLite WAL Mode: Concurrent read/write operations for better performance
  • Bug Fixes: Form submission, pagination visibility, index preservation during search
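For readers unfamiliar with SQLite's WAL (write-ahead logging) mode mentioned above, the following is a minimal sketch of how a database file is switched to it using Python's standard sqlite3 module. It is an illustration only: the function name enable_wal and the throwaway demo file are assumptions, and arXiv Pulse's own database setup code is not shown in this README.

```python
import sqlite3
import tempfile
from pathlib import Path


def enable_wal(db_path: str) -> str:
    """Switch a SQLite database file to WAL journaling and return the active mode."""
    conn = sqlite3.connect(db_path)
    try:
        # The PRAGMA returns one row holding the resulting journal mode.
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    finally:
        conn.close()
    return mode


# Demo on a throwaway file: WAL needs an on-disk database, not ":memory:".
with tempfile.TemporaryDirectory() as tmp:
    demo_mode = enable_wal(str(Path(tmp) / "demo.db"))
print(demo_mode)
```

In WAL mode, readers do not block a writer and a writer does not block readers, which is what allows the web interface to keep reading while a sync is writing.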

🚀 Quick Start

Installation

pip install arxiv-pulse

Start Service

# Create data directory
mkdir my_papers && cd my_papers

# Start web service (background mode by default)
pulse serve .

# Or specify port
pulse serve . --port 3000

# Foreground mode (see logs in terminal)
pulse serve . -f

Then visit http://localhost:8000
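Because the service starts in the background by default, a script that launches it may want to wait until the port accepts connections before opening the page. Here is a small sketch using only the standard library; the helper name wait_for_port and its defaults are hypothetical and not part of arXiv Pulse.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 8000, timeout: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Nothing listening yet (or connection refused); retry shortly.
            time.sleep(0.2)
    return False
```

Calling wait_for_port() after pulse serve . returns True as soon as the web service is reachable on port 8000. It only checks that a TCP listener exists, not that the listener is arXiv Pulse.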

Service Management

pulse status .          # Check service status
pulse stop .            # Stop service
pulse restart .         # Restart service
pulse stop . --force    # Force stop (SIGKILL)

Remote Access (SSH Tunnel)

By default, the service only accepts localhost connections for security. For remote access, use an SSH tunnel:

# On server
pulse serve .

# On your computer
ssh -L 8000:localhost:8000 user@server

# Then visit http://localhost:8000

This provides an encrypted connection without exposing your API keys.

First-Time Setup

  1. Visit http://localhost:8000
  2. Follow the setup wizard:
    • Step 1: Configure AI API (OpenAI/DeepSeek key, model, endpoint)
    • Step 2: Select research fields
    • Step 3: Set sync parameters
    • Step 4: Start initial sync

🔒 Security

arXiv Pulse is designed with security in mind:

  • Localhost-only by default: Service binds to 127.0.0.1, inaccessible from external networks
  • No plaintext credentials: API keys stored in local SQLite database, never transmitted
  • Explicit remote access: Opening to non-localhost requires a flag with security warning

For remote access, we recommend:

  1. SSH Tunnel (easiest): ssh -L 8000:localhost:8000 user@server
  2. VPN: WireGuard, OpenVPN, or Tailscale
  3. Reverse Proxy: Nginx/Caddy with HTTPS

# If you must open to network (not recommended)
pulse serve . --host 0.0.0.0 --allow-non-localhost-access-with-plaintext-transmission-risk
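To make the difference between 127.0.0.1 and 0.0.0.0 concrete, the sketch below classifies a bind address with Python's standard ipaddress module. It is an illustration of the concept only: classify_bind_host is a hypothetical helper, and this is not how arXiv Pulse itself validates the --host value.

```python
import ipaddress


def classify_bind_host(host: str) -> str:
    """Classify a bind address as 'loopback', 'all-interfaces', or 'specific'."""
    if host == "localhost":
        return "loopback"
    addr = ipaddress.ip_address(host)  # raises ValueError for non-IP strings
    if addr.is_loopback:
        # 127.0.0.0/8 or ::1: reachable only from the same machine.
        return "loopback"
    if addr.is_unspecified:
        # 0.0.0.0 or ::: listens on every network interface.
        return "all-interfaces"
    return "specific"


for host in ("127.0.0.1", "0.0.0.0", "192.168.1.10"):
    print(host, "->", classify_bind_host(host))
```

A loopback address keeps the service private to the machine, which is why the SSH tunnel approach works without changing the bind address.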

📖 Daily Usage

Pages

| Page | Description |
| --- | --- |
| Home | Statistics overview, search by natural language |
| Recent | Papers from last N days, filter by field |
| Sync | Sync status, field management, manual sync |
| Collections | Organize important papers into collections |

Features

  • Search: Use natural language like "DFT calculations for battery materials"
  • Filter: Click "Filter Fields" to select research areas
  • AI Chat: Click the chat icon (bottom-right) to ask questions
  • Paper Basket: Click basket icon on cards to collect papers for batch operations
  • Settings: Click gear icon to modify API key, language, and sync options

πŸ“ Project Structure

arxiv_pulse/
├── core/                   # Core infrastructure (Config, Database, Lock)
├── models/                 # SQLAlchemy ORM models
├── services/               # Business logic (AI, translation, papers)
├── crawler/                # ArXiv API crawler
├── ai/                     # Paper summarizer, report generator
├── search/                 # AI-powered search engine
├── cli/                    # Command-line interface
├── web/                    # FastAPI web application
│   ├── app.py              # FastAPI app
│   ├── api/                # API endpoints
│   └── static/             # Vue 3 frontend (components, stores, i18n)
└── i18n/                   # Backend translations

Data Directory/
├── data/arxiv_papers.db    # SQLite database
└── web.log                 # Service log

For detailed architecture, see DEV.md.
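Since all data lives in a single SQLite file, it can be inspected with Python's standard sqlite3 module. The sketch below lists the tables in a database opened read-only. The table names in arxiv_papers.db are not documented in this README, so the code discovers them instead of assuming any; list_tables and the demo table are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def list_tables(db_path: str) -> list[str]:
    """Return the names of all user tables in a SQLite database, sorted alphabetically."""
    # Open read-only via a URI so inspecting never modifies or creates the file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demo on a throwaway database; the table name here is made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo_db = Path(tmp) / "demo.db"
    conn = sqlite3.connect(demo_db)
    conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()
    demo_tables = list_tables(str(demo_db))
print(demo_tables)
```

From inside the data directory, list_tables("data/arxiv_papers.db") shows what the service has created. Stopping the service first avoids reading while a sync is in progress.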

🔧 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/config | GET/PUT | Get/update configuration |
| /api/config/status | GET | Get initialization status |
| /api/papers/search/stream | GET (SSE) | AI-powered search |
| /api/papers/recent/update | POST (SSE) | Update recent papers |
| /api/collections | GET/POST | List/create collections |
| /api/stats | GET | Database statistics |
| /api/chat/sessions/{id}/send | POST (SSE) | Send message to AI |
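Several of the endpoints above stream their results as Server-Sent Events (SSE). The sketch below shows one way a Python client could read such a stream with only the standard library. It is a simplified parser that handles data: lines only. The names iter_sse_data and stream_events are hypothetical, and neither the query parameters nor the payload format of the arXiv Pulse endpoints are documented here, so both are left to the caller.

```python
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event found in a stream of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.


def stream_events(url: str) -> Iterator[str]:
    """Issue a GET request and yield SSE data payloads as they arrive."""
    with urllib.request.urlopen(url) as resp:
        yield from iter_sse_data(raw.decode("utf-8") for raw in resp)


# Offline demo with hand-written lines instead of a live server.
sample = ['data: {"step": 1}\n', "\n", ": keep-alive\n", "data: first\n", "data: second\n", "\n"]
events = list(iter_sse_data(sample))
print(events)
```

stream_events only covers the GET endpoint (/api/papers/search/stream). The POST (SSE) endpoints would need a request body whose shape is not described in this README.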

🧪 Research Fields

arXiv Pulse supports all arXiv categories. Simply select your fields of interest in the Settings page. Pre-configured options include:

| Category | Example Fields |
| --- | --- |
| Physics | Condensed Matter, Quantum Physics, High Energy, Nuclear, Astrophysics |
| Computation | DFT, First-Principles, MD, Force Fields, Computational Physics |
| AI/ML | Machine Learning, Artificial Intelligence, Computer Vision, NLP |
| Chemistry | Quantum Chemistry, Chemical Physics |
| Math | Mathematical Physics, Numerical Analysis, Statistics |
| Others | Quantitative Biology, Electrical Engineering, Economics |

You can also add custom search queries for any topic on arXiv.
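For a sense of what a custom query looks like at the arXiv level, here is a sketch that builds a request URL for the public arXiv API (export.arxiv.org/api/query) using its documented search_query, start, max_results, sortBy, and sortOrder parameters. The helper build_arxiv_query is hypothetical, and this README does not say whether arXiv Pulse's crawler builds its requests this way.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(search_query: str, max_results: int = 10) -> str:
    """Build an arXiv API URL that returns the newest submissions matching search_query."""
    params = {
        "search_query": search_query,  # e.g. "cat:quant-ph" or 'all:"force field"'
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    # urlencode percent-encodes colons and quotes and turns spaces into "+".
    return f"{ARXIV_API}?{urlencode(params)}"


print(build_arxiv_query('cat:cond-mat.mtrl-sci AND all:"density functional theory"'))
```

The field prefixes (cat: for category, all: for all fields) and the AND/OR operators come from the arXiv API query syntax. The response is an Atom XML feed.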

πŸ› Troubleshooting

Q: Port already in use?

pulse serve . --port 3000
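If port 3000 might also be taken, the operating system can be asked for a free port. This is a short standard-library sketch; find_free_port is a hypothetical helper, not an arXiv Pulse command.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))  # port 0 lets the OS choose a free port
        return sock.getsockname()[1]


port = find_free_port()
print(f"pulse serve . --port {port}")
```

The port is released as soon as the function returns, so another process could in principle claim it before the service starts. For occasional manual use that race is rarely a problem.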

Q: Service shows "not running" but port is occupied?

pulse stop . --force
# Or remove stale lock
rm .pulse.lock

Q: How to reinitialize?

rm data/arxiv_papers.db
pulse serve .

Q: AI not responding?

  • Check API key in Settings
  • Check console for errors (F12 → Console)
  • Try foreground mode to see logs: pulse serve . -f

📄 License

GPL-3.0 - see LICENSE for details.

πŸ™ Acknowledgments

This project was developed by OpenCode, an AI coding agent.

  • Yang Li - For 500+ iterations of requirements discussions, design decisions, and testing feedback. This project would not exist without your patience and vision.
  • GLM-5 - For providing the core intelligence that powers OpenCode. ~200 million tokens consumed in bringing this project to life.
  • arXiv.org - For the open API
  • Computational materials science community - For inspiration and use cases

arXiv Pulse - Making arXiv literature tracking simple and efficient!
