CLI Reference
This document describes all available commands in the paperlib CLI.
Global Options
All commands support these global options:
- --help, -h: Show help message
- --version: Show version information
Many commands also support:
- --library, -L: Specify library root directory (default: current directory)
- --json: Output machine-readable JSON instead of human-readable format
Commands
paperlib init [PATH]
Initialize a paper library directory structure.
Arguments:
PATH: Directory to initialize (default: current directory)
Examples:
# Initialize library in current directory
paperlib init
# Initialize library in specific directory
paperlib init /path/to/my/papers
# Initialize and create parent directories
paperlib init ~/Documents/research/papers
Behavior:
- Creates standard directory structure (config/, papers/, db/, etc.)
- Safe to run multiple times (idempotent)
- Creates parent directories if they don't exist
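The idempotent behavior described above can be illustrated with a minimal Python sketch. It is not paperlib's actual implementation: the function name `init_library` is hypothetical, and the directory list is assumed from the names mentioned in this document (config/, papers/, db/, plus inbox/ and cache/ from the status output); the real tool may create others.

```python
from pathlib import Path

# Assumed layout: names taken from the init and status sections of this document.
SUBDIRS = ("config", "papers", "db", "inbox", "cache")


def init_library(root: Path) -> list:
    """Create the library layout under root and return the newly created directories."""
    created = []
    for name in SUBDIRS:
        path = root / name
        if not path.exists():
            created.append(path)
        # parents=True creates missing parent directories;
        # exist_ok=True makes a second run a harmless no-op.
        path.mkdir(parents=True, exist_ok=True)
    return created
```

Calling `init_library` twice on the same root creates everything on the first call and returns an empty list on the second, which is the property that makes rerunning init safe.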
paperlib import
Import papers into the library from various sources.
Required (one of):
- --pdf PATH: Import a local PDF file
- --arxiv ID: Import paper from arXiv by ID or URL
Options:
- --title TEXT: Override paper title (for local PDFs)
- --notes TEXT: Add notes about the paper
- --tags TAG1 TAG2: Add tags to the paper
- --library PATH: Specify library directory
- --json: Output import results in JSON format for automation
Examples:
# Import local PDF
paperlib import --pdf paper.pdf --title "My Research" --tags ml ai
# Import from arXiv
paperlib import --arxiv 2212.06340
# Import with arXiv URL
paperlib import --arxiv https://arxiv.org/abs/2212.06340
# Import to specific library
paperlib import --pdf paper.pdf --library ~/research
# Import with JSON output for automation
paperlib import --arxiv 2212.06340 --json
Behavior:
- Generates stable paper ID based on content (local) or arXiv ID
- Copies PDF to structured storage location
- Creates meta.json with paper metadata
- Prevents duplicate imports (same content/ID)
- Indexes paper in search database
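As a rough illustration of how stable IDs of the shapes shown in this document (`arxiv-2212_06340`, `local-a1b2c3d4e5f6`) could be derived, here is a hedged Python sketch. The function names are hypothetical, and the use of SHA-256 truncated to 12 hex characters for local files is an assumption; the document only says the ID is based on content. Old-style arXiv identifiers (such as `hep-th/9901001`) are not handled.

```python
import hashlib
import re


def arxiv_paper_id(id_or_url: str) -> str:
    """Map an arXiv ID or abs URL to an ID such as 'arxiv-2212_06340'."""
    # New-style arXiv IDs look like YYMM.NNNNN, optionally followed by a version suffix.
    match = re.search(r"(\d{4}\.\d{4,5})(v\d+)?/?$", id_or_url.strip())
    if match is None:
        raise ValueError(f"not a recognised arXiv identifier: {id_or_url!r}")
    return "arxiv-" + match.group(1).replace(".", "_")


def local_paper_id(pdf_bytes: bytes, length: int = 12) -> str:
    """Derive a content-based ID; the same bytes always give the same ID."""
    # Assumption: SHA-256 and a 12-character prefix; the real tool may differ.
    return "local-" + hashlib.sha256(pdf_bytes).hexdigest()[:length]
```

Because both functions are deterministic, importing the same arXiv paper or the same PDF bytes twice yields the same ID, which is what makes duplicate detection possible.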
paperlib list
List all papers in the library with their current status.
Options:
- --library PATH: Specify library directory
- --json: Output in JSON format
Examples:
# List all papers
paperlib list
# List papers in specific library
paperlib list --library ~/research
# Get machine-readable output
paperlib list --json
Output Format:
Found 3 papers:
📄 arxiv-2212_06340
The new discontinuous Galerkin methods based numerical relativity program Nmesh
By: Wolfgang Tichy, Liwei Ji, Ananya Adhikari (+2 more)
Categories: gr-qc
⏳ local-a1b2c3d4e5f6
Machine Learning Applications in Physics
Categories: cs.AI, physics.comp-ph
Status Indicators:
- ⏳ Paper imported, conversion pending
- 📄 PDF converted to Markdown
- 📝 AI summary generated
- ❌ Conversion or processing failed
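A script consuming the JSON output may want to reproduce these indicators. The sketch below is one possible mapping from the `conversion_status` and `summary_status` fields to the four icons; the function name and the precedence (failure first, then summary, then conversion) are assumptions, not documented behavior.

```python
def status_icon(conversion_status: str, summary_status: str = "pending") -> str:
    """Pick a status indicator from the two status fields (assumed precedence)."""
    if conversion_status == "failed" or summary_status == "failed":
        return "❌"  # conversion or processing failed
    if summary_status == "success":
        return "📝"  # AI summary generated
    if conversion_status == "success":
        return "📄"  # PDF converted to Markdown
    return "⏳"      # imported, conversion pending
```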
paperlib show PAPER_ID
Show detailed information about a specific paper.
Arguments:
PAPER_ID: The unique paper identifier
Options:
- --library PATH: Specify library directory
- --json: Output in JSON format
Examples:
# Show paper details
paperlib show arxiv-2212_06340
# Show with JSON output
paperlib show local-a1b2c3d4 --json
Output includes:
- All metadata fields
- Processing status
- File locations and existence
- Import timestamp
- Tags and notes
paperlib convert
Convert papers from PDF to Markdown using MinerU.
Options:
- --library PATH: Specify library directory
- --paper-id ID: Convert specific paper only
- --retry-failed: Retry papers with failed conversion status
- --force: Force reconvert all papers (including successful ones)
- --no-ui: Disable rich UI display (useful for scripting)
- --json: Output conversion results in JSON format (automatically disables UI)
Examples:
# Convert all pending papers (with rich UI)
paperlib convert
# Retry failed conversions
paperlib convert --retry-failed
# Force reconvert all papers
paperlib convert --force
# Convert specific paper
paperlib convert --paper-id arxiv-2212_06340
# Convert without UI (for scripts)
paperlib convert --no-ui
# Convert in specific library
paperlib convert --library ~/research
# Get JSON output for automation (disables UI automatically)
paperlib convert --json
paperlib convert --paper-id arxiv-2212_06340 --json
Behavior:
- Processes papers with conversion_status: pending (or failed with --retry-failed)
- Uses MinerU for PDF to Markdown conversion with CPU pipeline backend
- Shows rich UI with progress bar and live MinerU output (unless --no-ui)
- Updates metadata with conversion status
- Creates conversion logs in logs/ directory
- Post-processes markdown to fix image references (images/ → assets/)
- Handles conversion failures gracefully
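The image-reference post-processing step could look roughly like the following Python sketch. This is an illustration only: the function name is hypothetical, and it assumes the references are standard Markdown image links whose target begins with `images/`; the actual post-processor may handle other forms.

```python
import re

# Matches the opening of a Markdown image link pointing into images/,
# e.g. "![Figure 1](images/" in "![Figure 1](images/fig1.png)".
_IMAGE_LINK = re.compile(r"(!\[[^\]]*\]\()images/")


def fix_image_refs(markdown: str) -> str:
    """Rewrite image link targets from images/ to assets/, leaving other text alone."""
    return _IMAGE_LINK.sub(r"\g<1>assets/", markdown)
```

Restricting the pattern to image links, instead of replacing every occurrence of `images/`, avoids rewriting ordinary prose that happens to mention the directory.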
Rich UI Features:
- Progress bar showing papers converted
- Live streaming of MinerU output
- Current paper being processed
- Color-coded output (errors in red, progress in blue, etc.)
paperlib reindex
Rebuild the search index from stored paper metadata.
Options:
- --library PATH: Specify library directory
- --json: Output reindexing results and statistics in JSON format
Examples:
# Rebuild index
paperlib reindex
# Rebuild index for specific library
paperlib reindex --library ~/research
# Get JSON output with statistics
paperlib reindex --json
Behavior:
- Clears existing SQLite database
- Scans all meta.json files in papers/ directory
- Rebuilds full-text search index
- Reports statistics on completion
- Safe to run anytime (repairs corrupted index)
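To make the clear-and-rebuild cycle concrete, here is a simplified Python sketch. It is not paperlib's code: the table schema, the function name, and the error handling are assumptions, and it uses a plain SQLite table where the real tool builds a full-text search index.

```python
import json
import sqlite3
from collections import Counter
from pathlib import Path


def reindex(papers_dir: Path, db_path: Path) -> dict:
    """Drop and rebuild a papers table from every meta.json under papers_dir."""
    by_source = Counter()
    errors = 0
    conn = sqlite3.connect(str(db_path))
    try:
        # Clearing first means a corrupted or stale index never survives a rebuild.
        conn.execute("DROP TABLE IF EXISTS papers")
        conn.execute(
            "CREATE TABLE papers (paper_id TEXT PRIMARY KEY, title TEXT, source_type TEXT)"
        )
        for meta_path in sorted(papers_dir.rglob("meta.json")):
            try:
                meta = json.loads(meta_path.read_text(encoding="utf-8"))
                source = meta.get("source_type", "unknown")
                conn.execute(
                    "INSERT INTO papers VALUES (?, ?, ?)",
                    (meta["paper_id"], meta.get("title", ""), source),
                )
            except (json.JSONDecodeError, KeyError, sqlite3.IntegrityError):
                errors += 1  # unreadable or duplicate metadata is counted, not fatal
                continue
            by_source[source] += 1
        conn.commit()
    finally:
        conn.close()
    return {
        "papers_indexed": sum(by_source.values()),
        "errors": errors,
        "by_source_type": dict(by_source),
    }
```

Since the metadata files are the source of truth and the table is rebuilt from scratch, running the function repeatedly gives the same result.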
paperlib status
Show library configuration and layout information.
Options:
- --library PATH: Specify library directory
- --json: Output in JSON format
Examples:
# Show current library status
paperlib status
# Show specific library status
paperlib status --library ~/research
# Get JSON output for automation
paperlib status --json
Output:
root: /home/user/papers
config: /home/user/papers/config/config.toml
database: /home/user/papers/db/paperlib.sqlite3
papers: /home/user/papers/papers
inbox: /home/user/papers/inbox
cache: /home/user/papers/cache
Future Commands
These commands are planned but not yet implemented:
paperlib search QUERY
Search papers by content and metadata.
paperlib summarize [PAPER_ID]
Generate AI summaries for papers.
paperlib export FORMAT
Export papers in various formats.
paperlib doctor
Diagnose and repair library issues.
Exit Codes
paperlib commands return standard exit codes:
- 0: Success
- 1: General error (file not found, invalid arguments, etc.)
- 2: Command line argument error
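A calling script can branch on these exit codes. The Python sketch below wraps any command with `subprocess.run` and pairs the exit code with the parsed JSON, if any. The helper name is hypothetical, and it assumes that a command run with --json prints its JSON on standard output even when it fails, which this document does not state explicitly.

```python
import json
import subprocess

# Meanings as listed in this section.
EXIT_MEANINGS = {0: "success", 1: "general error", 2: "command line argument error"}


def run_json_command(argv):
    """Run a command and return (exit_code, meaning, parsed_json_or_None)."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    try:
        payload = json.loads(proc.stdout)
    except json.JSONDecodeError:
        payload = None  # no JSON on stdout (assumed possible on some failures)
    return proc.returncode, EXIT_MEANINGS.get(proc.returncode, "unknown"), payload
```

For example, `run_json_command(["paperlib", "status", "--json"])` would return the exit code together with the parsed envelope on a machine where paperlib is installed.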
Configuration
paperlib looks for configuration in these locations (in order):
- $LIBRARY_ROOT/config/config.toml
- ~/.config/paperlib/config.toml
- Built-in defaults
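The lookup order can be sketched in Python as a first-match search. This assumes the first existing file wins outright; whether paperlib instead merges values across files is not stated here. The function name and the explicit `home` parameter are hypothetical, chosen to keep the sketch easy to test.

```python
from pathlib import Path
from typing import Optional


def find_config(library_root: Path, home: Path) -> Optional[Path]:
    """Return the first config file that exists, or None to signal built-in defaults."""
    candidates = [
        library_root / "config" / "config.toml",        # $LIBRARY_ROOT/config/config.toml
        home / ".config" / "paperlib" / "config.toml",  # ~/.config/paperlib/config.toml
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # caller falls back to built-in defaults
```

In normal use the second argument would be `Path.home()`.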
JSON Output Format
When using --json, commands output structured data suitable for programmatic consumption. All JSON responses follow a consistent envelope format with standard fields:
Standard Response Envelope
Success Response:
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
// Command-specific data fields below
}
Error Response:
{
"success": false,
"timestamp": "2024-01-15T10:30:00.000Z",
"error": "Error message here",
"error_code": 1
}
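A consumer can handle both envelope shapes in one place. The following Python sketch parses a response and raises on failure; the exception class and function name are hypothetical, and only the fields shown above (`success`, `error`, `error_code`) are relied on.

```python
import json


class PaperlibError(Exception):
    """Raised when an envelope reports success: false."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


def parse_envelope(text: str) -> dict:
    """Return the parsed envelope, or raise PaperlibError for an error response."""
    payload = json.loads(text)
    if not payload.get("success", False):
        raise PaperlibError(payload.get("error", "unknown error"), payload.get("error_code"))
    return payload
```

Centralizing the check means command-specific code only ever sees successful responses.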
Command-Specific JSON Formats
paperlib status --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"library_root": "/home/user/papers",
"config_path": "/home/user/papers/config/config.toml",
"database_path": "/home/user/papers/db/paperlib.sqlite3",
"papers_dir": "/home/user/papers/papers",
"inbox_dir": "/home/user/papers/inbox",
"cache_dir": "/home/user/papers/cache"
}
paperlib list --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"papers": [
{
"paper_id": "arxiv-2212_06340",
"source_type": "arxiv",
"source_id": "2212.06340",
"title": "Example Paper",
"authors": ["Alice Smith", "Bob Jones"],
"published_date": "2022-12-06T00:00:00.000Z",
"categories": ["cs.AI"],
"conversion_status": "success",
"summary_status": "pending",
"imported_at": "2024-01-15T10:30:00.000Z",
"tags": [],
"notes": ""
}
],
"total": 1
}
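As a usage illustration, the list output can be grouped by conversion status, for example to find papers that still need converting. The function name is hypothetical; the field names come from the sample above.

```python
import json
from collections import defaultdict


def ids_by_conversion_status(list_output: str) -> dict:
    """Group paper IDs from `paperlib list --json` output by conversion_status."""
    payload = json.loads(list_output)
    groups = defaultdict(list)
    for paper in payload["papers"]:
        groups[paper["conversion_status"]].append(paper["paper_id"])
    return dict(groups)
```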
paperlib show <paper_id> --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"paper": {
"paper_id": "arxiv-2212_06340",
"source_type": "arxiv",
"source_id": "2212.06340",
"title": "Example Paper",
"authors": ["Alice Smith", "Bob Jones"],
"conversion_status": "success",
"summary_status": "pending",
"pdf_path": "papers/arxiv/2022/arxiv-2212_06340.pdf",
"paper_md_path": "papers/arxiv/2022/arxiv-2212_06340.md",
"files_status": {
"pdf_exists": true,
"markdown_exists": true,
"summary_exists": false
}
}
}
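The `files_status` block lends itself to a quick consistency check. This small Python sketch lists which files are reported missing; the function name is hypothetical, and it assumes every key in `files_status` follows the `<name>_exists` pattern seen in the sample.

```python
def missing_files(show_payload: dict) -> list:
    """Return the names whose <name>_exists flag is false in a show --json payload."""
    status = show_payload["paper"]["files_status"]
    suffix = "_exists"
    return sorted(
        key[: -len(suffix)]
        for key, present in status.items()
        if key.endswith(suffix) and not present
    )
```

Applied to the sample above, it reports only the summary as missing.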
paperlib import --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"paper_id": "arxiv-2212_06340",
"title": "Example Paper Title",
"source_type": "arxiv",
"source_id": "2212.06340",
"authors": ["Alice Smith", "Bob Jones"],
"message": "Successfully imported arXiv paper",
"paper": {
// Full paper metadata object
}
}
paperlib convert --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"action": "convert_pending",
"success_count": 5,
"failure_count": 1,
"total_attempted": 6
}
For single paper conversion (--paper-id):
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"paper_id": "arxiv-2212_06340",
"conversion_success": true,
"conversion_status": "success",
"message": "Successfully converted paper"
}
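Because convert returns two different shapes, a consumer has to tell them apart. One way, sketched below in Python, is to check for the `paper_id` key, which appears only in the single-paper response shown here. The function name is hypothetical, and keying on `paper_id` is an assumption drawn from these two samples.

```python
def summarize_convert(payload: dict) -> str:
    """Produce a one-line summary for either convert --json response shape."""
    if "paper_id" in payload:
        # Single-paper shape (--paper-id).
        outcome = "ok" if payload.get("conversion_success") else "failed"
        return f"{payload['paper_id']}: {outcome} ({payload.get('conversion_status')})"
    # Batch shape.
    return (
        f"{payload['success_count']}/{payload['total_attempted']} converted, "
        f"{payload['failure_count']} failed"
    )
```

Note that in the batch sample `success` is true even though `failure_count` is 1, so scripts that care about individual failures should inspect the counts and not only the envelope flag.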
paperlib reindex --json
{
"success": true,
"timestamp": "2024-01-15T10:30:00.000Z",
"reindex_complete": true,
"papers_indexed": 42,
"errors": 1,
"statistics": {
"total_papers": 42,
"by_source_type": {
"arxiv": 38,
"local": 4
}
}
}
JSON Data Types
- Timestamps: Always in ISO 8601 format (YYYY-MM-DDTHH:mm:ss.sssZ)
- Paper IDs: String identifiers (e.g., "arxiv-2212_06340", "local-a1b2c3d4")
- Status Fields: String enums ("pending", "success", "failed")
- Authors: Array of strings
- Categories/Tags: Array of strings
- File Paths: Relative to library root
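Timestamps in this format can be parsed with the Python standard library. The sketch below rewrites the trailing `Z` as an explicit UTC offset, because `datetime.fromisoformat` only accepts `Z` directly from Python 3.11 onward. The function name is hypothetical.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a timestamp such as '2024-01-15T10:30:00.000Z' into an aware datetime."""
    if value.endswith("Z"):
        # "Z" means UTC; spell it as an offset so older Python versions accept it.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)
```

The result is timezone-aware, so it can be compared safely with other UTC datetimes.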
This JSON format is stable across paperlib versions for reliable automation and scripting.