CLI Reference

This document describes all available commands in the paperlib CLI.

Global Options

All commands support these global options:

  • --help, -h: Show help message
  • --version: Show version information

Many commands also support:

  • --library, -L: Specify library root directory (default: current directory)
  • --json: Output machine-readable JSON instead of human-readable format

Commands

paperlib init [PATH]

Initialize a paper library directory structure.

Arguments:

  • PATH: Directory to initialize (default: current directory)

Examples:

# Initialize library in current directory
paperlib init

# Initialize library in specific directory
paperlib init /path/to/my/papers

# Initialize and create parent directories
paperlib init ~/Documents/research/papers

Behavior:

  • Creates standard directory structure (config/, papers/, db/, etc.)
  • Safe to run multiple times (idempotent)
  • Creates parent directories if they don't exist
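
The idempotent behavior listed above can be sketched in Python. This is an illustration only, not paperlib's actual implementation: the function name `init_library` is hypothetical, and the directory list is an assumption assembled from the directory names that appear in this document (the real layout may contain others).

```python
from pathlib import Path

# Assumed layout: names taken from directories mentioned in this document.
ASSUMED_DIRS = ("config", "papers", "db", "inbox", "cache", "logs")


def init_library(root: str) -> list[Path]:
    """Create the assumed layout under root; return the directories newly created."""
    root_path = Path(root).expanduser()
    created = []
    for name in ASSUMED_DIRS:
        directory = root_path / name
        if not directory.is_dir():
            created.append(directory)
        # parents=True creates missing parent directories;
        # exist_ok=True turns a repeated run into a no-op.
        directory.mkdir(parents=True, exist_ok=True)
    return created
```

Calling the function twice on the same path returns an empty list the second time, which is the property that makes repeated runs safe.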

paperlib import

Import papers into the library from various sources.

Required (one of):

  • --pdf PATH: Import a local PDF file
  • --arxiv ID: Import paper from arXiv by ID or URL

Options:

  • --title TEXT: Override paper title (for local PDFs)
  • --notes TEXT: Add notes about the paper
  • --tags TAG1 TAG2: Add tags to the paper
  • --library PATH: Specify library directory
  • --json: Output import results in JSON format for automation

Examples:

# Import local PDF
paperlib import --pdf paper.pdf --title "My Research" --tags ml ai

# Import from arXiv
paperlib import --arxiv 2212.06340

# Import with arXiv URL
paperlib import --arxiv https://arxiv.org/abs/2212.06340

# Import to specific library
paperlib import --pdf paper.pdf --library ~/research

# Import with JSON output for automation
paperlib import --arxiv 2212.06340 --json

Behavior:

  • Generates stable paper ID based on content (local) or arXiv ID
  • Copies PDF to structured storage location
  • Creates meta.json with paper metadata
  • Prevents duplicate imports (same content/ID)
  • Indexes paper in search database
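
As a rough sketch of how stable IDs of the form shown in this document (`arxiv-2212_06340`, `local-a1b2c3d4e5f6`) could be derived: the helper names below are hypothetical, and the choice of SHA-256 with a 12-character prefix is an assumption, since this reference does not state which hash paperlib uses.

```python
import hashlib
import re

# Matches new-style arXiv identifiers such as 2212.06340, with an optional version suffix.
_ARXIV_ID = re.compile(r"(\d{4}\.\d{4,5})(v\d+)?")


def arxiv_paper_id(id_or_url: str) -> str:
    """Derive a paper ID from an arXiv ID or abs URL (hypothetical helper)."""
    match = _ARXIV_ID.search(id_or_url)
    if match is None:
        raise ValueError(f"not a recognizable arXiv identifier: {id_or_url!r}")
    # The examples in this document replace the dot with an underscore.
    return "arxiv-" + match.group(1).replace(".", "_")


def local_paper_id(pdf_bytes: bytes, length: int = 12) -> str:
    """Derive a content-based ID; SHA-256 and the prefix length are assumptions."""
    return "local-" + hashlib.sha256(pdf_bytes).hexdigest()[:length]
```

Because the local ID depends only on the file content, importing the same PDF twice yields the same ID, which is one way duplicate imports can be detected.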

paperlib list

List all papers in the library with their current status.

Options:

  • --library PATH: Specify library directory
  • --json: Output in JSON format

Examples:

# List all papers
paperlib list

# List papers in specific library
paperlib list --library ~/research

# Get machine-readable output
paperlib list --json

Output Format:

Found 3 papers:

📄 arxiv-2212_06340
   The new discontinuous Galerkin methods based numerical relativity program Nmesh
   By: Wolfgang Tichy, Liwei Ji, Ananya Adhikari (+2 more)
   Categories: gr-qc

⏳ local-a1b2c3d4e5f6
   Machine Learning Applications in Physics
   Categories: cs.AI, physics.comp-ph

Status Indicators:

  • ⏳ Paper imported, conversion pending
  • 📄 PDF converted to Markdown
  • 📝 AI summary generated
  • Conversion or processing failed

paperlib show PAPER_ID

Show detailed information about a specific paper.

Arguments:

  • PAPER_ID: The unique paper identifier

Options:

  • --library PATH: Specify library directory
  • --json: Output in JSON format

Examples:

# Show paper details
paperlib show arxiv-2212_06340

# Show with JSON output
paperlib show local-a1b2c3d4 --json

Output includes:

  • All metadata fields
  • Processing status
  • File locations and existence
  • Import timestamp
  • Tags and notes

paperlib convert

Convert papers from PDF to Markdown using MinerU.

Options:

  • --library PATH: Specify library directory
  • --paper-id ID: Convert specific paper only
  • --retry-failed: Retry papers with failed conversion status
  • --force: Force reconvert all papers (including successful ones)
  • --no-ui: Disable rich UI display (useful for scripting)
  • --json: Output conversion results in JSON format (automatically disables UI)

Examples:

# Convert all pending papers (with rich UI)
paperlib convert

# Retry failed conversions
paperlib convert --retry-failed

# Force reconvert all papers
paperlib convert --force

# Convert specific paper
paperlib convert --paper-id arxiv-2212_06340

# Convert without UI (for scripts)
paperlib convert --no-ui

# Convert in specific library
paperlib convert --library ~/research

# Get JSON output for automation (disables UI automatically)
paperlib convert --json
paperlib convert --paper-id arxiv-2212_06340 --json

Behavior:

  • Processes papers with conversion_status: pending (or failed with --retry-failed)
  • Uses MinerU for PDF to Markdown conversion with CPU pipeline backend
  • Shows rich UI with progress bar and live MinerU output (unless --no-ui)
  • Updates metadata with conversion status
  • Creates conversion logs in logs/ directory
  • Post-processes markdown to fix image references (images/assets/)
  • Handles conversion failures gracefully
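
The way `--retry-failed` and `--force` widen the set of papers to process can be sketched as a small selection function. This is a hedged illustration of the rules stated above, not paperlib's code: `select_for_conversion` is a hypothetical name, and papers are assumed to be dictionaries carrying a `conversion_status` field as in the JSON examples in this document.

```python
def select_for_conversion(papers, retry_failed=False, force=False):
    """Return the papers a convert run would process under the stated rules."""
    if force:
        # --force reconverts everything, including successful conversions.
        return list(papers)
    wanted = {"pending"}
    if retry_failed:
        # --retry-failed adds papers whose previous conversion failed.
        wanted.add("failed")
    return [p for p in papers if p["conversion_status"] in wanted]
```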

Rich UI Features:

  • Progress bar showing papers converted
  • Live streaming of MinerU output
  • Current paper being processed
  • Color-coded output (errors in red, progress in blue, etc.)

paperlib reindex

Rebuild the search index from stored paper metadata.

Options:

  • --library PATH: Specify library directory
  • --json: Output reindexing results and statistics in JSON format

Examples:

# Rebuild index
paperlib reindex

# Rebuild index for specific library
paperlib reindex --library ~/research

# Get JSON output with statistics
paperlib reindex --json

Behavior:

  • Clears existing SQLite database
  • Scans all meta.json files in papers/ directory
  • Rebuilds full-text search index
  • Reports statistics on completion
  • Safe to run anytime (repairs corrupted index)
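
The rebuild steps above can be approximated with Python's standard `sqlite3` module. The sketch below is not paperlib's implementation: the function name and the table schema are assumptions, and a plain table stands in for the full-text index to keep the example short.

```python
import json
import sqlite3
from pathlib import Path


def reindex(library_root: str) -> dict:
    """Rebuild a simplified index from meta.json files (assumed schema)."""
    root = Path(library_root)
    db_path = root / "db" / "paperlib.sqlite3"
    db_path.parent.mkdir(parents=True, exist_ok=True)
    indexed, errors = 0, 0
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on an exception
            # Clear the existing index, then repopulate it from disk.
            conn.execute("DROP TABLE IF EXISTS papers")
            conn.execute(
                "CREATE TABLE papers (paper_id TEXT PRIMARY KEY, title TEXT, meta TEXT)"
            )
            for meta_file in sorted((root / "papers").rglob("meta.json")):
                try:
                    meta = json.loads(meta_file.read_text(encoding="utf-8"))
                    conn.execute(
                        "INSERT INTO papers VALUES (?, ?, ?)",
                        (meta["paper_id"], meta.get("title", ""), json.dumps(meta)),
                    )
                    indexed += 1
                except (json.JSONDecodeError, KeyError, sqlite3.IntegrityError):
                    # Unreadable or duplicate metadata is counted, not fatal.
                    errors += 1
    finally:
        conn.close()
    return {"papers_indexed": indexed, "errors": errors}
```

Since the table is dropped and recreated on every run, the metadata files on disk remain the source of truth and the index can be regenerated at any time.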

paperlib status

Show library configuration and layout information.

Options:

  • --library PATH: Specify library directory
  • --json: Output in JSON format

Examples:

# Show current library status
paperlib status

# Show specific library status
paperlib status --library ~/research

# Get JSON output for automation
paperlib status --json

Output:

root: /home/user/papers
config: /home/user/papers/config/config.toml
database: /home/user/papers/db/paperlib.sqlite3
papers: /home/user/papers/papers
inbox: /home/user/papers/inbox
cache: /home/user/papers/cache

Future Commands

These commands are planned but not yet implemented:

paperlib search QUERY

Search papers by content and metadata.

paperlib summarize [PAPER_ID]

Generate AI summaries for papers.

paperlib export FORMAT

Export papers in various formats.

paperlib doctor

Diagnose and repair library issues.


Exit Codes

paperlib commands return standard exit codes:

  • 0: Success
  • 1: General error (file not found, invalid arguments, etc.)
  • 2: Command line argument error
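
A script that wraps the CLI might map these exit codes to readable labels. The following is a minimal sketch using only the Python standard library; `run_and_classify` is a hypothetical helper, and the labels simply restate the list above.

```python
import subprocess

# Labels restate the exit codes documented above.
EXIT_MEANINGS = {
    0: "success",
    1: "general error",
    2: "command line argument error",
}


def run_and_classify(argv: list[str]) -> tuple[int, str]:
    """Run a command and pair its exit code with the documented meaning."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    meaning = EXIT_MEANINGS.get(completed.returncode, "undocumented exit code")
    return completed.returncode, meaning
```

For example, `run_and_classify(["paperlib", "list", "--json"])` would return the exit code of that invocation together with its label, assuming `paperlib` is on the `PATH`.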

Configuration

paperlib looks for configuration in these locations (in order):

  1. $LIBRARY_ROOT/config/config.toml
  2. ~/.config/paperlib/config.toml
  3. Built-in defaults
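
The lookup order can be sketched as a first-match search. This assumes that the first existing file wins outright rather than being merged with later ones, which this document does not specify; `find_config` and its `home` parameter are hypothetical names used for illustration.

```python
from pathlib import Path
from typing import Optional


def find_config(library_root: str, home: Optional[str] = None) -> Optional[Path]:
    """Return the first existing config file, or None to signal built-in defaults."""
    home_path = Path(home) if home is not None else Path.home()
    candidates = [
        Path(library_root) / "config" / "config.toml",       # 1. library-local config
        home_path / ".config" / "paperlib" / "config.toml",  # 2. per-user config
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None  # 3. caller falls back to built-in defaults
```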

JSON Output Format

When using --json, commands output structured data suitable for programmatic consumption. All JSON responses follow a consistent envelope format with standard fields:

Standard Response Envelope

Success Response:

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  // Command-specific data fields below
}

Error Response:

{
  "success": false,
  "timestamp": "2024-01-15T10:30:00.000Z", 
  "error": "Error message here",
  "error_code": 1
}
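
A consumer of this envelope might check `success` first and turn error responses into exceptions. The sketch below relies only on the fields shown above; the `PaperlibError` class and `parse_envelope` function are hypothetical names, not part of paperlib.

```python
import json


class PaperlibError(Exception):
    """Raised when an envelope reports success: false (hypothetical class)."""

    def __init__(self, message, error_code):
        super().__init__(message)
        self.error_code = error_code


def parse_envelope(raw: str) -> dict:
    """Parse JSON output and raise if the envelope reports a failure."""
    payload = json.loads(raw)
    if not payload.get("success", False):
        raise PaperlibError(
            payload.get("error", "unknown error"),
            payload.get("error_code"),
        )
    return payload
```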

Command-Specific JSON Formats

paperlib status --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "library_root": "/home/user/papers",
  "config_path": "/home/user/papers/config/config.toml",
  "database_path": "/home/user/papers/db/paperlib.sqlite3",
  "papers_dir": "/home/user/papers/papers",
  "inbox_dir": "/home/user/papers/inbox",
  "cache_dir": "/home/user/papers/cache"
}

paperlib list --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "papers": [
    {
      "paper_id": "arxiv-2212_06340",
      "source_type": "arxiv",
      "source_id": "2212.06340", 
      "title": "Example Paper",
      "authors": ["Alice Smith", "Bob Jones"],
      "published_date": "2022-12-06T00:00:00.000Z",
      "categories": ["cs.AI"],
      "conversion_status": "success",
      "summary_status": "pending",
      "imported_at": "2024-01-15T10:30:00.000Z",
      "tags": [],
      "notes": ""
    }
  ],
  "total": 1
}
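
Once parsed, the list response can be summarized with a few lines of Python. This is a sketch against the fields shown above; `summarize_list` is a hypothetical helper.

```python
from collections import Counter


def summarize_list(payload: dict) -> dict:
    """Count papers per conversion status and collect the pending IDs."""
    papers = payload["papers"]
    return {
        "total": payload.get("total", len(papers)),
        "by_conversion_status": dict(Counter(p["conversion_status"] for p in papers)),
        "pending_ids": [
            p["paper_id"] for p in papers if p["conversion_status"] == "pending"
        ],
    }
```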

paperlib show <paper_id> --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "paper": {
    "paper_id": "arxiv-2212_06340",
    "source_type": "arxiv",
    "source_id": "2212.06340",
    "title": "Example Paper",
    "authors": ["Alice Smith", "Bob Jones"],
    "conversion_status": "success",
    "summary_status": "pending",
    "pdf_path": "papers/arxiv/2022/arxiv-2212_06340.pdf",
    "paper_md_path": "papers/arxiv/2022/arxiv-2212_06340.md",
    "files_status": {
      "pdf_exists": true,
      "markdown_exists": true,
      "summary_exists": false
    }
  }
}
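
Because the paths in this response are relative to the library root, a script usually has to join them with the root before opening files. A minimal sketch follows, with `resolve_paper_files` as a hypothetical helper name.

```python
from pathlib import Path


def resolve_paper_files(payload: dict, library_root: str) -> dict:
    """Map each path field of a show response to an absolute path."""
    paper = payload["paper"]
    root = Path(library_root)
    return {
        key: root / paper[key]
        for key in ("pdf_path", "paper_md_path")
        if paper.get(key)  # skip fields that are absent or empty
    }
```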

paperlib import --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "paper_id": "arxiv-2212_06340",
  "title": "Example Paper Title",
  "source_type": "arxiv",
  "source_id": "2212.06340",
  "authors": ["Alice Smith", "Bob Jones"],
  "message": "Successfully imported arXiv paper",
  "paper": {
    // Full paper metadata object
  }
}

paperlib convert --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "action": "convert_pending",
  "success_count": 5,
  "failure_count": 1,
  "total_attempted": 6
}

For single-paper conversion (--paper-id):

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "paper_id": "arxiv-2212_06340",
  "conversion_success": true,
  "conversion_status": "success",
  "message": "Successfully converted paper"
}
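
Note that the batch example reports "success": true even though failure_count is 1, so a script that needs every paper converted should inspect the counts as well. One way to handle both response shapes is sketched below; `conversion_ok` is a hypothetical helper that uses only the fields shown above.

```python
def conversion_ok(payload: dict) -> bool:
    """Return True only if every attempted conversion succeeded."""
    if "paper_id" in payload:
        # Single-paper shape (--paper-id).
        return bool(payload["conversion_success"])
    # Batch shape: the envelope can report success while some papers failed.
    return payload["failure_count"] == 0
```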

paperlib reindex --json

{
  "success": true,
  "timestamp": "2024-01-15T10:30:00.000Z",
  "reindex_complete": true,
  "papers_indexed": 42,
  "errors": 1,
  "statistics": {
    "total_papers": 42,
    "by_source_type": {
      "arxiv": 38,
      "local": 4
    }
  }
}

JSON Data Types

  • Timestamps: Always in ISO 8601 format (YYYY-MM-DDTHH:mm:ss.sssZ)
  • Paper IDs: String identifiers (e.g., "arxiv-2212_06340", "local-a1b2c3d4")
  • Status Fields: String enums ("pending", "success", "failed")
  • Authors: Array of strings
  • Categories/Tags: Array of strings
  • File Paths: Relative to library root
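
For consumers written in Python, the timestamp and status conventions above can be handled with the standard library. This is a small sketch; the helper names are hypothetical.

```python
from datetime import datetime

# Status values as listed above.
VALID_STATUSES = {"pending", "success", "failed"}


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2024-01-15T10:30:00.000Z."""
    # datetime.fromisoformat accepts a trailing "Z" only from Python 3.11 on,
    # so the suffix is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_status(value: str) -> str:
    """Return the status unchanged, or raise if it is not a documented value."""
    if value not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {value!r}")
    return value
```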

This JSON format is stable across paperlib versions, so scripts and automation can rely on it.