# CLI Reference

This document describes all available commands in the paperlib CLI.

## Global Options

All commands support these global options:

- `--help`, `-h`: Show help message
- `--version`: Show version information

Many commands also support:

- `--library`, `-L`: Specify library root directory (default: current directory)
- `--json`: Output machine-readable JSON instead of human-readable format

## Commands

### `paperlib init [PATH]`

Initialize a paper library directory structure.

**Arguments:**
- `PATH`: Directory to initialize (default: current directory)

**Examples:**
```bash
# Initialize library in current directory
paperlib init

# Initialize library in specific directory
paperlib init /path/to/my/papers

# Initialize and create parent directories
paperlib init ~/Documents/research/papers
```

**Behavior:**
- Creates standard directory structure (config/, papers/, db/, etc.)
- Safe to run multiple times (idempotent)
- Creates parent directories if they don't exist

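
The idempotent, parent-creating behavior of `paperlib init` can be approximated in plain shell with `mkdir -p`. This is a minimal sketch, not paperlib's actual implementation: the function name `init_layout` is hypothetical, and the directory set is an assumption assembled from names that appear in this document (`config/`, `papers/`, `db/`, plus the `inbox` and `cache` paths printed by `paperlib status`).

```bash
# Hypothetical helper: create a library layout under the given root.
# mkdir -p creates missing parent directories and succeeds when a directory
# already exists, so calling this repeatedly is harmless.
init_layout() {
    root="${1:-.}"
    for dir in config papers db inbox cache; do
        mkdir -p "$root/$dir" || return 1
    done
}
```
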
---

### `paperlib import`

Import papers into the library from various sources.

**Required (one of):**
- `--pdf PATH`: Import a local PDF file
- `--arxiv ID`: Import paper from arXiv by ID or URL

**Options:**
- `--title TEXT`: Override paper title (for local PDFs)
- `--notes TEXT`: Add notes about the paper
- `--tags TAG1 TAG2`: Add tags to the paper
- `--library PATH`: Specify library directory

**Examples:**
```bash
# Import local PDF
paperlib import --pdf paper.pdf --title "My Research" --tags ml ai

# Import from arXiv
paperlib import --arxiv 2212.06340

# Import with arXiv URL
paperlib import --arxiv https://arxiv.org/abs/2212.06340

# Import to specific library
paperlib import --pdf paper.pdf --library ~/research
```

**Behavior:**
- Generates stable paper ID based on content (local) or arXiv ID
- Copies PDF to structured storage location
- Creates meta.json with paper metadata
- Prevents duplicate imports (same content/ID)
- Indexes paper in search database

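
The paper IDs used in this document (`arxiv-2212_06340`, `local-a1b2c3d4e5f6`) suggest how stable IDs might be derived. The sketch below is an illustration under stated assumptions, not paperlib's real scheme: both function names are hypothetical, the arXiv helper only handles new-style IDs such as `2212.06340`, and the use of a 12-character SHA-256 prefix for local files is a guess (the document only says the ID is based on content). It also assumes `sha256sum` from GNU coreutils is available.

```bash
# Hypothetical helper: accept a bare arXiv ID or an abs URL and print an ID
# in the style shown in this document (e.g. arxiv-2212_06340).
arxiv_paper_id() {
    id="${1##*/}"    # drop everything up to and including the last "/"
    printf 'arxiv-%s\n' "$(printf '%s' "$id" | tr '.' '_')"
}

# Hypothetical helper: derive a content-based ID for a local PDF.
# Assumption: the first 12 hex digits of the file's SHA-256 digest.
# Identical content always yields the same ID, which is what makes
# duplicate detection by ID possible.
local_paper_id() {
    [ -r "$1" ] || return 1
    digest="$(sha256sum "$1" | cut -c1-12)"
    printf 'local-%s\n' "$digest"
}
```
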
---

### `paperlib list`

List all papers in the library with their current status.

**Options:**
- `--library PATH`: Specify library directory
- `--json`: Output in JSON format

**Examples:**
```bash
# List all papers
paperlib list

# List papers in specific library
paperlib list --library ~/research

# Get machine-readable output
paperlib list --json
```

**Output Format:**
```
Found 2 papers:

📄 arxiv-2212_06340
The new discontinuous Galerkin methods based numerical relativity program Nmesh
By: Wolfgang Tichy, Liwei Ji, Ananya Adhikari (+2 more)
Categories: gr-qc

⏳ local-a1b2c3d4e5f6
Machine Learning Applications in Physics
Categories: cs.AI, physics.comp-ph
```

**Status Indicators:**
- ⏳ Paper imported, conversion pending
- 📄 PDF converted to Markdown
- 📝 AI summary generated
- ❌ Conversion or processing failed

---

### `paperlib show PAPER_ID`

Show detailed information about a specific paper.

**Arguments:**
- `PAPER_ID`: The unique paper identifier

**Options:**
- `--library PATH`: Specify library directory
- `--json`: Output in JSON format

**Examples:**
```bash
# Show paper details
paperlib show arxiv-2212_06340

# Show with JSON output
paperlib show local-a1b2c3d4e5f6 --json
```

**Output includes:**
- All metadata fields
- Processing status
- File locations and existence
- Import timestamp
- Tags and notes

---

### `paperlib convert`

Convert papers from PDF to Markdown using MinerU.

**Options:**
- `--library PATH`: Specify library directory
- `--paper-id ID`: Convert specific paper only
- `--retry-failed`: Retry papers with failed conversion status
- `--force`: Force reconvert all papers (including successful ones)
- `--no-ui`: Disable rich UI display (useful for scripting)

**Examples:**
```bash
# Convert all pending papers (with rich UI)
paperlib convert

# Retry failed conversions
paperlib convert --retry-failed

# Force reconvert all papers
paperlib convert --force

# Convert specific paper
paperlib convert --paper-id arxiv-2212_06340

# Convert without UI (for scripts)
paperlib convert --no-ui

# Convert in specific library
paperlib convert --library ~/research
```

**Behavior:**
- Processes papers with `conversion_status: pending` (or failed with `--retry-failed`)
- Uses MinerU for PDF-to-Markdown conversion with CPU pipeline backend
- Shows rich UI with progress bar and live MinerU output (unless `--no-ui`)
- Updates metadata with conversion status
- Creates conversion logs in `logs/` directory
- Post-processes Markdown to fix image references (`images/` → `assets/`)
- Handles conversion failures gracefully

**Rich UI Features:**
- Progress bar showing papers converted
- Live streaming of MinerU output
- Current paper being processed
- Color-coded output (errors in red, progress in blue, etc.)

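
The image-reference post-processing step (`images/` → `assets/`) can be pictured as a simple text rewrite. The following is a minimal sketch using `sed`, assuming Markdown image links of the form `![alt](images/...)`. The function name `fix_image_refs` is hypothetical, and the rewrite paperlib actually performs may cover more cases.

```bash
# Hypothetical helper: read Markdown on stdin and rewrite link targets that
# start with images/ so they start with assets/ instead. Only text that
# directly follows "](" is touched, so the word "images/" in prose is kept.
fix_image_refs() {
    sed 's|](images/|](assets/|g'
}
```
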
---

### `paperlib reindex`

Rebuild the search index from stored paper metadata.

**Options:**
- `--library PATH`: Specify library directory

**Examples:**
```bash
# Rebuild index
paperlib reindex

# Rebuild index for specific library
paperlib reindex --library ~/research
```

**Behavior:**
- Clears existing SQLite database
- Scans all meta.json files in papers/ directory
- Rebuilds full-text search index
- Reports statistics on completion
- Safe to run anytime (repairs corrupted index)

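
Because the index is rebuilt from the `meta.json` files under `papers/`, a quick count of those files gives a rough idea of how many papers a reindex will read. This is a sketch only: the function name `count_meta_files` is hypothetical, and it assumes each paper's `meta.json` lives somewhere below `papers/` in the library root.

```bash
# Hypothetical helper: count meta.json files below <library>/papers.
# tr strips the padding that some wc implementations add to the count.
count_meta_files() {
    root="${1:-.}"
    find "$root/papers" -type f -name meta.json | wc -l | tr -d ' '
}
```
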
---

### `paperlib status`

Show library configuration and layout information.

**Options:**
- `--library PATH`: Specify library directory
- `--json`: Output in JSON format

**Examples:**
```bash
# Show current library status
paperlib status

# Show specific library status
paperlib status --library ~/research
```

**Output:**
```
root: /home/user/papers
config: /home/user/papers/config/config.toml
database: /home/user/papers/db/paperlib.sqlite3
papers: /home/user/papers/papers
inbox: /home/user/papers/inbox
cache: /home/user/papers/cache
```

---

## Future Commands

These commands are planned but not yet implemented:

### `paperlib search QUERY`
Search papers by content and metadata.

### `paperlib summarize [PAPER_ID]`
Generate AI summaries for papers.

### `paperlib export FORMAT`
Export papers in various formats.

### `paperlib doctor`
Diagnose and repair library issues.

---

## Exit Codes

paperlib commands return standard exit codes:

- `0`: Success
- `1`: General error (file not found, invalid arguments, etc.)
- `2`: Command-line argument error

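
In scripts, these exit codes can be checked after each command. The wrapper below is a minimal sketch; the function name `run_and_report` and the message wording are hypothetical, and only the three codes listed above are interpreted.

```bash
# Hypothetical helper: run a command, print a short description of its exit
# status to stderr, and pass the original status through to the caller.
run_and_report() {
    "$@"
    status=$?
    case "$status" in
        0) echo "ok" >&2 ;;
        1) echo "general error" >&2 ;;
        2) echo "command-line argument error" >&2 ;;
        *) echo "unexpected exit code $status" >&2 ;;
    esac
    return "$status"
}
```

A script might call it as `run_and_report paperlib convert --no-ui --library ~/research` and stop when the return value is non-zero.
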
## Configuration

paperlib looks for configuration in these locations (in order):
1. `$LIBRARY_ROOT/config/config.toml`
2. `~/.config/paperlib/config.toml`
3. Built-in defaults

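
The lookup order can be mirrored in a few lines of shell, for example to check which file would be picked up. This sketch assumes the first existing file wins; the document does not say whether settings from several locations are merged. The function name `find_config` is hypothetical.

```bash
# Hypothetical helper: print the first config file that exists, following the
# documented order, or "built-in defaults" when neither file is present.
find_config() {
    library_root="${1:-.}"
    for candidate in \
        "$library_root/config/config.toml" \
        "$HOME/.config/paperlib/config.toml"
    do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf '%s\n' "built-in defaults"
}
```

For example, `find_config ~/research` prints the library-local file when it exists and falls back to the per-user file otherwise.
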
## JSON Output Format

When using `--json`, commands output structured data suitable for programmatic consumption:

```json
{
  "papers": [
    {
      "paper_id": "arxiv-2212_06340",
      "title": "Example Paper",
      "authors": ["Alice Smith", "Bob Jones"],
      "conversion_status": "success",
      "imported_at": "2024-01-15T10:30:00"
    }
  ],
  "total": 1
}
```

This format is stable across paperlib versions for reliable automation.

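
As an illustration of programmatic consumption, the sketch below filters paper IDs by conversion status with `jq`. It assumes `jq` is installed and that `paperlib list --json` produces the `papers` array shape shown in this section; the function name `ids_with_status` is hypothetical.

```bash
# Hypothetical helper: read JSON in the shape shown above on stdin and print
# the paper_id of every entry whose conversion_status equals the argument.
ids_with_status() {
    jq -r --arg status "$1" \
        '.papers[] | select(.conversion_status == $status) | .paper_id'
}
```

A typical call would be `paperlib list --json | ids_with_status pending`, whose output could then be fed to `paperlib convert --paper-id` one ID at a time.
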