# AGENTS.md

## Project overview

`paperlib` is a local-first paper library engine with a CLI.

**Key point**: `paperlib` is **not** primarily an AI app. AI summarization is optional enrichment. The project must remain useful without LLM configuration.

## Critical design principles

1. **Local-first**: User data lives locally. Prefer plain files + SQLite over opaque state.
2. **CLI-first**: The CLI is the primary interface. The Python API is secondary.
3. **JSON files are source of truth**: Per-paper JSON files are the durable truth. SQLite is a rebuildable index/cache.
4. **AI is optional**: Core workflows (import/convert/index/list/show/search) work without AI.
5. **Machine-readable**: Commands support `--json` output for automation.
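
Principle 3 can be illustrated with a minimal sketch of an index rebuild that reads only the per-paper JSON files. The directory layout (one `meta.json` per paper directory), the `id` and `title` keys, the `papers` table, and the `rebuild_index` name are all assumptions made for illustration; the actual layout is described in `dev-docs/data-model.md`.

```python
import json
import sqlite3
from pathlib import Path


def rebuild_index(library_root: Path, db_path: Path) -> int:
    """Recreate a SQLite index from per-paper JSON files alone.

    Hypothetical layout: <library_root>/<paper-dir>/meta.json,
    each file holding at least an `id` and a `title` key.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            # The index is disposable, so drop it and start from scratch.
            conn.execute("DROP TABLE IF EXISTS papers")
            conn.execute(
                "CREATE TABLE papers ("
                "id TEXT PRIMARY KEY, title TEXT NOT NULL, path TEXT NOT NULL)"
            )
            count = 0
            for meta_file in sorted(library_root.glob("*/meta.json")):
                meta = json.loads(meta_file.read_text(encoding="utf-8"))
                conn.execute(
                    "INSERT INTO papers (id, title, path) VALUES (?, ?, ?)",
                    (meta["id"], meta.get("title", ""), str(meta_file.parent)),
                )
                count += 1
        return count
    finally:
        conn.close()
```

Because the function never reads the old database, deleting the SQLite file loses nothing that a rebuild cannot restore.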

## Development commands

- **Testing**: `uv run pytest` (specific: `uv run pytest tests/test_models.py`)
- **Linting**: `uv run ruff check src/`
- **Formatting**: `uv run ruff format`
- **CLI testing**: `uv run paperlib --help` or `uv run paperlib init .tmp/test-lib`

**Always use `uv run` for Python commands. Use `./.tmp` for test libraries (it's tmpfs).**

## Current CLI commands

**Implemented**:
- `init` - Initialize library
- `status` - Show library config
- `list` - List papers
- `show` - Show paper details
- `search` - Search papers
- `import` - Import papers (PDF/arXiv)
- `convert` - Convert PDFs to Markdown (MinerU)
- `reindex` - Rebuild search index

**Planned**: `import-dir`, `watch`, `doctor`, `open`, `print-path`, `summarize`, `render-summary`, `export`

## Critical constraints

### What paperlib IS
- PDF import and local storage
- PDF → Markdown conversion
- Metadata files and search indexing
- CLI for all operations
- Optional AI summarization

### What paperlib is NOT
- Web UI or daemon
- Multi-user service
- Cloud-first design
- Vector database requirement
- Autonomous research assistant

### File format stability
Changes to the `meta.json` or `summary.json` schemas are breaking changes. Any such change must update the schema version and consider migration.
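
One way to handle a versioned schema is a chain of small migration functions applied on load. The sketch below is illustrative only: the `schema_version` key, the version numbers, the `name` to `title` rename, and every function name are hypothetical and not taken from the actual `meta.json` schema.

```python
CURRENT_SCHEMA_VERSION = 2  # hypothetical current version


def _v1_to_v2(meta: dict) -> dict:
    # Hypothetical breaking change: v2 renames `name` to `title`.
    upgraded = dict(meta)
    upgraded["title"] = upgraded.pop("name", "")
    upgraded["schema_version"] = 2
    return upgraded


# Maps a source version to the function that upgrades it by one step.
MIGRATIONS = {1: _v1_to_v2}


def load_meta(raw: dict) -> dict:
    """Upgrade a parsed metadata dict to the current schema version."""
    version = raw.get("schema_version", 1)
    if version > CURRENT_SCHEMA_VERSION:
        # Files written by a newer tool are rejected rather than guessed at.
        raise ValueError(f"unsupported schema version: {version}")
    while version < CURRENT_SCHEMA_VERSION:
        raw = MIGRATIONS[version](raw)
        version = raw["schema_version"]
    return raw
```

Keeping each migration a pure function of one dict makes it easy to cover with a unit test per version step.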

### Module boundaries
- `search` should not depend on LLM code
- `import` should not require summarization
- `reindex` should work from files alone
- Keep AI behind clean interfaces
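
A minimal sketch of what "AI behind clean interfaces" could look like: the import path depends only on a small protocol, and the summarizer is an optional argument. The `Summarizer` protocol, the `import_paper` function, and the record fields are hypothetical names chosen for illustration, not the project's actual API.

```python
from typing import Optional, Protocol


class Summarizer(Protocol):
    """Hypothetical interface; any LLM backend would live behind it."""

    def summarize(self, markdown: str) -> str: ...


def import_paper(
    title: str,
    markdown: str,
    summarizer: Optional[Summarizer] = None,
) -> dict:
    """Build a paper record; summarization happens only if a backend is given."""
    record = {"title": title, "markdown": markdown, "summary": None}
    if summarizer is not None:
        record["summary"] = summarizer.summarize(markdown)
    return record
```

With this shape, tests can pass a trivial fake summarizer, and the core import path never imports LLM code.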

## Git commits
Format: `"<scope>: <subject>"`, where `<scope>` is one of `feat|fix|docs|style|refactor|test|perf|update`.

Keep the first line ≤88 chars and leave the second line empty.
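
The commit rules above can be expressed as a small check. This is a sketch under stated assumptions: `check_commit_message` is a hypothetical helper, not an existing hook in the repository, and it assumes the subject must be non-empty.

```python
import re

# Allowed prefixes, copied from the format rule above.
_SCOPES = "feat|fix|docs|style|refactor|test|perf|update"
_FIRST_LINE = re.compile(rf"^(?:{_SCOPES}): \S.*$")


def check_commit_message(message: str) -> list[str]:
    """Return a list of rule violations; an empty list means the message passes."""
    lines = message.splitlines()
    problems = []
    if not lines or not _FIRST_LINE.match(lines[0]):
        problems.append("first line must look like '<scope>: <subject>'")
    if lines and len(lines[0]) > 88:
        problems.append("first line exceeds 88 characters")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be empty")
    return problems
```

For example, `check_commit_message("fix: handle empty library")` returns an empty list, while a message whose body starts on the second line is reported.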

## When you need details

- **Architecture**: See `dev-docs/architecture.md`
- **Data model**: See `dev-docs/data-model.md`
- **AI integration**: See `dev-docs/ai-guidelines.md`
- **Code style**: See `dev-docs/coding-guidelines.md`

## Decision heuristics

When uncertain, prefer the option that is:
- more local-first
- more inspectable
- easier to test
- less coupled to AI
- more stable for scripts
- less magical