wyj/paperlib

wyj e870fe280a update: update the dev-docs for AI agent

2026-04-17 19:54:24 -04:00

Architecture Guidelines

The codebase should be organized around four clear layers.

1. Core domain logic

Pure Python logic for:

identifying papers
computing paths
importing PDFs
updating metadata
converting PDFs to Markdown
rendering summaries
rebuilding the index

This layer should be testable without the CLI.
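
As a minimal sketch of what "testable without the CLI" can look like, the path computation below is a pure function that never touches the filesystem or parses arguments. The names `PaperPaths` and `compute_paths`, the one-directory-per-paper layout, and the `paper.pdf` filename are assumptions made for illustration; only `paper.md`, `summary.json`, and `summary.md` come from this document.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PaperPaths:
    """Locations of the files that belong to one paper (hypothetical layout)."""

    root: Path
    pdf: Path
    markdown: Path
    summary_json: Path
    summary_md: Path


def compute_paths(library_root: Path, paper_id: str) -> PaperPaths:
    """Derive all paths for a paper without reading or writing any files."""
    root = library_root / paper_id
    return PaperPaths(
        root=root,
        pdf=root / "paper.pdf",  # assumed filename, not stated in the guidelines
        markdown=root / "paper.md",
        summary_json=root / "summary.json",
        summary_md=root / "summary.md",
    )
```

A unit test can call `compute_paths(Path("lib"), "some-id")` directly and compare the result, with no subprocess and no temporary directory.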

2. CLI layer

Thin wrappers around the core domain logic.

The CLI should:

parse arguments
call core functions
format output
handle exit codes

The CLI should not contain business logic of its own.
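
The four responsibilities above could be sketched with `argparse` as follows. This is an illustration under assumptions, not the project's actual CLI: `rebuild_index` is a stand-in for a core function, and the `paperlib` program name and `--root` flag are hypothetical.

```python
import argparse
import sys
from pathlib import Path
from typing import Optional, Sequence


def rebuild_index(library_root: Path) -> int:
    """Stand-in for a core function; returns the number of paper directories."""
    if not library_root.is_dir():
        raise FileNotFoundError(f"library not found: {library_root}")
    return sum(1 for child in library_root.iterdir() if child.is_dir())


def main(argv: Optional[Sequence[str]] = None) -> int:
    # 1. Parse arguments.
    parser = argparse.ArgumentParser(prog="paperlib")
    sub = parser.add_subparsers(dest="command", required=True)
    reindex = sub.add_parser("reindex", help="rebuild the index from files")
    reindex.add_argument("--root", type=Path, default=Path("."))
    args = parser.parse_args(argv)

    # 2. Call the core function; 3. format output; 4. map failures to exit codes.
    try:
        count = rebuild_index(args.root)
    except FileNotFoundError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    print(f"indexed {count} papers")
    return 0
```

Because `main` accepts `argv` and returns the exit code instead of calling `sys.exit` itself, it can be exercised from tests; an entry point would wrap it as `sys.exit(main())`.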

3. Optional integrations

External systems should live in integration modules, for example:

MinerU wrapper
filesystem watch integration
ripgrep integration
LLM provider integration

Keep these adapters isolated.
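
One way to keep an adapter isolated is to have the core depend on a small protocol rather than on the external tool. The sketch below assumes a hypothetical `Converter` protocol with a single `to_markdown` method; `PlaceholderConverter` is a stand-in, and no real MinerU API is shown or implied.

```python
from pathlib import Path
from typing import Protocol


class Converter(Protocol):
    """What the core needs from any PDF-to-Markdown tool (hypothetical)."""

    def to_markdown(self, pdf_path: Path) -> str: ...


class PlaceholderConverter:
    """Stand-in adapter; a MinerU wrapper would expose the same method."""

    def to_markdown(self, pdf_path: Path) -> str:
        return f"# {pdf_path.stem}\n\n(converted text would go here)\n"


def convert_paper(converter: Converter, pdf_path: Path, out_dir: Path) -> Path:
    """Core code sees only the protocol, never a specific external tool."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / "paper.md"
    target.write_text(converter.to_markdown(pdf_path), encoding="utf-8")
    return target
```

Swapping converters then means passing a different object to `convert_paper`; nothing else in the core has to change.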

4. Optional AI layer

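The AI summarization layer should sit behind a stable abstraction, so that providers can be swapped without touching callers.

For example, a summarization run could follow these steps: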

load prompt template
load paper markdown
load optional profile / vocabulary
call provider
validate structured output
write summary.json
render summary.md

Avoid leaking provider-specific behavior into unrelated modules.
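
The steps above could be sketched as two functions behind a provider protocol. Everything named here is an assumption for illustration: the `SummaryProvider` protocol, the `{paper}` placeholder in the template, and the two-key schema. The optional profile / vocabulary step is left out to keep the example short.

```python
import json
from pathlib import Path
from typing import Protocol

REQUIRED_KEYS = {"title", "key_points"}  # hypothetical summary schema


class SummaryProvider(Protocol):
    def complete(self, prompt: str) -> str: ...


def summarize(paper_dir: Path, template: str, provider: SummaryProvider) -> dict:
    """Build the prompt, call the provider, validate, and write summary.json."""
    markdown = (paper_dir / "paper.md").read_text(encoding="utf-8")
    prompt = template.replace("{paper}", markdown)
    data = json.loads(provider.complete(prompt))
    if not isinstance(data, dict):
        raise ValueError("provider output is not a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"summary is missing keys: {sorted(missing)}")
    (paper_dir / "summary.json").write_text(json.dumps(data, indent=2), encoding="utf-8")
    return data


def render_summary(paper_dir: Path) -> Path:
    """Render summary.md from summary.json alone; no provider call is made."""
    data = json.loads((paper_dir / "summary.json").read_text(encoding="utf-8"))
    lines = [f"# {data['title']}", ""] + [f"- {point}" for point in data["key_points"]]
    target = paper_dir / "summary.md"
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


class CannedProvider:
    """Test double standing in for a real LLM provider adapter."""

    def complete(self, prompt: str) -> str:
        return json.dumps({"title": "Example", "key_points": ["first", "second"]})
```

Provider-specific details (authentication, retries, response formats) would stay inside whichever class implements `complete`, and rendering reads only the validated JSON file.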

Component boundaries

Avoid hidden coupling:

search should not depend on LLM code
import should not require summarization
reindex should not assume a specific converter
render-summary should not require calling AI again

Prefer explicit data flow:

import creates or updates metadata
convert creates paper.md
summarize creates summary.json
render-summary creates summary.md
reindex rebuilds SQLite from files
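
To illustrate the last step of this flow, here is a rough sketch of a reindex that treats the files as the source of truth and the SQLite database as a disposable cache. The `metadata.json` filename, the `papers` table, and its columns are assumptions, not the project's actual schema.

```python
import json
import sqlite3
from pathlib import Path


def reindex(library_root: Path, db_path: Path) -> int:
    """Drop and rebuild the papers table from metadata files on disk."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DROP TABLE IF EXISTS papers")
            conn.execute(
                "CREATE TABLE papers (id TEXT PRIMARY KEY, title TEXT, has_summary INTEGER)"
            )
            count = 0
            for meta_file in sorted(library_root.glob("*/metadata.json")):
                meta = json.loads(meta_file.read_text(encoding="utf-8"))
                paper_dir = meta_file.parent
                has_summary = int((paper_dir / "summary.json").exists())
                conn.execute(
                    "INSERT INTO papers VALUES (?, ?, ?)",
                    (paper_dir.name, meta.get("title", ""), has_summary),
                )
                count += 1
        return count
    finally:
        conn.close()
```

The function only checks whether `summary.json` exists; it does not care which converter or provider produced the files, so running it twice on the same library yields the same table.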