Documentation

Quickstart, API, CLI, Python client, and Docker self-host.

Four ways to use Lyte Lab: the hosted HTTP API, the doc_app CLI, the Python client, or a self-hosted Docker deploy.

Quickstart: 60 seconds to first response

Free tier — no account

Hit any free-tier endpoint without a token. Rate limit: 20 requests / minute per IP.

curl -X POST https://api.lytelab.ai/v1/parse \
  -F "file=@paper.pdf" | jq .structural.domain_hint.primary_domain

Paid endpoints — bearer token

Create an account, buy credits, generate a token in the dashboard, and set it in your environment.

export LYTELAB_TOKEN=ll_live_...

curl -X POST https://api.lytelab.ai/v1/ocr \
  -H "Authorization: Bearer $LYTELAB_TOKEN" \
  -F "file=@scan.pdf" \
  -o searchable.pdf
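
The same call can be made from Python with only the standard library. This is a minimal sketch rather than an official client: build_ocr_request and send_to_file are hypothetical helper names, the multipart field name "file" and the Bearer header are taken from the curl example above, and the response body is assumed to be the finished PDF, as the -o flag suggests.

```python
import uuid
import urllib.request

OCR_URL = "https://api.lytelab.ai/v1/ocr"


def build_ocr_request(pdf_bytes: bytes, token: str,
                      filename: str = "scan.pdf") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /v1/ocr."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # Field name "file" mirrors: curl -F "file=@scan.pdf"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        OCR_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def send_to_file(req: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the response body to out_path.

    Assumption: the endpoint answers with the PDF bytes directly.
    """
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as out:
        out.write(resp.read())


# Usage (needs a real token and network access):
#   import os
#   with open("scan.pdf", "rb") as f:
#       req = build_ocr_request(f.read(), os.environ["LYTELAB_TOKEN"])
#   send_to_file(req, "searchable.pdf")
```

Reading the token from LYTELAB_TOKEN keeps it out of source files, in the same way as the export line above.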

HTTP API: api.lytelab.ai / v1

Free endpoints

POST /v1/parse — PDF to structured JSON
POST /v1/edit — overlay edit from a parsed-JSON diff
POST /v1/convert — PDF ↔ Markdown / HTML / DOCX / CSV
POST /v1/merge, POST /v1/split
POST /v1/diff, POST /v1/search, POST /v1/classify

Paid endpoints

POST /v1/ocr — scanned PDF to searchable PDF (1 credit / page)
POST /v1/pii — PII detection with bounding boxes (1 credit / page)
POST /v1/redact — content-stream purge of regions (1 credit / page)
POST /v1/tables — extract tables as CSV, cross-page joined (1 credit / page)
POST /v1/bates — Bates stamping with production log (1 credit / page)
POST /v1/bulk-fill — AcroForm + coord-mode bulk fill (1 credit / output doc)
POST /v1/classifier-rename — page classifier + batch rename (1 credit / page)
POST /v1/compress — lossless compress + optional PDF/A-2b (1 credit / page)
POST /v1/sign — native tamper-detection (5 credits) or ?profile=pades (15 credits)
POST /v1/password — AES-256 encrypt / decrypt / permissions (1 credit / doc)
POST /v1/overlay — header, footer, watermark (1 credit / page)
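
To budget credits before submitting a batch, the price list above can be turned into a small estimator. This is a sketch under stated assumptions: estimate_credits is a hypothetical helper, not part of the API or the doc_app package, and the sign prices are assumed to be charged once per signed document, which the list does not state.

```python
# Rates copied from the paid-endpoint list above.
PER_PAGE_OPS = {
    "ocr", "pii", "redact", "tables", "bates",
    "classifier-rename", "compress", "overlay",
}
PER_DOC_OPS = {"password", "bulk-fill"}  # bulk-fill is billed per output doc
SIGN_CREDITS = {"native": 5, "pades": 15}


def estimate_credits(operation: str, pages: int = 0, docs: int = 0,
                     sign_profile: str = "native") -> int:
    """Return the estimated credit cost of one operation."""
    if operation in PER_PAGE_OPS:
        return pages  # 1 credit per page
    if operation in PER_DOC_OPS:
        return docs   # 1 credit per (output) document
    if operation == "sign":
        # Assumption: the sign price applies once per signed document.
        return SIGN_CREDITS[sign_profile] * docs
    raise ValueError(f"unknown or free operation: {operation!r}")


# Example: OCR a 120-page scan, then sign the result with the PAdES profile.
total = (estimate_credits("ocr", pages=120)
         + estimate_credits("sign", docs=1, sign_profile="pades"))
```

The actual charge is reported in the final job response, so treat the estimate as a pre-flight check against GET /v1/balance and not as an invoice.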

Async jobs

Heavy operations return {"job_id": "...", "status": "queued"}. Poll GET /v1/jobs/<job_id> until "status": "complete". The final response includes the output file URL, the pages processed, and the credits consumed.
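
The polling step can be sketched as a small loop. This is an illustration, not the official client: wait_for_job and its parameters are hypothetical, and fetch_status stands in for whatever function performs GET /v1/jobs/<job_id> and returns the decoded JSON. Only the "queued" and "complete" statuses are documented above, so the loop treats every other status as "keep waiting" and relies on a timeout to stop.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[str], dict], job_id: str,
                 interval_s: float = 2.0, timeout_s: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll until the job reports status "complete" or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") == "complete":
            return job  # final response: output URL, pages, credits
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not complete after {timeout_s} s")
        sleep(interval_s)


# Usage with a stand-in fetcher that completes on the third poll:
_responses = iter([
    {"job_id": "j1", "status": "queued"},
    {"job_id": "j1", "status": "queued"},
    {"job_id": "j1", "status": "complete"},
])
final = wait_for_job(lambda jid: next(_responses), "j1",
                     sleep=lambda s: None)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. In production, pick an interval that stays inside the rate limits described under Limits and behavior.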

Account and billing

GET /v1/balance — current credit balance
POST /v1/checkout?package=credits_500 — returns a Stripe Checkout URL
GET /v1/usage?from=...&to=... — credit usage per operator

Command line: pip install doc-app

Install the package to get the doc_app command on your PATH.

pip install doc-app

# parse to JSON
doc_app parse input.pdf --out parsed.json

# OCR a scanned PDF
doc_app ocr scan.pdf --out searchable.pdf

# convert formats
doc_app convert report.pdf --target md --out report.md

# extract tables
doc_app tables filings.pdf --out tables.csv

# redact regions defined in a spec
doc_app redact raw.pdf --redaction-spec spec.json --out redacted.pdf

# Bates-stamp a set of PDFs
doc_app bates *.pdf --prefix ACME --start 1 --width 6

# bulk form fill
doc_app bulk-fill template.pdf --rows applicants.csv --out-dir filled/

# sign (native tamper-detection)
doc_app sign contract.pdf --reason "Counsel review" --out signed.pdf

# corpus ops
doc_app merge a.pdf b.pdf --out both.pdf
doc_app split long.pdf --pages "1-3,5" --out short.pdf
doc_app diff old.pdf new.pdf --out diff.txt
doc_app search "indemnity" contracts/*.pdf

Local web UI

For quick one-off conversions on your own machine:

doc_app ui --port 7860
# open http://localhost:7860

Python client

from doc_app.parse import run_pipeline, parse_structural
from doc_app.overlay import apply_edits
from doc_app.redact import redact
from doc_app.ocr import detect_scanned, ocr_pdf
from doc_app.format_adapters import convert
from doc_app.tables import pdf_to_tables_csv
from doc_app.corpus_ops import merge_pdfs, split_pdf, diff_pdfs, search_pdfs

# read the input once; the context manager closes the file handle
with open("paper.pdf", "rb") as f:
    pdf_bytes = f.read()

# parse
result = run_pipeline(pdf_bytes)
print(result["structural"]["domain_hint"]["primary_domain"])

# edit + re-render (overlay path, source stream preserved)
structural = parse_structural(pdf_bytes)
for b in structural["blocks"]:
    if b["block_type"] == "title":
        b["content"]["text"] = "New Title"
edited = apply_edits(pdf_bytes, structural)

# convert formats
md_bytes = convert(pdf_bytes, "pdf", "md")

Docker self-host: docker compose up

Self-host is available under the enterprise license (see pricing). The license file unlocks the container and the Helm chart.

# 1. Unpack the release bundle you received with your license.
tar xf lytelab-doc_app-release.tar.gz
cd doc_app

# 2. Copy and fill out the env template.
cp deploy/.env.example .env
# edit .env — at minimum: DATABASE_URL, REDIS_URL, AUTH_SECRET_KEY,
# and LYTELAB_LICENSE_FILE pointing at the signed .lic file.

# 3. Launch the full stack: web + worker + postgres + redis.
docker compose -f deploy/docker-compose.prod.yml --env-file .env up -d

# 4. Smoke test.
curl http://localhost:8000/healthz
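
Because docker compose up -d returns before the services have finished starting, a single curl can fail on a stack that is merely still booting. A retrying smoke test is one way around that. This is a sketch: http_ok and wait_until_healthy are hypothetical helpers, and since the body of /healthz is not documented here, the check only assumes an HTTP 200 status.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if a GET to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # Covers URLError, HTTPError, refused connections, and timeouts.
        return False


def wait_until_healthy(probe: Callable[[], bool], attempts: int = 30,
                       delay_s: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() up to `attempts` times, pausing between failures."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)
    return False


# Usage against the local stack:
#   ok = wait_until_healthy(lambda: http_ok("http://localhost:8000/healthz"))
#   print("healthy" if ok else "not healthy; check docker compose logs")
```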

For Kubernetes, deploy/helm/ contains a chart with values for OIDC / SAML, RBAC, and audit-log export. See local-first for notes on air-gapped deployments.

Limits and behavior

Max request body: 25 MB on free tier, 100 MB on paid.
Free tier: 20 req/min per IP. Paid: 60 req/min per user.
Heavy operations are queued; expect 2–30 s per page for OCR depending on document complexity and backend.
Requests are processed in memory and deleted after the response — see privacy.
On a fresh install, the first parse triggers a Docling cold start that downloads roughly 1 GB of layout models.
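
A client can respect the size and rate limits above with a pre-flight check and a simple throttle. This is a sketch under stated assumptions: fits_body_limit and Throttle are hypothetical names, 1 MB is taken to mean 1,000,000 bytes (the limits do not say which definition applies), and spacing requests evenly is a conservative choice, since the server may well tolerate short bursts.

```python
import time
from typing import Callable, Optional

# Values copied from the limits listed above.
LIMITS = {
    "free": {"max_body_mb": 25, "req_per_min": 20},
    "paid": {"max_body_mb": 100, "req_per_min": 60},
}


def fits_body_limit(size_bytes: int, tier: str) -> bool:
    """Check a payload size against the tier's maximum request body.

    Assumption: 1 MB = 1,000,000 bytes.
    """
    return size_bytes <= LIMITS[tier]["max_body_mb"] * 1_000_000


class Throttle:
    """Space calls evenly so they stay under the tier's requests per minute."""

    def __init__(self, tier: str,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 60.0 / LIMITS[tier]["req_per_min"]
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> None:
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: call throttle.wait() immediately before each request.
#   throttle = Throttle("free")   # at most one request every 3 seconds
```

The multipart envelope adds a small overhead on top of the file itself, so leave some headroom when a file is close to the limit.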

Support: trevin@lytelab.ai

Bug reports that include a reproducible test case go to the top of the queue. Feature requests and enterprise questions are welcome.