Quickstart, API, CLI, Docker, and self-host.
Four ways to use Lyte Lab: the hosted HTTP API, the doc_app CLI, the Python client, or a self-hosted Docker deployment.
Free tier — no account
Hit any free-tier endpoint without a token. Rate limit: 20 requests / minute per IP.
curl -X POST https://api.lytelab.ai/v1/parse \
-F "file=@paper.pdf" | jq .structural.domain_hint.primary_domain
Paid endpoints — bearer token
Create an account, buy credits, generate a token in the dashboard, and set it in your environment.
export LYTELAB_TOKEN=ll_live_...
curl -X POST https://api.lytelab.ai/v1/ocr \
-H "Authorization: Bearer $LYTELAB_TOKEN" \
-F "file=@scan.pdf" \
-o searchable.pdf
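The same header can be assembled in Python before handing it to whatever HTTP client you use. This is a minimal sketch that assumes only what is stated above, namely that the token lives in the LYTELAB_TOKEN environment variable. The helper name auth_headers is hypothetical and is not part of any Lyte Lab package.

```python
import os


def auth_headers(env=None):
    """Build the bearer Authorization header from LYTELAB_TOKEN.

    `env` defaults to os.environ; passing a plain dict makes the
    helper easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("LYTELAB_TOKEN", "").strip()
    if not token:
        # Fail early with a clear message instead of sending an empty token.
        raise RuntimeError("LYTELAB_TOKEN is not set; generate a token in the dashboard")
    return {"Authorization": f"Bearer {token}"}
```

Failing before the request is sent keeps a missing token from showing up later as an opaque authentication error.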
Free endpoints
- POST /v1/parse: PDF to structured JSON
- POST /v1/edit: overlay edit from a parsed-JSON diff
- POST /v1/convert: PDF ↔ Markdown / HTML / DOCX / CSV
- POST /v1/merge, POST /v1/split
- POST /v1/diff, POST /v1/search, POST /v1/classify
Paid endpoints
- POST /v1/ocr: scanned PDF to searchable PDF (1 credit / page)
- POST /v1/pii: PII detection with bounding boxes (1 credit / page)
- POST /v1/redact: content-stream purge of regions (1 credit / page)
- POST /v1/tables: extract tables as CSV, cross-page joined (1 credit / page)
- POST /v1/bates: Bates stamping with production log (1 credit / page)
- POST /v1/bulk-fill: AcroForm + coord-mode bulk fill (1 credit / output doc)
- POST /v1/classifier-rename: page classifier + batch rename (1 credit / page)
- POST /v1/compress: lossless compress + optional PDF/A-2b (1 credit / page)
- POST /v1/sign: native tamper-detection (5 credits) or ?profile=pades (15 credits)
- POST /v1/password: AES-256 encrypt / decrypt / permissions (1 credit / doc)
- POST /v1/overlay: header, footer, watermark (1 credit / page)
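To budget a batch before submitting it, the per-operation prices above can be turned into a small estimator. This is a rough sketch, not an official calculator. The function name estimate_credits is hypothetical, and it assumes the flat sign price is charged once per signed document, which the list does not state explicitly.

```python
# Rates copied from the paid-endpoint list; check current pricing before relying on them.
PER_PAGE = {"ocr", "pii", "redact", "tables", "bates",
            "classifier-rename", "compress", "overlay"}
PER_DOC = {"bulk-fill", "password"}
SIGN_COST = {"native": 5, "pades": 15}


def estimate_credits(operation, pages=0, docs=1, profile="native"):
    """Estimate the credits one operation would consume."""
    if operation in PER_PAGE:
        return pages  # 1 credit per page
    if operation in PER_DOC:
        return docs  # 1 credit per (output) document
    if operation == "sign":
        # Assumption: the flat price applies once per signed document.
        return SIGN_COST[profile] * docs
    raise ValueError(f"unknown paid operation: {operation}")
```

For example, estimate_credits("ocr", pages=12) gives 12, and estimate_credits("sign", profile="pades") gives 15.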
Async jobs
Heavy operations return {"job_id": "...", "status": "queued"}. Poll GET /v1/jobs/<job_id> until "status": "complete". The final response includes the output file URL, the pages processed, and the credits consumed.
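The polling loop described above might look like the following sketch. It is transport-agnostic: fetch_status is a hypothetical callable you supply that performs GET /v1/jobs/<job_id> and returns the decoded JSON as a dict. Only the "queued" and "complete" statuses are documented here, so the sketch relies on a timeout rather than guessing at failure status names. The interval and timeout defaults are arbitrary.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reports "status": "complete", or time out.

    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") == "complete":
            return job  # final response with output URL, pages, credits
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not complete after {timeout} s")
        sleep(interval)
```

A fixed interval keeps the example short; with OCR taking seconds per page, a longer interval or a backoff is probably kinder to the rate limit on large documents.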
Account and billing
- GET /v1/balance: current credit balance
- POST /v1/checkout?package=credits_500: returns a Stripe Checkout URL
- GET /v1/usage?from=...&to=...: credit usage per operator
Install the package to get the doc_app command on your PATH.
pip install doc-app
# parse to JSON
doc_app parse input.pdf --out parsed.json
# OCR a scanned PDF
doc_app ocr scan.pdf --out searchable.pdf
# convert formats
doc_app convert report.pdf --target md --out report.md
# extract tables
doc_app tables filings.pdf --out tables.csv
# redact regions defined in a spec
doc_app redact raw.pdf --redaction-spec spec.json --out redacted.pdf
# Bates-stamp a set of PDFs
doc_app bates *.pdf --prefix ACME --start 1 --width 6
# bulk form fill
doc_app bulk-fill template.pdf --rows applicants.csv --out-dir filled/
# sign (native tamper-detection)
doc_app sign contract.pdf --reason "Counsel review" --out signed.pdf
# corpus ops
doc_app merge a.pdf b.pdf --out both.pdf
doc_app split long.pdf --pages "1-3,5" --out short.pdf
doc_app diff old.pdf new.pdf --out diff.txt
doc_app search "indemnity" contracts/*.pdf
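As an illustration of what the bates flags in the examples above plausibly mean, the sketch below generates labels from a prefix, a starting number, and a zero-padded width. This is an assumption about the label format, not doc_app's documented behavior: the real tool may add a separator or number pages differently. The function name bates_labels is hypothetical.

```python
def bates_labels(prefix, start, width, count):
    """Return `count` consecutive labels: prefix + zero-padded number.

    Assumed format only; doc_app's actual output may differ.
    """
    return [f"{prefix}{n:0{width}d}" for n in range(start, start + count)]


# Mirrors --prefix ACME --start 1 --width 6 for the first three pages.
print(bates_labels("ACME", 1, 6, 3))
# → ['ACME000001', 'ACME000002', 'ACME000003']
```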
Local web UI
For quick one-off conversions on your own machine:
doc_app ui --port 7860
# open http://localhost:7860
from doc_app.parse import run_pipeline, parse_structural
from doc_app.overlay import apply_edits
from doc_app.redact import redact
from doc_app.ocr import detect_scanned, ocr_pdf
from doc_app.format_adapters import convert
from doc_app.tables import pdf_to_tables_csv
from doc_app.corpus_ops import merge_pdfs, split_pdf, diff_pdfs, search_pdfs
# read the source PDF; the context manager closes the file promptly
with open("paper.pdf", "rb") as f:
    pdf_bytes = f.read()
# parse
result = run_pipeline(pdf_bytes)
print(result["structural"]["domain_hint"]["primary_domain"])
# edit + re-render (overlay path, source stream preserved)
structural = parse_structural(pdf_bytes)
for b in structural["blocks"]:
    if b["block_type"] == "title":
        b["content"]["text"] = "New Title"
edited = apply_edits(pdf_bytes, structural)
# convert formats
md_bytes = convert(pdf_bytes, "pdf", "md")
Self-host is available under the enterprise license (see pricing). The license file unlocks the container and the Helm chart.
# 1. Unpack the release bundle you received with your license.
tar xf lytelab-doc_app-release.tar.gz
cd doc_app
# 2. Copy and fill out the env template.
cp deploy/.env.example .env
# edit .env — at minimum: DATABASE_URL, REDIS_URL, AUTH_SECRET_KEY,
# and LYTELAB_LICENSE_FILE pointing at the signed .lic file.
# 3. Launch the full stack: web + worker + postgres + redis.
docker compose -f deploy/docker-compose.prod.yml --env-file .env up -d
# 4. Smoke test.
curl http://localhost:8000/healthz
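Before launching the stack in step 3, it can help to confirm that the four settings named in step 2 are actually filled in. The following is a small sketch, assuming the .env file uses plain KEY=VALUE lines. The function name missing_env_keys is hypothetical and not part of the release bundle.

```python
# The minimum settings named in step 2 of the self-host instructions.
REQUIRED_KEYS = ("DATABASE_URL", "REDIS_URL", "AUTH_SECRET_KEY", "LYTELAB_LICENSE_FILE")


def missing_env_keys(env_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in KEY=VALUE text."""
    values = {}
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return [key for key in required if not values.get(key)]
```

Calling missing_env_keys(open(".env").read()) returns an empty list when all four keys have values. It does not check that the license path exists or that the URLs are reachable.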
For Kubernetes, deploy/helm/ contains a chart with values for OIDC / SAML, RBAC, and audit-log export. See local-first for notes on air-gapped deployments.
- Max request body: 25 MB on free tier, 100 MB on paid.
- Free tier: 20 req/min per IP. Paid: 60 req/min per user.
- Heavy operations are queued; expect 2–30 s per page for OCR depending on document complexity and backend.
- Request data is processed in memory and deleted once the response is sent; see privacy.
- On a fresh install, the first parse triggers a Docling cold start that downloads roughly 1 GB of layout models.
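A client can stay inside the limits listed above with a size check and a sliding-window throttle. This is a sketch under stated assumptions: the names body_fits and Throttle are hypothetical, and "MB" is read as 1,000,000 bytes, which is the stricter interpretation because the list does not say MB or MiB.

```python
import time
from collections import deque

# Limits copied from the list above.
MAX_BODY_MB = {"free": 25, "paid": 100}
MAX_PER_MINUTE = {"free": 20, "paid": 60}


def body_fits(size_bytes, tier="free"):
    # Assumption: decimal megabytes; a file that passes here also passes under MiB.
    return size_bytes <= MAX_BODY_MB[tier] * 1_000_000


class Throttle:
    """Sliding-window limiter: at most `limit` calls in any `window` seconds."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self._limit, self._window = limit, window
        self._clock, self._sleep = clock, sleep
        self._sent = deque()  # timestamps of recent calls, oldest first

    def wait(self):
        """Block until one more request is allowed, then record it."""
        now = self._clock()
        # Forget calls that have already left the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) >= self._limit:
            # Sleep until the oldest recorded call ages out of the window.
            self._sleep(self._window - (now - self._sent[0]))
            now = self._clock()
            self._sent.popleft()
        self._sent.append(now)
```

Create one Throttle(MAX_PER_MINUTE["free"]) per process and call wait() before each request. The server counts free-tier requests per IP, so several processes behind the same address would need to share the budget.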
Bug reports with a reproducible test case go to the top of the queue. Feature requests and enterprise questions are welcome.