AI & Automation

Olmocr

Open Source · Updated 40d ago

Toolkit for linearizing PDFs for LLM datasets/training

pip install olmocr
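
After running the install command above, a quick way to confirm the package landed in the active environment is to query its installed metadata. The sketch below uses only the Python standard library; the helper name `installed_version` is made up for illustration and is not part of olmocr. It assumes the distribution name is `olmocr`, matching the pip command.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "olmocr") -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper: it only reads package metadata and never imports olmocr.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"olmocr {found} is installed")
    else:
        print("olmocr is not installed; run: pip install olmocr")
```

Reading metadata rather than importing the package keeps the check fast and avoids loading any heavy dependencies just to verify the install.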


Toolkit for linearizing PDFs for LLM data pipelines: it extracts structured text from complex PDFs for use in RAG and fine-tuning.


Tags: ocr · ai · pdf · document · open-source · allenai · layout
GitHub: ★ 17,062 · Python
Active · last commit 27d ago · 61 open issues
Using this saves ~80k tokens vs building from scratch