Highlighted · Python
document-intelligence-pipeline
Document ingestion, parsing, chunking, and embedding pipeline.
Multi-format document ingestion pipeline: PDF, DOCX, and HTML parsing, semantic chunking, and embedding storage in pgvector.
Language: Python
Updated: 2026-06-16
Visibility: public
Why it matters
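Covers the full path from raw files to stored vectors: it ingests PDF, DOCX, and HTML documents, parses them, splits them into semantic chunks, and stores the embeddings in pgvector.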
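To give a feel for the chunking stage, here is a minimal sketch that is not taken from the repository. It assumes already-parsed plain text and uses paragraph boundaries with a character cap as a simple stand-in for semantic chunking; the function name `chunk_paragraphs` and the `max_chars` parameter are hypothetical.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 200) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    Hypothetical helper for illustration only. A single paragraph longer
    than max_chars is kept whole as its own chunk rather than being split.
    """
    # Split on blank lines and drop empty fragments.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]

    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        # Try to append the paragraph to the chunk being built.
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding it would exceed the cap: close the chunk, start a new one.
            chunks.append(current)
            current = para
        else:
            current = candidate

    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to an embedding model and written to a pgvector column; those steps depend on the model and schema in use and are left out here.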
Repo facts
Default branch: main. Stars: 0. Forks: 0. Created: 2026-06-09.