
document-intelligence-pipeline

Document ingestion, parsing, chunking, and embedding pipeline.

A multi-format document ingestion pipeline: it parses PDF, DOCX, and HTML files, splits the extracted text into semantic chunks, and stores the chunk embeddings in pgvector.
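
As a rough illustration of the first stages named above, the sketch below routes a file to a parser by extension and packs paragraphs into size-bounded chunks. It is not taken from the repository: `Chunk`, `detect_format`, `chunk_paragraphs`, and the `max_chars` limit are hypothetical names chosen for this example, and greedy paragraph packing is only a simple stand-in for whatever semantic chunking the project actually performs. Parsing and pgvector storage are left out.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Chunk:
    source: str  # file the chunk came from
    index: int   # position of the chunk within that file
    text: str


def detect_format(path: str) -> str:
    """Map a file extension to one of the formats named in the description."""
    ext = Path(path).suffix.lower()
    formats = {".pdf": "pdf", ".docx": "docx", ".html": "html", ".htm": "html"}
    if ext not in formats:
        raise ValueError(f"unsupported format: {ext or path}")
    return formats[ext]


def chunk_paragraphs(text: str, source: str, max_chars: int = 500) -> list[Chunk]:
    """Greedy paragraph packing (assumption: a crude proxy for semantic chunking)."""
    chunks: list[Chunk] = []
    buf = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{buf}\n\n{para}" if buf else para
        if buf and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the current chunk.
            chunks.append(Chunk(source, len(chunks), buf))
            buf = para
        else:
            buf = candidate
    if buf:
        chunks.append(Chunk(source, len(chunks), buf))
    return chunks
```

A paragraph longer than `max_chars` is kept whole rather than split, so the limit is a soft target in this sketch. Each resulting `Chunk.text` would then be passed to an embedding model and written to a pgvector column.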

Language: Python. Updated: 2026-06-16. Visibility: public.

Repo facts

Default branch: main. Stars: 0. Forks: 0. Created: 2026-06-09.