html-to-markdown (kreuzberg-dev/html-to-markdown) is an open-source AI project on GitHub: a high-performance, CommonMark-compliant HTML to Markdown converter maintained by the Kreuzberg team. Kreuzberg is a fast, polyglot document intelligence engine with a Rust core that extracts structured data from 56+ document formats using streaming parsers and built-in OCR. The project's focus areas include retrieval-augmented generation and team collaboration integrations. It is suitable for extension, integration, and iterative delivery in real workflows.
License
MIT
Stars
739
Features
- Core capability: high-performance, CommonMark-compliant HTML to Markdown conversion, maintained by the Kreuzberg team
- Supports vector retrieval and retrieval-augmented reasoning
- Integrates with team collaboration and business systems
- Repository: kreuzberg-dev/html-to-markdown
- Primary language: HTML
- Open-source license: MIT
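To make the core capability concrete, the following is a minimal sketch of what converting HTML to Markdown involves. It uses only Python's standard `html.parser` module and is not the library's implementation or API: the names `TinyMarkdownConverter` and `html_to_markdown_sketch` are hypothetical, and the sketch handles only headings, paragraphs, bold, italics, links, and unordered lists, with no claim of CommonMark compliance.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class TinyMarkdownConverter(HTMLParser):
    """Illustrative converter for a small HTML subset (hypothetical, not the library's code)."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts: list[str] = []       # Markdown fragments in document order
        self.href_stack: list[str] = []  # href values of currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            # <h2> becomes "## ", <h3> becomes "### ", and so on.
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag == "li":
            self.parts.append("- ")
        elif tag == "a":
            self.href_stack.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.parts.append("\n\n")  # blank line after block elements
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag in ("em", "i"):
            self.parts.append("*")
        elif tag in ("li", "ul"):
            self.parts.append("\n")
        elif tag == "a":
            href = self.href_stack.pop() if self.href_stack else ""
            self.parts.append(f"]({href})")

    def handle_data(self, data):
        # Drop whitespace-only runs that contain a newline (source indentation).
        if data.strip() or "\n" not in data:
            self.parts.append(data)


def html_to_markdown_sketch(html: str) -> str:
    """Convert a small HTML fragment to Markdown text."""
    parser = TinyMarkdownConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    sample = (
        "<h1>Title</h1>"
        '<p>Hello <strong>world</strong>, see <a href="https://example.com">docs</a>.</p>'
        "<ul><li>one</li><li>two</li></ul>"
    )
    print(html_to_markdown_sketch(sample))
```

A production converter also has to deal with nested lists, tables, code blocks, escaping of Markdown metacharacters, and malformed input, which is where a dedicated library is expected to earn its place.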
Use Cases
- Building enterprise knowledge Q&A and document retrieval systems
- Supporting team knowledge collaboration and task follow-ups
- Prototyping internal AI workflows with html-to-markdown
- Validating html-to-markdown in production-like engineering scenarios
- Building AI development workflows
- Automating agent-based processes
FAQ
Teams should first define integration boundaries and call patterns, then map repository capabilities into concrete interfaces, parameters, and access rules. GitHub repository: https://github.com/kreuzberg-dev/html-to-markdown. Community traction is around 739 stars. License: MIT.
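One way to picture an integration boundary of this kind is a thin adapter that the rest of a codebase calls instead of calling the converter directly. The sketch below is an assumption-laden illustration: `MarkdownConverter`, `ConversionLimits`, `BoundedConverter`, and `stand_in_backend` are hypothetical names, and the backend is a placeholder for whatever conversion function the installed html-to-markdown binding exposes.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


class MarkdownConverter(Protocol):
    """Interface the application depends on (hypothetical)."""

    def convert(self, html: str) -> str: ...


@dataclass(frozen=True)
class ConversionLimits:
    """Access rule for the boundary: reject inputs above this size."""

    max_input_bytes: int = 1_000_000


class BoundedConverter:
    """Adapter that enforces limits before delegating to a backend function."""

    def __init__(
        self,
        backend: Callable[[str], str],
        limits: ConversionLimits = ConversionLimits(),
    ) -> None:
        self.backend = backend
        self.limits = limits

    def convert(self, html: str) -> str:
        size = len(html.encode("utf-8"))
        if size > self.limits.max_input_bytes:
            raise ValueError(
                f"input is {size} bytes, limit is {self.limits.max_input_bytes}"
            )
        return self.backend(html)


def stand_in_backend(html: str) -> str:
    """Placeholder backend for this sketch; a real setup would pass the library's converter."""
    return html.replace("<p>", "").replace("</p>", "\n")


if __name__ == "__main__":
    converter: MarkdownConverter = BoundedConverter(stand_in_backend)
    print(converter.convert("<p>hi</p>"))
```

Keeping the size limit and the backend choice inside the adapter means call sites only depend on `convert`, so the backend can be swapped or the limits tightened without touching them.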
It usually works as an execution component or capability layer. Common deployment fits include enterprise knowledge Q&A and document retrieval systems, team knowledge collaboration and task follow-ups, and internal AI workflow prototypes built with html-to-markdown.