
Journal entries linked to this note:
Journal entry for Wednesday, May 14, 2025 at 11:48
A colleague shared the Marker project with me (https://github.com/VikParuchuri/marker):
Marker converts documents to markdown, JSON, and HTML quickly and accurately.
- Converts PDF, image, PPTX, DOCX, XLSX, HTML, EPUB files in all languages
- Formats tables, forms, equations, inline math, links, references, and code blocks
- Extracts and saves images
- Removes headers/footers/other artifacts
- Extensible with your own formatting and logic
- Optionally boost accuracy with LLMs
- Works on GPU, CPU, or MPS
Here is how Marker works: