markdown-pdf
pypandoc
| | markdown-pdf | pypandoc |
|---|---|---|
| Mentions | 1 | 5 |
| Stars | 10 | 827 |
| Growth | - | - |
| Activity | 6.7 | 6.8 |
| Last commit | 13 days ago | about 1 month ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
markdown-pdf
- Converting markdown to pdf in Python
This method is based on the libraries [markdown-it-py](https://github.com/executablebooks/markdown-it-py) (conversion from markdown to html) and [PyMuPDF](https://github.com/pymupdf/PyMuPDF) (conversion from html to pdf). A small Python class links them together.
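The two-step pipeline described above might look roughly like the following. This is a minimal sketch, not the markdown-pdf package's actual class: the function names `markdown_to_html` and `html_to_pdf`, the A4 page size, and the 36-point margins are assumptions made for illustration. It assumes a PyMuPDF release that provides the `Story` class (1.21 or later).

```python
def markdown_to_html(text: str) -> str:
    """Step 1: render markdown to HTML with markdown-it-py."""
    # Imported lazily so the module loads even without the dependency.
    from markdown_it import MarkdownIt

    return MarkdownIt("commonmark").render(text)


def html_to_pdf(html: str, out_path: str) -> int:
    """Step 2: lay the HTML out on PDF pages with PyMuPDF's Story API.

    Returns the number of pages written.
    """
    import fitz  # PyMuPDF

    page_rect = fitz.paper_rect("a4")                # assumed page size
    content_rect = page_rect + (36, 36, -36, -36)    # assumed 0.5 inch margins

    story = fitz.Story(html=html)
    writer = fitz.DocumentWriter(out_path)

    pages = 0
    more = True
    while more:
        device = writer.begin_page(page_rect)
        # place() fills the rectangle and reports whether content remains.
        more, _filled = story.place(content_rect)
        story.draw(device)
        writer.end_page()
        pages += 1

    writer.close()
    return pages


if __name__ == "__main__":
    html = markdown_to_html("# Title\n\nSome *markdown* text.")
    html_to_pdf(html, "out.pdf")
```

Keeping the two steps as separate functions makes it possible to inspect or post-process the intermediate HTML before it is laid out.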
pypandoc
- Web Scraping in Python – The Complete Guide
I recently used [0] Playwright for Python and [1] pypandoc to build a scraper that fetches a webpage and turns the content into sane markdown so that it can be passed into an AI coding chat [2].
They are both very gentle dependencies to add to a project. Both packages contain built-in or scriptable methods to install their underlying platform-specific binary dependencies. This means you don't need to ask end users to use some complex, platform-specific package manager to install Playwright and pandoc.
Playwright lets you scrape pages that rely on JavaScript. Pandoc is great at turning HTML into sensible markdown. Below is an excerpt of the OpenAI pricing docs [3] that have been scraped to markdown [4] in this manner.
[0] https://playwright.dev/python/docs/intro
[1] https://github.com/JessicaTegner/pypandoc
[2] https://github.com/paul-gauthier/aider
[3] https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...
[4] https://gist.githubusercontent.com/paul-gauthier/95a1434a28d...
## GPT-4 and GPT-4 Turbo
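
The workflow described in the comment above (render a page with Playwright, then convert the HTML to markdown with pypandoc) could be sketched as follows. This is an illustration under assumptions, not the commenter's actual code: the function names and the example URL are hypothetical. It assumes a browser has been installed with `playwright install chromium` and that pandoc is available, for example via `pypandoc.download_pandoc()`.

```python
def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chromium and return the DOM after scripts run."""
    # Imported lazily so the module loads even without the dependency.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
    return html


def html_to_markdown(html: str) -> str:
    """Convert an HTML string to pandoc-flavoured markdown."""
    import pypandoc

    return pypandoc.convert_text(html, "markdown", format="html")


if __name__ == "__main__":
    # Hypothetical URL, used only for illustration.
    print(html_to_markdown(fetch_rendered_html("https://example.com")))
```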
- GitHub Accelerator: our first cohort and what's next
- Converting multiple docx to multiple txt files
Use Pypandoc
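
A batch conversion along these lines might look like the sketch below. The helper names `txt_path_for` and `convert_docx_folder` and the folder names are hypothetical, and the choice of pandoc's `plain` output format for the `.txt` files is an assumption. It requires pandoc to be installed alongside pypandoc.

```python
from pathlib import Path


def txt_path_for(docx_path: Path, out_dir: Path) -> Path:
    """Map e.g. reports/a.docx to <out_dir>/a.txt."""
    return out_dir / (docx_path.stem + ".txt")


def convert_docx_folder(src_dir, out_dir) -> list:
    """Convert every .docx file in src_dir to a plain-text file in out_dir."""
    # Imported lazily so the module loads even without the dependency.
    import pypandoc

    src, out = Path(src_dir), Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for docx in sorted(src.glob("*.docx")):
        target = txt_path_for(docx, out)
        # With outputfile set, pypandoc writes to disk instead of returning text.
        pypandoc.convert_file(
            str(docx), "plain", format="docx", outputfile=str(target)
        )
        written.append(target)
    return written


if __name__ == "__main__":
    # Hypothetical folder names, used only for illustration.
    for path in convert_docx_folder("docx_in", "txt_out"):
        print(path)
```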
What are some alternatives?
markdown-it-py - Markdown parser, done right. 100% CommonMark support, extensions, syntax plugins & high speed. Now in Python!
taffy - A high performance rust-powered UI layout library
PyMuPDF - PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
sniffnet - Comfortably monitor your Internet traffic 🕵️‍♂️
formbricks - Open Source Survey Platform
nuxt - The Intuitive Vue Framework.
trpc - 🧙‍♀️ Move Fast and Break Nothing. End-to-end typesafe APIs made easy.
responsively-app - A modified web browser that helps in responsive web development. A web developer's must have dev-tool.
panflute - A Pythonic alternative to John MacFarlane's pandocfilters, with extra helper functions
codehike - Marvellous code walkthroughs
Seamly2D - Open source patternmaking software to democratize fashion.
pdf-highlights - Export your PDF highlights to markdown files.