| | markdown-it-py | html2text |
|---|---|---|
| Mentions | 2 | - |
| Stars | 629 | 1,664 |
| Growth | 2.2% | - |
| Activity | 6.1 | 6.1 |
| Last commit | about 5 hours ago | 15 days ago |
| Language | Python | Python |
| License | MIT License | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
markdown-it-py
-
Converting markdown to pdf in Python
This method is based on the libraries markdown-it-py (conversion from Markdown to HTML) and [PyMuPDF](https://github.com/pymupdf/PyMuPDF) (conversion from HTML to PDF). A small Python class links them together.
-
Parsing a Markdown, search it and render back to Markdown
https://github.com/executablebooks/markdown-it-py is the recommended replacement which seems to have similar Tree features:
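
As a rough illustration of the two uses mentioned in the posts above (Markdown to HTML conversion, and searching a parsed document tree), here is a minimal sketch using markdown-it-py's `MarkdownIt` and `SyntaxTreeNode`. It is not the code from either post: the helper names `to_html` and `find_headings` and the `SAMPLE` text are made up for this example, and the HTML to PDF step and rendering back to Markdown are left out.

```python
from markdown_it import MarkdownIt
from markdown_it.tree import SyntaxTreeNode

# Hypothetical sample document, used only for this illustration.
SAMPLE = "# Title\n\nSome *emphasised* text.\n\n## Section\n\nMore text.\n"


def to_html(markdown_text: str) -> str:
    """Render Markdown to an HTML string with the default CommonMark preset."""
    return MarkdownIt().render(markdown_text)


def find_headings(markdown_text: str) -> list[tuple[str, str]]:
    """Return (tag, text) pairs for every heading found in the document."""
    tokens = MarkdownIt().parse(markdown_text)
    root = SyntaxTreeNode(tokens)  # build a tree from the flat token stream
    headings = []
    for node in root.walk():  # walk() visits the root and all descendants
        if node.type == "heading":
            inline = node.children[0]  # the heading's inline content node
            headings.append((node.tag, inline.content))
    return headings


if __name__ == "__main__":
    print(to_html(SAMPLE))
    for tag, text in find_headings(SAMPLE):
        print(tag, text)
```

The HTML string returned by `to_html` is what a separate HTML to PDF step would consume; the tree walk shows one way to search a document without resorting to regular expressions on the raw text.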
html2text
We haven't tracked posts mentioning html2text yet.
Tracking mentions began in Dec 2020.
What are some alternatives?
PyMuPDF - PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
textract - extract text from any document. no muss. no fuss.
markdown-pdf - Markdown to pdf renderer
trafilatura - Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Python-Markdown - A Python implementation of John Gruber’s Markdown with Extension support.
newspaper - newspaper3k is a news, full-text, and article metadata extraction library in Python 3.
python-readability - fast python port of arc90's readability tool, updated to match latest readability.js!
python-goose - HTML content / article extractor, web scraping lib in Python
sumy - Module for automatic summarization of text documents and HTML pages.
inscriptis - A Python-based HTML to text conversion library, command line client and Web service.
TWINT - An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
Haul - An Extensible Image Crawler