Top 9 Python Readability Projects
-
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
-
textstat
Python package to calculate readability statistics of a text object: paragraphs, sentences, articles.
-
kindleServer
This project serves HTML files (and a few other types) saved on your computer through a UI suited to the Kindle web browser. It also includes a Read Mode (thanks to ReadabiliPy) that displays text at a comfortable size without having to use the 'Article Mode' in the Kindle web browser.
-
scrapper
Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.
-
Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
-
readability
How readable is your text? Provide a text input and get its grade level. Scientifically validated. (by public-law)
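As a rough illustration of what a "grade level" score means, here is a minimal sketch of the standard Flesch-Kincaid grade formula using only the Python standard library. It is not taken from any of the projects listed above; the function names `count_syllables` and `flesch_kincaid_grade` are hypothetical, and the syllable counter is a naive vowel-group heuristic, so scores will differ from those of a dedicated library.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption: one syllable per vowel group)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent (e.g. "code"), except in "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade: 0.39 * words/sentence + 11.8 * syllables/word - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was a sunny day."
    print(round(flesch_kincaid_grade(sample), 2))
```

Shorter sentences and shorter words push the score down; long sentences with polysyllabic words push it up. For real studies, a maintained package is likely a better choice than this heuristic.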
Project mention: Trafilatura: Python tool to gather text on the Web | news.ycombinator.com | 2023-08-14
The feature list answers that question pretty well: https://github.com/adbar/trafilatura#features
Basically: you could implement all of this on top of BeautifulSoup - polite crawling policies, sitemap and feed parsing, URL de-duplication, parallel processing, download queues, heuristics for extracting just the main article content, metadata extraction, language detection... but it would require writing an enormous amount of extra code.
I need to learn how to use https://github.com/nilocesrom/readabilityanalyserde for a study I would like to run next semester. I haven't the slightest idea how to use anything related to programming and don't know where to start.
Python Readability related posts
-
I know nothing, but I gotta learn
-
Powerful and free scraper with a headless browser under the hood and Readability for parsing
-
How does Firefox's Reader View work?
-
Natural language processing with python
-
Show HN: Serve your saved articles to a Kindle
-
Question on easing comprehension
-
The "Connector" in main function?
Index
What are some of the best open-source Readability projects in Python? This list will help you:
# | Project | Stars |
---|---|---|
1 | trafilatura | 2,898 |
2 | textstat | 1,078 |
3 | kindleServer | 154 |
4 | sspipe | 145 |
5 | scrapper | 107 |
6 | Neural-Scam-Artist | 23 |
7 | pypely | 16 |
8 | readability | 9 |
9 | ReadabilityAnalyserDE | 2 |