Top 16 news-aggregator Open-Source Projects
-
newspaper
newspaper3k is a library for news, full-text, and article metadata extraction in Python 3.
-
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Why not provide a URL for your recommendation?
https://github.com/FreshRSS/FreshRSS
Project mention: Trafilatura: Python tool to gather text on the Web | news.ycombinator.com | 2023-08-14
The feature list answers that question pretty well: https://github.com/adbar/trafilatura#features
Basically: you could implement all of this on top of BeautifulSoup - polite crawling policies, sitemap and feed parsing, URL de-duplication, parallel processing, download queues, heuristics for extracting just the main article content, metadata extraction, language detection... but it would require writing an enormous amount of extra code.
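To give a sense of how much code even one item on that list involves, here is a minimal sketch of URL de-duplication using only the Python standard library. It is an illustration, not trafilatura's implementation: the function names `normalize_url` and `dedupe_urls` and the `TRACKING_PARAMS` set are hypothetical, and the normalization rules (lowercase scheme and host, strip trailing slash, drop fragments and a few tracking parameters) are assumptions chosen for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters to ignore; not taken from trafilatura.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref"}


def normalize_url(url: str) -> str:
    """Return a canonical form of `url` used only for comparison."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    # Treat "/post" and "/post/" as the same page; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so order does not matter.
    query = urlencode(sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ))
    # The empty last element discards the fragment ("#comments" etc.).
    return urlunsplit((scheme, host, path, query, ""))


def dedupe_urls(urls):
    """Keep the first occurrence of each URL, comparing normalized forms."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


if __name__ == "__main__":
    candidates = [
        "https://Example.com/post/",
        "https://example.com/post?utm_source=hn",
        "https://example.com/post#comments",
        "https://example.com/other",
    ]
    for url in dedupe_urls(candidates):
        print(url)
```

Even this small piece leaves out redirects, `www.` prefixes, and percent-encoding differences, which is the commenter's point about the amount of extra code a from-scratch approach would need.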
Project mention: What's the fun in writing on the internet anymore? | news.ycombinator.com | 2024-02-17
At https://hackernews.betacat.io/ they use ChatGPT to summarize HN front-page stories, and it says "Article discusses automated plagiarism and the diminishing value of authorship online. It compares today's internet to ancient texts, where authorship was less defined."
Project mention: Release 11.0 of Newspipe with a new dark theme – a news reader | news.ycombinator.com | 2024-02-27
Project mention: Ask HN: What are some excellent blogs, news boards, YouTube channels to follow? | news.ycombinator.com | 2023-12-06
Project mention: Ask HN: Do we need more users to browse /newest on HN? | news.ycombinator.com | 2023-07-16
I think so. I work with a group that was researching ways to improve the hacker news ranking algorithm. We concluded that there were "false positives" and "false negatives" -- stories often did or did not make the home page based more on timing and luck than the actual content of the story. Read more here: https://github.com/social-protocols/news
However, we concluded that the biggest problem is false negatives -- good stories that never made the home page -- because of submissions to /newest that simply don't get enough *attention*. There are too many stories and not enough eyeballs. As a result, there is not enough data to know if a story would do well if it were shown on the front page or not.
You can see this from the fact that many stories are submitted multiple times: 4 out of 5 submissions get only one or two upvotes, or none, and then the fifth makes the front page and gets hundreds. Sometimes it's due to timing factors -- the story has become relevant for some reason -- but often it's just luck.
news-aggregator related posts
-
Release 11.0 of Newspipe with a new dark theme – a news reader
-
Ask HN: What are some excellent blogs, news boards, YouTube channels to follow?
-
Ask HN: What are the best-designed news websites you’ve come across?
-
Ask HN: Do we need more users to browse /newest on HN?
-
What Deserves Our Attention?
-
Hacker News Ranking Algorithm: How would you have done it?
-
Infrequent top news RSS feed
-
Index
What are some of the best open-source news-aggregator projects? This list will help you:
# | Project | Stars |
---|---|---|
1 | newspaper | 13,737 |
2 | FreshRSS | 8,399 |
3 | CryptoList | 4,127 |
4 | trafilatura | 2,853 |
5 | liferea | 798 |
6 | hacker-news-digest | 647 |
7 | newspipe | 406 |
8 | slowernews | 222 |
9 | realtime-newsapi | 172 |
10 | DailyFeed | 134 |
11 | JARR | 117 |
12 | newspaperjs | 70 |
13 | news | 43 |
14 | NextINpact-Unofficial | 16 |
15 | cabbage_news | 2 |
16 | morgentee.com | 2 |