Top 23 Python News Projects

newspaper

13 13,703 0.0 Python

newspaper3k is a library for news, full-text, and article metadata extraction in Python 3. Advanced docs:
Stream-Framework

34 4,718 0.0 Python

Stream Framework is a Python library that allows you to build news feeds, activity streams, and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:

Project mention: Recommendations for an external messenger integration/API? | /r/rails | 2023-10-30

I have looked into a getstream.io integration; however, it seems that the Ruby SDK is really treated as a second-class citizen. There are bugs with the documented API (I'm having issues even creating users and querying users), the usage of the gem is low, and there is an issue open since May that no one has even looked at, which doesn't give me hope for long-term support.

trafilatura

13 2,740 8.4 Python

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Project mention: Trafilatura: Python tool to gather text on the Web | news.ycombinator.com | 2023-08-14

The feature list answers that question pretty well: https://github.com/adbar/trafilatura#features
Basically: you could implement all of this on top of BeautifulSoup - polite crawling policies, sitemap and feed parsing, URL de-duplication, parallel processing, download queues, heuristics for extracting just the main article content, metadata extraction, language detection... but it would require writing an enormous amount of extra code.
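
To give a sense of the "extra code" the comment refers to, here is a minimal sketch of just one item from that list, URL de-duplication, using only the standard library. It is not trafilatura's implementation; the function names and the choice to drop `utm_` tracking parameters are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: query parameters with these prefixes are
# tracking noise and do not change the page content.
TRACKING_PREFIXES = ("utm_",)


def normalize_url(url):
    """Return a canonical form of `url` for duplicate detection (hypothetical helper)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Treat "/news" and "/news/" as the same page; an empty path becomes "/".
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest so parameter order is irrelevant.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(pairs))
    # The fragment ("#top") never reaches the server, so it is discarded.
    return urlunsplit((scheme, netloc, path, query, ""))


def deduplicate(urls):
    """Keep the first occurrence of each URL, comparing normalized forms."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

Even this one feature needs decisions about trailing slashes, parameter order, and tracking parameters, which is the point the commenter is making about building everything yourself.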

news-please

1 1,925 7.3 Python

news-please - an integrated web crawler and information extractor for news that just works
pygooglenews

8 1,230 0.0 Python

If Google News had a Python library
GNews

26 532 7.0 Python

A happy and lightweight Python package that provides an API to search for articles on Google News and returns a JSON response.

Project mention: Need some help with my personal project (interactive world map with real-time data) | /r/datascience | 2023-05-15

The web crawling part wasn't much of an issue - I am using an existing API (https://pypi.org/project/gnews/) which does what I needed. The issue lies in, well, pretty much the rest of the task described above. I need to create an interactive world map with real-time data (news articles) - more specifically, maintaining the data server, figuring out the data mapping part, etc. Since I have pretty much no experience in this, I would like to ask you guys for some directions. What tool would I need to use and how would I store/load the data? Is it possible to do so without writing some JavaScript code myself?

GoogleNews

1 306 6.0 Python

Script for GoogleNews

Project mention: Python script that opens my bookmarks and returns only links posted in the last 14 days | /r/learnpython | 2023-05-07

Another option you could consider would be using a wrapper library around Google News if you struggle with implementing the scraping logic yourself. The downside is that you'll still have to be careful so your IP doesn't get blocked. Make sure you limit the number of requests per second/minute...
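
One simple way to follow the advice about limiting requests is to enforce a minimum delay between consecutive calls. The sketch below uses only the standard library; the class name `MinIntervalLimiter` and the two-second interval are illustrative assumptions, not part of any library mentioned here, and the interval that actually avoids blocking is not known.

```python
import time


class MinIntervalLimiter:
    """Ensure consecutive calls to wait() are at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        # clock and sleep are injectable so the limiter can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    limiter = MinIntervalLimiter(2.0)  # assumed value: at most one request every 2 s
    for query in ["python", "news", "scraping"]:
        limiter.wait()
        # A real script would call the news wrapper here; printing stands in for it.
        print("would fetch results for", query)
```

Calling `limiter.wait()` before every request keeps the request rate bounded regardless of how fast the surrounding loop runs.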

hn_summary

8 240 4.1 Python

Summarizes top stories from Hacker News using a large language model and posts them to a Telegram channel.

Project mention: Generative AI Market Analysis: People Love to Cum | news.ycombinator.com | 2023-09-19

interesting, GPT refuses to summarize this content: "I'm sorry, but I can't generate a summary for that content." per https://github.com/jiggy-ai/hn_summary & https://t.me/hn_summary

django-newsfeed

1 193 2.6 Python

A news curator and newsletter subscription package for Django
archiveis

1,477 170 0.0 Python

A simple Python wrapper for the archive.is capturing service

Project mention: Ask HN: Comments requesting paywall bypass links | news.ycombinator.com | 2024-04-18

I frequently see comments from people explicitly or implicitly asking for links to bypass the paywall on submitted articles. I'm confused by this, since it takes about the same amount of effort to generate your own paywall bypassing link as it does to post a comment asking for someone else to do it. Going further and posting this link for others to use does add a step, but doesn't seem like a lot to ask.
What's happening here?
Do these posters think some special magic is required? Are they not aware that creating such a link just involves going to the top-level domain of one of the services (e.g., http://archive.is) and pasting the URL into a form?
Are they opposed to the idea of creating such a link themselves, either due to moral qualms or legal fears, but willing to follow a link that someone else has created?
Are they using a handheld device that makes it so hard to copy a URL and open a new page that they don't know how to start, whereas they know how to write a comment?
Or are they just so entitled that they think someone else should provide for them at all times, and don't want to demean themselves helping others?
Can anyone who has posted such requests tell me what they were thinking? Can others who post bypass links tell me other explanations? General discussion on what the HN etiquette on paywall bypass links should be is welcomed as well.

news-fetch

1 166 0.0 Python

A Python package that helps scrape news details from any news website
savepagenow

456 164 7.0 Python

A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
newsnotfound

3 138 7.7 Python

Entire source code for NewsNotFound's article generation process ✍

Project mention: Speaking of AI image and text storytelling, there's now an AI-powered news website... | /r/behindthebastards | 2023-06-23

NewsNotFound website

JARVIS-GUI

12 69 5.2 Python

Jarvis is a simple chatbot with a GUI, written in Python, capable of chatting and retrieving information and daily news from the internet for its user.
newsemble

12 44 0.0 Python

API for fetching data from news websites.
wallabag-kindle-consumer

1 40 2.9 Python

Send all articles with a certain tag to your kindle.
nepstonks

4 22 8.2 Python

An automated bot that scrapes the latest upcoming issues, news, and investment opportunities announced in Nepal and sends them to a Telegram channel.
JapanDailyNews

1 14 9.7 Python
ailive

2 12 5.1 Python

AI Revolution
pressReadMePlease

1 10 6.1 Python

PressReader 🐍 automation for mobile apps auth token
python-client

2 8 5.2 Python

Newsdata.io Official Python Client (by bytesview)

Project mention: newscatcher VS python-client - a user suggested alternative | libhunt.com/r/newscatcher | 2024-02-09

NewsData provides more features than Newscatcher

YourDailyRundownBackend

1 3 8.3 Python

Flask-based backend for YourDailyRundown.

Project mention: YourDailyRundown | /r/SideProject | 2023-09-10

pastevents

4 3 10.0 Python

A structured, searchable archive of Wikipedia's "Current Events" portal

Project mention: 68k.news: Basic HTML Google News for Vintage Computers | news.ycombinator.com | 2023-06-16

I share the frustration with the major online news portals, and have in fact built my own portal powered by Wikipedia[1].
But eventually I realized that my biggest gripe with news today isn't the presentation but the content. And I'm not talking about biases or sensationalism – I'm talking about the news items themselves.
Much of what passes as news today is stuff like "15 people die when a copper mine collapses in Chile". I'm trying to get a big picture view of the world, and I don't believe that such stories are at all conducive to that endeavor. News as we know it is just an endless stream of random events, apparently selected according to a handful of crude criteria, the most important one being dead people. I've been a keen follower of global news for many years, and I don't feel that I'm understanding anything.
Where are the truly novel approaches to painting a picture of what the world is today? Where are the quantitative news portals, the event pattern search engines, the automatically derived trends? I'm still looking.
[1] https://pastevents.org

NOTE: The open source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).

Python News related posts

Ask HN: Comments requesting paywall bypass links
1 project | news.ycombinator.com | 18 Apr 2024
Feathers Are One of Evolution's Cleverest Inventions
1 project | news.ycombinator.com | 18 Apr 2024
What will humans do if technology solves everything?
1 project | news.ycombinator.com | 14 Apr 2024
One Satellite Signal Rules Modern Life. What If Someone Knocks It Out?
1 project | news.ycombinator.com | 28 Mar 2024
"Dune" and the Delicate Art of Making Fictional Languages
1 project | news.ycombinator.com | 29 Feb 2024
newscatcher VS python-client - a user suggested alternative
2 projects | 9 Feb 2024
Stockman: The Destruction Of The American Middle Class
1 project | /r/economy | 11 Dec 2023

Index

What are some of the best open-source News projects in Python? This list will help you:

#	Project	Stars
1	newspaper	13,703
2	Stream-Framework	4,718
3	trafilatura	2,740
4	news-please	1,925
5	pygooglenews	1,230
6	GNews	532
7	GoogleNews	306
8	hn_summary	240
9	django-newsfeed	193
10	archiveis	170
11	news-fetch	166
12	savepagenow	164
13	newsnotfound	138
14	JARVIS-GUI	69
15	newsemble	44
16	wallabag-kindle-consumer	40
17	nepstonks	22
18	JapanDailyNews	14
19	ailive	12
20	pressReadMePlease	10
21	python-client	8
22	YourDailyRundownBackend	3
23	pastevents	3