scrapeghost
newsboat
scrapeghost | newsboat | |
---|---|---|
10 | 54 | |
1,396 | 2,799 | |
- | 0.9% | |
8.2 | 9.5 | |
5 months ago | 12 days ago | |
Python | C++ | |
GNU General Public License v3.0 or later | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrapeghost
-
Those of you who have developed product features using GPT4 API (or failed to do so), how did it go?
Not my project but an ex-colleague has been having some success in this direction: https://jamesturk.github.io/scrapeghost/
-
What are the best tools for web scraping and analysis of natural language to populate a dataset?
Yes, there is something like that available - ScrapeGhost.
- FLaNK Stack Weekly 3 April 2023
- Scraping Websites Using GPT
-
@TwitterDev Announces New Twitter API Tiers
With AI scraping, tools can be far more resilient than soon enough to minor dom changes. See - https://jamesturk.github.io/scrapeghost/.
-
Experimental library for scraping websites using OpenAI's GPT API
Their ToS mentions scraping but it pertains to scraping their frontend instead of using their API, which they don't want you to do.
Also - this library requests the HTML by itself [0] and ships it as a prompt but with preset system messages as the instruction [1].
[0] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...
[1] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...
- scrapeghost. Web scrape using gpt-4 (experimental)
newsboat
-
RSS is still pretty great
If you're using https://newsboat.org, you can add a filter (killfile) to remedy this:
ignore-article "*" "title =~ \"#shorts\""
-
Open Thread: Weekend Edition #27 (Jun 2023)
I use newsboat.
-
Style Your RSS Feed
> Have you used any modern RSS reader recently like inoreader, they load the content of the page without visiting the publishing website.
I'm happy with newsboat[1]; but I'm not surprised that people have integrated scraping into RSS readers.
Fundamentally, that's not a problem with RSS, that's a war between scrapers and content providers. If the email newsletter model persists long enough, I'd expect that people will come out with "newsletter readers" that scrape websites too.
I'm not sure there's a good long-term solution to the problem. Aside from constant vigilance (obfuscation).
---
1. https://newsboat.org/
- [Open Source] Lecteur RSS multiplateforme
-
Following cricket scores from the terminal using Cricinfo’s RSS feeds
So, I installed a terminal RSS reader called Newsboat and added the feed to it. I have it always running in a terminal, and the scores refresh every minute. I can open the Cricinfo link in a browser by selecting a match and typing o.
-
Autoreload only some of the feeds
Not at the moment. There is an open Github issue asking for that feature, however, no idea if/when that will be implemented: https://github.com/newsboat/newsboat/issues/904
- FLaNK Stack Weekly 3 April 2023
-
Ad Blocking
Here's part of my newsboat config (works great for subscriptions):
-
Programs that don't work in Windows
Newsboat RSS / Feed reader
-
Libro di Tecnologia di mio cugino parla dei feed rss, una tecnologia molto utile che oggi però non esiste più.
Che rss feader usate? Io per il momento uso Newsboat su Desktop e Feeder su Mobile
What are some alternatives?
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python
nitter - Alternative Twitter front-end
tmx-solver - ThreatMetrix (anti-bot/fraud-detection) solver, deobfuscator & data harvester
elfeed - An Emacs web feeds client
wikipedia_ql - Query language for efficient data extraction from Wikipedia
Tiny-Tiny-RSS - A PHP and Ajax feed reader
Bandwhich - Terminal bandwidth utilization tool
RSS-Bridge - The RSS feed for websites missing it
bpytop - Linux/OSX/FreeBSD resource monitor
FreshRSS - A free, self-hostable news aggregator…
exiftool - ExifTool meta information reader/writer
rss-proxy - RSS-proxy allows you to do create an RSS or ATOM feed of almost any website, just by analyzing just the static HTML structure.