HungryHippo
mlscraper
Metric | HungryHippo | mlscraper
---|---|---
Mentions | 2 | 10
Stars | 46 | 1,229
Growth | - | -
Activity | 5.4 | 0.6
Last commit | 4 months ago | about 2 months ago
Language | TypeScript | Python
License | GNU General Public License v3.0 or later | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
HungryHippo
-
Show HN: RSS feeds for arbitrary websites using CSS selectors
It seems that RSS feed generators are a bit like static site generators: it's often thought to be easier to make your own than to learn to use someone else's.
Anyway, here's another self-hosted open source RSS feed generator for arbitrary websites: https://github.com/hueyy/HungryHippo
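To make the idea concrete, here is a minimal Python sketch of the general technique: pick elements out of a page, then emit them as RSS items. It is not HungryHippo's code (that project is written in TypeScript). The names `LinkCollector` and `build_rss` are hypothetical, and the class-name match is a hand-rolled stand-in for a CSS selector such as `a.headline`, since the standard library has no selector engine.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LinkCollector(HTMLParser):
    """Collect title/link pairs from <a> tags that carry a given class.

    Hypothetical helper: a stand-in for the CSS selector `a.<class_name>`.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.items = []
        self._href = None  # href of the matching <a> that is currently open
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.class_name in classes:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        # Only keep text that sits inside a matching <a>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.items.append({"title": title, "link": self._href})
            self._href = None


def build_rss(title, link, description, items):
    """Serialize already-extracted items as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
    return ET.tostring(rss, encoding="unicode")


# Example page (made up for illustration).
page = (
    '<ul>'
    '<li><a class="headline" href="https://example.com/a">First post</a></li>'
    '<li><a href="/about">About</a></li>'
    '<li><a class="headline big" href="https://example.com/b">Second post</a></li>'
    '</ul>'
)
collector = LinkCollector("headline")
collector.feed(page)
feed = build_rss("Example site", "https://example.com", "Generated feed", collector.items)
```

A real generator would also need fetching, dates, and GUIDs; this only shows the extract-then-serialize shape.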
-
Looking for a website changes monitor plus notifications.
HungryHippo generates RSS feeds for websites that don't have one. If the website you're trying to monitor is public, it might be supported already. If not, you can always send in a pull request or open an issue. It can be hosted with a single Docker image, so it's quite straightforward.
mlscraper
-
What are the best tools for web scraping and analysis of natural language to populate a dataset?
See if something like autoscraper or mlscraper suits your needs.
-
Experimental library for scraping websites using OpenAI's GPT API
Why GPT-based then? There are libraries that do this: You give examples, they generate the rules for you and give you a scraper object that takes any html and returns the scraped data.
Mine: https://github.com/lorey/mlscraper
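The "give examples, get a scraper" workflow described above can be illustrated with a toy sketch. This is not mlscraper's API or algorithm; the functions `train` and `scrape` are hypothetical, and the "rule" learned here is simply the tag and class of the element holding the example value, which is far cruder than what a real library would infer.

```python
from html.parser import HTMLParser


class _TextByElement(HTMLParser):
    """Record (tag, class, text) for every piece of text in the page."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements as (tag, class)
        self.records = []  # (tag, class, text) triples

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, dict(attrs).get("class") or ""))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerates unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            tag, cls = self.stack[-1]
            self.records.append((tag, cls, text))


def _records(html):
    parser = _TextByElement()
    parser.feed(html)
    parser.close()
    return parser.records


def train(html, example):
    """Learn a (tag, class) rule from one page and one example value."""
    for tag, cls, text in _records(html):
        if text == example:
            return (tag, cls)
    raise ValueError("example value not found in page")


def scrape(html, rule):
    """Apply a learned rule to any page and return the matching texts."""
    return [text for tag, cls, text in _records(html) if (tag, cls) == rule]


# Train on one page with a known value, then scrape an unseen page.
train_page = '<div><h1 class="name">Ada Lovelace</h1><p class="bio">Mathematician</p></div>'
new_page = '<div><h1 class="name">Alan Turing</h1><p class="bio">Computer scientist</p></div>'
rule = train(train_page, "Ada Lovelace")
result = scrape(new_page, rule)
```

The point of the design is that the user never writes a selector: the rule comes from the example, and the same rule is reused on pages with the same layout.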
-
Could someone recommend me a library for c# like one of these two (they are for python) : mlscraper and autoscraper
GitHub - lorey/mlscraper: 🤖 Scrape data from HTML websites automatically by just providing examples
-
Smart Scraper
Check it out here: https://github.com/lorey/mlscraper Example: https://github.com/lorey/mlscraper/blob/master/examples/quotes_to_scrape.py
- Pre-trained Webscraping Models
- 🤖 Scrape data from HTML websites automatically by just providing examples
- mlscraper: Scrape data from HTML pages automatically with Machine Learning
-
Show HN: RSS feeds for arbitrary websites using CSS selectors
In case anyone wants to detect the selectors automatically, here's a small Python library I wrote that does it for you: https://github.com/lorey/mlscraper
What are some alternatives?
feed-me-up-scotty
scrapingant-client-python - ScrapingAnt API client for Python.
feedgen - Generates RSS/ATOM/JSON feeds. Can be reasonably extended or create a feed using the CSS generator.
ttrss_plugin-feediron - Evolution of ttrss_plugin-af_feedmod
rssify - script that generates an rss feed out of websites that don't have one
furss - Fix Up RSS (and atom): Make full-text versions of rss/atom feeds
RSSHub - 🧡 Everything is RSSible
urlwatch - Watch (parts of) webpages and get notified when something changes via e-mail, on your phone or via other means. Highly configurable.
rssify - Tool that generates an rss feed out of websites that don't have one
telegram-to-rss - Telegram Bot to generate an RSS feed from group messages