Show HN: RSS feeds for arbitrary websites using CSS selectors

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com.

  • RSSHub

    🧡 Everything is RSSible

  • Related:

    https://github.com/DIYgod/RSSHub

    This perhaps has more flexibility and can deal with almost any website.

  • Ah, you mean the ones inside the contents? That's a good one. I'm not sure if that's easily fixable, but I'll give it some thought. For those interested, I'll track it here: https://gitlab.com/vincenttunru/feed-me-up-scotty/-/issues/1

  • Playwright

    Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

  • I'd recommend using the tools I used for this directly if you're looking to do this. Playwright in particular: https://playwright.dev

  • mlscraper

    🤖 Scrape data from HTML websites automatically by just providing examples

  • In case anyone wants to detect the selectors automatically, here's a small python library I wrote that does it for you: https://github.com/lorey/mlscraper

  • Puts Debuggerer

    Ruby library for improved puts debugging, automatically displaying bonus useful information such as source line number and source code.

  • Ah, those instructions are unclear; as far as I know, you first have to go to https://github.com//feeds/actions to enable Workflows for your repository. Then, your feeds should be published to https://.github.io/feeds/.xml.

    Does that work?

  • rssify

    Tool that generates an rss feed out of websites that don't have one

  • Since everyone is pitching their own, I built https://github.com/fran-penedo/rssify, which started as a fork of https://github.com/h43z/rssify. The basic functionality is similar to Vinnl's: give it a URL and some selectors and it builds the RSS feed. From this, I added a few things: templates (if you want to subscribe to individual projects within a webpage, like fanfics in ao3), transforms (when the data is not quite the text of the DOM element), a flask server you can use to add new URLs you have a template for and update the feeds, and a userscript to add the current URL using the server.
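
    The "give it a URL and some selectors" idea described above can be sketched with the Python standard library alone. This is not rssify's actual code: `ClassSelectorParser` and `build_rss` are hypothetical names, the page HTML is assumed to be already downloaded, and only the equivalent of the selector `a.<class>` is supported rather than full CSS.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET


class ClassSelectorParser(HTMLParser):
    """Collect (title, link) for every <a> tag that carries a given class.

    Hypothetical helper: it stands in for a real CSS selector engine and
    only handles the equivalent of the selector `a.<class_name>`.
    """

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.items = []      # finished (title, link) pairs
        self._href = None    # href of the matching <a> we are inside, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and self.class_name in classes:
            self._href = attrs.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None


def build_rss(page_html, class_name, feed_title, feed_link):
    """Return an RSS 2.0 document with one <item> per matching link."""
    parser = ClassSelectorParser(class_name)
    parser.feed(page_html)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Generated from " + feed_link
    for title, link in parser.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
    return ET.tostring(rss, encoding="unicode")
```

    A real tool would more likely use a full selector engine and resolve relative links against the page URL; both are left out here to keep the sketch short.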

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • rssify

    script that generates an rss feed out of websites that don't have one (by h43z)

  • subscriptions-digest

    Simple project to automate the generation of digest emails for personal subscriptions.

  • This could nicely supplement my GitHub automation that emails feed digests https://github.com/mhitza/subscriptions-digest

    As in my repository, I would suggest an option to fetch the configuration file from an external resource defined via an action secret. For my automation I'm using a Gist (not sure if GitLab has the same thing, i.e. private but publicly accessible snippets).

    At least that way you can keep your own feed configuration while allowing those that fork the repository to not have to manually fix conflicts within the feeds.toml config.

  • feedgen

    Generates RSS/Atom/JSON feeds. It can be extended with custom generators, or it can create a feed using the CSS generator.

  • Kinda on a related note I found myself needing to make a bunch of these sorts of scraped feeds. The problem for me was the lack of date parsing support which I sorely needed.

    I ended up writing my own CLI tool that similarly supports CSS selectors for feed generation: https://github.com/dayzerosec/feedgen

    I did write it specifically for my use-case so there are some "warts" on it like custom generators for HackerOne and Google's Monorail bug tracker. But perhaps someone else might benefit from its ability to create slightly more complicated RSS, Atom, or JSON feeds.

    Example config with date parsing: https://github.com/dayzerosec/feedgen/blob/main/configs/bish...
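
    To illustrate why date parsing matters for scraped feeds: RSS 2.0 expects `pubDate` in RFC 822 style, while pages print dates in arbitrary formats. The sketch below is not feedgen's implementation (feedgen is a separate CLI tool); `to_pub_date` is a hypothetical helper, and it assumes the site's date format is known in advance and that dates without a zone are UTC.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_pub_date(raw, fmt, tz=timezone.utc):
    """Convert a scraped date string into an RFC 822 style pubDate value.

    raw: the text taken from the page, e.g. "March 5, 2021"
    fmt: a strptime format describing that text (assumed known per site)
    tz:  zone to assume when the text carries none (assumption: UTC)
    """
    parsed = datetime.strptime(raw.strip(), fmt)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=tz)
    return format_datetime(parsed)


print(to_pub_date("March 5, 2021", "%B %d, %Y"))
# Fri, 05 Mar 2021 00:00:00 +0000
```

    Month names in `%B` are matched according to the current locale, so a site that prints dates in another language would need a different approach.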

  • furss

    Fix Up RSS (and atom): Make full-text versions of rss/atom feeds

  • My effort in this space is "furss", though it starts from an rss feed then aims to scrape the full article instead of an extract. https://github.com/jepler/furss

  • ttrss_plugin-feediron

    Evolution of ttrss_plugin-af_feedmod

  • Always good to see RSS projects pop up on hackernews. I'm still maintaining the Feediron plugin for TT-RSS - https://github.com/feediron/ttrss_plugin-feediron

    Unlike this project, Feediron only modifies existing RSS feeds to extract the desired information. It typically uses XPath expressions to select content.
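
    The approach described above, starting from an existing feed and using XPath to pull the wanted content out of each linked page, can be sketched roughly as follows. This is not Feediron's code (Feediron is a TT-RSS plugin): `extract_by_xpath` and `fill_descriptions` are hypothetical names, the pages are assumed to be already fetched and well-formed XHTML, and only the XPath subset supported by Python's `xml.etree.ElementTree` is available.

```python
from xml.etree import ElementTree as ET


def extract_by_xpath(xhtml, xpath):
    """Return the text of the first node matching xpath, or None if absent."""
    root = ET.fromstring(xhtml)
    node = root.find(xpath)
    if node is None:
        return None
    return "".join(node.itertext()).strip()


def fill_descriptions(feed_xml, pages, xpath):
    """Replace each item's <description> with content selected from its page.

    pages maps an item's <link> to the already-fetched XHTML of that page
    (fetching is left out of this sketch).
    """
    feed = ET.fromstring(feed_xml)
    for item in feed.iter("item"):
        page = pages.get(item.findtext("link"))
        if page is None:
            continue  # no page available for this item, leave it unchanged
        text = extract_by_xpath(page, xpath)
        if text is None:
            continue  # selector matched nothing, keep the original extract
        desc = item.find("description")
        if desc is None:
            desc = ET.SubElement(item, "description")
        desc.text = text
    return ET.tostring(feed, encoding="unicode")
```

    Real pages are rarely well-formed XML, so a practical version would need a tolerant HTML parser in front of the XPath step.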

  • track-changes

    (Discontinued) A JSON HTTP server that tracks other webpages to see if a certain query selector has changed

  • HungryHippo

    🦛 scrapes websites and generates rss feeds

  • It seems that RSS feed generators are a bit like static site generators: it's often thought to be easier to make your own than to learn to use someone else's.

    Anyway, here's another self-hosted open source RSS feed generator for arbitrary websites: https://github.com/hueyy/HungryHippo
