Ruby Scraper

Open-source Ruby projects categorized as Scraper

Top 10 Ruby Scraper Projects

  • Huginn

    Create agents that monitor and act on your behalf. Your agents are standing by!

    Project mention: Ask HN: What is the correct way to deal with pipelines? | news.ycombinator.com | 2023-09-21

    "correct" is a value judgement that depends on lots of different things. Only you can decide which tool is correct. Here are some ideas:

    - https://camel.apache.org/

    - https://www.windmill.dev/

    - https://github.com/huginn/huginn

    Your idea about a queue (in redis, or postgres, or sqlite, etc) is also totally valid. These off-the-shelf tools I listed probably wouldn't give you a huge advantage IMO.

  • Wombat

    Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

  • kimuraframework

    Kimurai is a modern web scraping framework written in Ruby that works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.

    Project mention: Tanakai 1.6.0 (web scraping gem) has been released with support to Ruby 3+ | /r/ruby | 2023-02-16

    Tanakai intends to be a maintained fork of Kimurai, a modern web scraping framework written in Ruby that works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.

  • spidr

    A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use. (by postmodern)

  • tanakai

    Tanakai is a modern web scraping framework written in Ruby. A fork of Kimurai.

    Project mention: Tanakai: Modern web scraping framework written in Ruby | news.ycombinator.com | 2023-10-25

  • html2rss

    📰 Build RSS 2.0 feeds from websites (and JSON APIs) with a few CSS selectors.

  • html2rss-web

    🕸 Generates and delivers RSS feeds via HTTP. Docker image available! Create your own feeds or get started quickly with the included configs.

    Project mention: Ask HN: What RSS Reader do you use in 2022? | news.ycombinator.com | 2022-12-23

  • nhkore

    🇯🇵📰🗻 NHK News Web (Easy) word frequency (core list) scraper for Japanese language learners.

  • rails-urltohtml

    A simple Rails scraper app that counts the HTML tags on a web page.

  • chanCrawler

    A simple gem that crawls chans and retrieves visual content.

NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020). The latest post mention was on 2023-10-25.

Index

What are some of the best open-source Scraper projects in Ruby? This list will help you:

#    Project           Stars
1    Huginn            39,866
2    Wombat             1,298
3    kimuraframework      990
4    spidr                770
5    tanakai              250
6    html2rss             102
7    html2rss-web          70
8    nhkore                13
9    rails-urltohtml        5
10   chanCrawler            4