Metric | mbfc_crawler | kimuraframework
---|---|---
Mentions | 1 | 5
Stars | 16 | 1,000
Growth | - | -
Activity | 0.5 | 0.0
Last commit | almost 4 years ago | 9 months ago
Language | Ruby | Ruby
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mbfc_crawler
-
Media Bias/Fact Check datasets or APIs?
I haven't tried it, but I found a crawler that should create a JSON file of the data: https://github.com/JeffreyATW/mbfc_crawler. I may look into other data sources in the coming weeks and will update if I see anything decent.
kimuraframework
-
Tanakai 1.6.0 (web scraping gem) has been released with support for Ruby 3+
Tanakai intends to be a maintained fork of Kimurai, a modern web scraping framework written in Ruby that works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.
-
Headless Browser for Web Scraping: Usage Features
Kimurai is a Web Scraping framework for Ruby with headless browser functionality. Supported browsers: Chromium and Firefox. Supported programming languages: Ruby.
-
Long life to Tanakai, a fork of Kimurai (a modern web scraping framework written in Ruby)
I find Kimurai quite useful, but it's sad to see it go without any support for more than 2 years. That's why I've decided to fork it.
-
Web scraping with rails
I've worked with https://github.com/vifreefly/kimuraframework in the past, and it was delightful.
-
10 Best Open Source Web Scraping Tools
Here is how simple it is to work with infinite-scroll web pages: https://github.com/vifreefly/kimuraframework
What are some alternatives?
manga2pdf - Simple Ruby script to download manga and merge the images into a single pdf file. Available with both CLI and GUI.
football_api - A Ruby interface to the API at https://www.api-football.com.
tanakai - Tanakai is a modern web scraping framework written in Ruby. A fork of Kimurai.
Nokogiri - Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby.
spidr - A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Playwright - Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.
apparition - Capybara driver for Chrome using CDP
puppeteer - Node.js API for Chrome
ferrum - Headless Chrome Ruby API
flatfish
cheerio - The fast, flexible, and elegant library for parsing and manipulating HTML and XML.