Nokogiri
ferrum
Metric | Nokogiri | ferrum
---|---|---
Mentions | 20 | 9
Stars | 6,100 | 1,642
Growth | 0.1% | 2.7%
Activity | 9.5 | 8.5
Latest commit | 5 days ago | 28 days ago
Language | C | Ruby
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Nokogiri
- Web Scraping in Python – The Complete Guide
-
Did you know Nokogiri now has opt-in HTML5 parsing?
release planning: v1.16.0 · Issue #2897 · sparklemotion/nokogiri
-
As a Go developer, I’m surprised Crystal isn’t more popular
What's holding me back from going all in with Crystal is that I have a lot of pre-existing Ruby code, and porting Ruby code to Crystal can be tricky. For example, Crystal lacks an Enumerator class (aka generators) due to captured block semantics. I also wish the shards ecosystem was a little more mature; for example, there are multiple HTML parsing libraries, but none have all of the features that Ruby's Nokogiri has. For new greenfield backend projects, I would totally use Crystal.
-
Two months into learning Ruby, it is the most beautiful language I ever learned
Welcome! Ruby isn't exactly "dying", but the hype/popularity is definitely fading. This is primarily because Ruby is no longer "new", most of Ruby's popularity came from Rails, and now Rails is no longer the "new hotness". However, Ruby still has lots of awesome features and lots of awesome other libraries and frameworks, such as the new fancy irb gem that uses reline, nokogiri, chunky_png, the async gems, Dragon Ruby, SciRuby, Ronin, and the new Hanami web framework.
- What should I be learning?
- Comparable maintained Kimurai alternative?
-
In "Your Name" (2016), Mitsuha and Tesshi are seen turning a tree into their makeshift café, which is why one of the trees in the town is later missing
great for hacking at xml
-
Ditch Your Version Manager
Mike has worked hard over the years to have Nokogiri come with its dependencies. It does come with libxml and all that is required.
From https://nokogiri.org
> These dependencies are met by default by Nokogiri's packaged versions of the libxml2 and libxslt source code, but a configuration option --use-system-libraries is provided to allow specification of alternative library locations.
Some authors work hard to have their tools do the right thing, consistently.
-
Web scraping with rails
If the page is rendered as html you can use Nokogiri. It has great support and is pretty easy to get started with too.
-
Nokogiri 1.12 supports HTML5 parsing (after assimilating Nokogumbo)
And even now, pulling in a Java-based HTML5 parser is still probably easier than re-implementing in FFI, which is why I created https://github.com/sparklemotion/nokogiri/issues/2227 and would love to have the conversation there if possible.
ferrum
- Generating PDFs in Rails using Grover
-
Learning Ruby Basics
What are you using for automation? There's a relatively new gem that I've heard good things about, vessel: https://github.com/rubycdp/vessel . It uses ferrum under the hood, a set of Ruby bindings to Chrome/Chromium (https://github.com/rubycdp/ferrum).
-
Ruby web scraping gem that can handle JS?
I've used https://github.com/rubycdp/ferrum as driver for automated testing with capybara for which it works great. It recommends https://github.com/rubycdp/vessel as higher level abstraction for web scraping.
-
Automating Jekyll card generation with ruby’s Ferrum gem
    require "rubygems"
    require "ferrum"

    # Render one card page and save it as a PNG screenshot.
    def generate_card(browser, card, png, options = {})
      browser.go_to("http://localhost:4000/cards/#{card}")
      # See all the options here: https://github.com/rubycdp/ferrum#screenshots
      browser.screenshot(
        path: "./images/cards/#{png}",
        full: true,
        scale: 2 # final image size is window_size x scale
      )
    end

    browser = Ferrum::Browser.new(window_size: [800, 418])

    # Check which cards we still need to make
    Dir.glob("_posts/*").each do |post|
      post = File.basename(post, ".md")
      png = post + ".png"
      card = post + ".html"
      # File.exists? was removed in Ruby 3.2; File.exist? works everywhere
      generate_card(browser, card, png) unless File.exist?("./images/cards/#{png}")
    end
- Best library for scraping dynamic page in Sidekiq background job (Selenium/Puppeteer/Cypress/Playwright)
- Ferrum – high-level API to control Chrome in Ruby
-
Migrating Selenium system tests to Cuprite
That is why we were happy to find out that a new ruby testing driver approach is being developed. It is called Cuprite, it runs the Ferrum library under the hood which, in turn, is an API that directly instruments the Chrome browser using the Chrome DevTools Protocol (CDP). About a week ago, we finally made a serious attempt to make our system test suite run on Cuprite, with especially two questions in our minds:
-
Web scraping with rails
I've used Ferrum for a couple of small scripts in the past.
-
My favorite Ruby gems
Ferrum
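To make the scraping use case mentioned above a little more concrete, here is a minimal sketch of reading text from a page through Ferrum. The helper names `first_text` and `scrape_heading` and the example URL are assumptions made for this illustration; only `Ferrum::Browser.new`, `go_to`, `at_css`, `text`, and `quit` are Ferrum calls. Running `scrape_heading` requires the ferrum gem and a local Chrome or Chromium install.

```ruby
# Visits `url` and returns the text of the first node matching `selector`,
# or nil when nothing matches. `browser` is expected to be a Ferrum::Browser;
# any object responding to go_to and at_css also works.
# first_text is an illustrative helper, not part of Ferrum.
def first_text(browser, url, selector)
  browser.go_to(url)
  node = browser.at_css(selector)
  node&.text
end

# Hypothetical wrapper that starts a headless browser, reads the first <h1>,
# and always shuts the browser down again.
def scrape_heading(url)
  require "ferrum" # loaded lazily so the helper above has no hard dependency
  browser = Ferrum::Browser.new
  begin
    first_text(browser, url, "h1")
  ensure
    browser.quit
  end
end
```

Calling `puts scrape_heading("https://example.com")` would print the page's first heading. Passing the browser in as an argument keeps `first_text` usable with an already-running session, for example inside a background job that reuses one browser for many pages.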
What are some alternatives?
Oga - Oga is an XML/HTML parser written in Ruby.
Selenium WebDriver - A browser automation framework and ecosystem.
Ox - Ruby Optimized XML Parser
cuprite - Headless Chrome/Chromium driver for Capybara
HTML::Pipeline - HTML processing filters and utilities
puppeteer - Headless Chrome Node.js API [Moved to: https://github.com/puppeteer/puppeteer]
Oj - Optimized JSON
Capybara - Acceptance test framework for web applications
ROXML - ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.
puffing-billy - A rewriting web proxy for testing interactions between your browser and external sites. Works with ruby + rspec.
HappyMapper - Object to XML mapping library, using Nokogiri (Fork from John Nunemaker's Happymapper)
puphpeteer - A Puppeteer bridge for PHP, supporting the entire API.