Nokogiri Alternatives
Similar projects and alternatives to Nokogiri
- crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
- nokolexbor
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
- ROXML
ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.
Nokogiri discussion
Nokogiri reviews and mentions
- Nokolexbor: Drop-in replacement for Nokogiri. 5.2x faster at parsing HTML
It seems to have an in-tree libxml 2.11 for XPath support, which was released in April 2023. Almost every second libxml release comes with a CVE, so I'm curious whether there are plans to upgrade the libxml version, since it doesn't use the system libxml (same as nokogiri).
One of the reasons I still use nokogiri is because it puts a lot of effort into keeping libxml updated: https://github.com/sparklemotion/nokogiri/releases
- How we made a Ruby method 200x faster
(2.1) Compare tagName in the selector
Apparently Nokogiri implements CSS matching in a very inefficient way, though: it collects ancestors, converts the CSS into XPath, and then matches that:
https://github.com/sparklemotion/nokogiri/blob/e8d30a71d70b2...
https://github.com/sparklemotion/nokogiri/blob/e8d30a71d70b2...
I'd expect that to be an order of magnitude slower than what a browser does.
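To illustrate the CSS-to-XPath translation step the comment describes, here is a toy sketch in plain Ruby. It is not Nokogiri's implementation: `toy_css_to_xpath` is a hypothetical helper, and it assumes a tiny selector subset (tag names, `#id`, `.class`, and the descendant and child combinators).

```ruby
# Toy CSS-to-XPath translator (hypothetical helper, NOT Nokogiri's code).
# Assumed subset: tag names, #id, .class, descendant (space) and child (>) combinators.
def toy_css_to_xpath(selector)
  # Pad ">" with spaces so the child combinator becomes its own token.
  tokens = selector.gsub(/\s*>\s*/, " > ").split
  xpath = +""
  axis = "//" # the first step may match anywhere in the document

  tokens.each do |token|
    if token == ">"
      axis = "/" # next step must be a direct child
      next
    end

    # Leading tag name, or "*" when the token starts with "#" or ".".
    tag = token[/\A[a-zA-Z][\w-]*/] || "*"

    # Each "#id" or ".class" part becomes one XPath predicate.
    predicates = token.scan(/([#.])([\w-]+)/).map do |kind, name|
      if kind == "#"
        "[@id='#{name}']"
      else
        "[contains(concat(' ', normalize-space(@class), ' '), ' #{name} ')]"
      end
    end

    xpath << axis << tag << predicates.join
    axis = "//" # default back to the descendant axis
  end

  xpath
end

puts toy_css_to_xpath("ul li") # prints //ul//li
puts toy_css_to_xpath("div.item > a#main")
```

An XPath engine would then have to evaluate the generated expression against the document, which is the extra indirection the comment suspects makes this slower than a browser's direct selector matching.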
- 11 best open-source web crawlers and scrapers in 2024
Language: Ruby | GitHub: 6.1K+ stars
- Lexbor – an open source HTML Renderer library
[2] https://github.com/sparklemotion/nokogiri/issues/2204
- Web Scraping in Python – The Complete Guide
- Did you know Nokogiri now has opt-in HTML5 parsing?
release planning: v1.16.0 · Issue #2897 · sparklemotion/nokogiri
- As a Go developer, I’m surprised Crystal isn’t more popular
What's holding me back from going all in with Crystal is that I have a lot of pre-existing Ruby code, and porting Ruby code to Crystal can be tricky. For example, Crystal lacks an Enumerator class (aka generators) due to captured block semantics. I also wish the shards ecosystem was a little more mature; for example, there are multiple HTML parsing libraries, but none have all of the features that Ruby's Nokogiri has. For new greenfield backend projects, I would totally use Crystal.
- Two months into learning Ruby, it is the most beautiful language I ever learned
Welcome! Ruby isn't exactly "dying", but the hype/popularity is definitely fading. This is primarily because Ruby is no longer "new": most of Ruby's popularity came from Rails, and now Rails is no longer the "new hotness". However, Ruby still has lots of awesome features and lots of awesome other libraries and frameworks, such as the new fancy irb gem that uses reline, nokogiri, chunky_png, the async gems, Dragon Ruby, SciRuby, Ronin, and the new Hanami web framework.
- What should I be learning?
- Comparable maintained Kimurai alternative?
Stats
sparklemotion/nokogiri is an open source project licensed under the MIT License, which is an OSI-approved license.
The primary programming language of Nokogiri is C.