-
nokolexbor
High-performance HTML5 parser for Ruby based on Lexbor, with support for both CSS selectors and XPath.
-
Lexbor can also be used from Python: https://github.com/rushter/selectolax
-
It seems to have an in-tree libxml 2.11 for XPath support, which was released in April 2023. Almost every second libxml release comes with a CVE, so I'm curious whether there are plans to upgrade the libxml version, since, like nokogiri, it doesn't use the system libxml.
One of the reasons I still use nokogiri is that it puts a lot of effort into keeping libxml updated: https://github.com/sparklemotion/nokogiri/releases
-
selma
Selma selects and matches HTML nodes using CSS rules. Backed by Rust's lol_html parser. (by gjtorikian)
You may also be interested in https://github.com/gjtorikian/selma for high-performance HTML manipulation. It's built on Rust, specifically Cloudflare's lol_html parser.