Did you know Nokogiri now has opt-in HTML5 parsing?

This page summarizes the projects mentioned and recommended in the original post on /r/ruby

  • Nokogiri

    Nokogiri (鋸) makes it easy and painless to work with XML and HTML from Ruby.

    "RFC: Explore alternatives to libxml2 for HTML parsing · Issue #2064 · sparklemotion/nokogiri" is the original discussion that ended with the decision to merge Nokogumbo into Nokogiri.

  • Ruby on Rails

    Ruby on Rails

    I noticed this because a recent This Week in Rails (the official newsletter from the Rails team) mentioned the Rails PR "Update Action View to use HTML5 standards-compliant sanitizers", which in turn mentioned that Nokogiri now has opt-in HTML5 parsers (CRuby only; not supported on JRuby).

  • loofah

    Ruby library for HTML/XML transformation and sanitization

    The loofah and rails-html-sanitizer gems follow Nokogiri's lead and offer opt-in HTML5 parsing (via Nokogiri) through separate HTML5 classes. If you use the existing default API, you still get HTML4 parsing.

  • rails-html-sanitizer

    The loofah and rails-html-sanitizer gems follow Nokogiri's lead and offer opt-in HTML5 parsing (via Nokogiri) through separate HTML5 classes. If you use the existing default API, you still get HTML4 parsing.

  • rails-dom-testing

    Extracting DomAssertions and SelectorAssertions from ActionView.

    Switching to HTML5 parsing for everything is a very good idea, but in a lot of cases it's not straightforward because (blurgh) people tend to write unit tests that assert on the exact output string. Discourse is currently dealing with this, and I've started work to upgrade Mastodon and am hitting similar problems. (Worth noting that Rails provides a test helper, assert_dom_equal, that should cover the majority of use cases.)
