Nokogiri
Async Ruby
Metric | Nokogiri | Async Ruby
---|---|---
Mentions | 20 | 20
Stars | 6,100 | 1,983
Growth | 0.1% | 2.3%
Activity | 9.5 | 7.8
Last commit | 8 days ago | 7 days ago
Language | C | Ruby
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Nokogiri
- Web Scraping in Python – The Complete Guide
- Did you know Nokogiri now has opt-in HTML5 parsing?
release planning: v1.16.0 · Issue #2897 · sparklemotion/nokogiri
- As a Go developer, I’m surprised Crystal isn’t more popular
What's holding me back from going all in with Crystal is I have a lot of pre-existing Ruby code, and porting Ruby code to Crystal can be tricky. For example, Crystal lacks an Enumerator class (aka generators) due to captured block semantics. I also wish the shards ecosystem was a little more mature; for example there's multiple HTML parsing libraries, but none have all of the features that Ruby's Nokogiri has. For new greenfield backend projects, I would totally use Crystal.
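For readers unfamiliar with the Enumerator class mentioned in the post, here is a minimal sketch of how it works as a generator in Ruby. It uses only core Ruby; the Fibonacci sequence and the variable names are illustrative choices, not something taken from the post.

```ruby
# An Enumerator built from a block behaves like a generator:
# the block runs only as far as needed to produce each requested value.
fib = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a          # hand one value to the consumer
    a, b = b, a + b
  end
end

# Internal iteration: take stops the infinite sequence after 10 values.
first_ten = fib.take(10)

# Lazy chaining: filter the infinite sequence without materializing it.
first_evens = fib.lazy.select(&:even?).first(5)

# External iteration: pull values one at a time with #next.
pulled = [fib.next, fib.next, fib.next]

puts first_ten.inspect
puts first_evens.inspect
puts pulled.inspect
```

The block passed to `Enumerator.new` is captured and suspended between values, which is the "captured block" behavior the post says is hard to reproduce in Crystal.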
- Two months into learning Ruby, it is the most beautiful language I ever learned
Welcome! Ruby isn't exactly "dying", but the hype/popularity is definitely fading. This is primarily because Ruby is no longer "new", most of Ruby's popularity came from Rails, and now Rails is no longer the "new hotness". However, Ruby still has lots of awesome features and lots of awesome other libraries and frameworks, such as the new fancy irb gem that uses reline, nokogiri, chunky_png, the async gems, Dragon Ruby, SciRuby, Ronin, and the new Hanami web framework.
- What should I be learning?
- Comparable maintained Kimurai alternative?
- In "Your Name" (2016), Mitsuha and Tesshi are seen turning a tree into their makeshift café, which is why one of the trees in the town is later missing
great for hacking at xml
- Ditch Your Version Manager
Mike has worked hard over the years to have Nokogiri come with its dependencies. It does come with libxml and all that is required.
From https://nokogiri.org
> These dependencies are met by default by Nokogiri's packaged versions of the libxml2 and libxslt source code, but a configuration option --use-system-libraries is provided to allow specification of alternative library locations.
Some authors work hard to have their tools do the right thing and consistently.
- Web scraping with rails
If the page is rendered as HTML, you can use Nokogiri. It has great support and is pretty easy to get started with too.
- Nokogiri 1.12 supports HTML5 parsing (after assimilating Nokogumbo)
And even now, pulling in a Java-based HTML5 parser is still probably easier than re-implementing in FFI, which is why I created https://github.com/sparklemotion/nokogiri/issues/2227 and would love to have this conversation there if possible.
Async Ruby
- EventMachine Performance Spikes
The Async gem is the natural successor. It's actively maintained and lets you write non-blocking code as if it were synchronous. Most libraries don't need any special support for Async (the exceptions are gems with C extensions that do I/O, and DB libraries with connection pooling that would otherwise be thread-based).
- Philosophy of Coroutines
https://github.com/socketry/async uses coroutines and I think in general it’s been a great model with very few downsides in practice.
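As a rough illustration of the coroutine model the comment refers to, the sketch below uses Ruby's core `Fiber` class directly. It is not the async gem's API; the `worker` fiber and the log messages are made up for the example.

```ruby
# A fiber is a coroutine: it runs until it calls Fiber.yield,
# then stays suspended until the caller resumes it again.
log = []

worker = Fiber.new do
  log << "worker: step 1"
  Fiber.yield :paused_once     # hand control back to the caller
  log << "worker: step 2"
  Fiber.yield :paused_twice
  log << "worker: step 3"
  :done                        # the block's value is returned by the last resume
end

results = []
results << worker.resume       # runs up to the first Fiber.yield
log << "main: doing other work"
results << worker.resume       # continues from where the fiber paused
results << worker.resume       # runs the fiber to completion

puts log
puts results.inspect
puts worker.alive?
```

A scheduler built on this idea can resume a fiber when its I/O is ready instead of on demand, which is what lets blocking-looking code run concurrently.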
- Is ruby really slow?
There's async I/O. Here's a library that leans on Ruby 3's fiber scheduler.
- Show HN: Goru, an experimental, Go-inspired concurrency library for Ruby
Hey folks, wanted to show this off and get feedback. Still early/experimental but there are quite a few concepts I'm excited about here. This project came about while writing a program in Go and loving its approach to concurrency. Being a long-time Rubyist I immediately started to think about what similar concepts might look like in Ruby.
I set out with two main design constraints:
1. Lightweight: I didn't want routines to be backed by fibers or threads. Having been involved some in the async project (https://github.com/socketry/async), I had some experience using fibers for concurrency but was curious if they could be avoided.
2. Explicitness: Routine behavior must be written to describe exactly how it is to behave. I always felt like concurrent code was hard to fully understand because of the indirection involved. On the spectrum between tedium and magical I wanted to err more on the side of tedium with Goru.
Goru routines are just blocks that are called once for every tick of the reactor. It is up to the developer to implement behavior in terms of a state machine, where on each tick the routine takes some action and then updates the state of the routine for the next tick. This fulfills both design constraints:
1. Because routines are just blocks, they weigh in at about 345 bytes of memory overhead.
2. Routine behavior is explicit because it is written as a state machine inside the block.
Couple more features worth noting:
* Goru includes channels for buffered reading/writing (similar to channels in Go).
* Goru ships with primitives for non-blocking IO to easily build things like http servers.
Curious your thoughts!
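The idea of routines as blocks that are called once per reactor tick and drive an explicit state machine can be sketched in plain Ruby. This is a hypothetical toy, not Goru's actual API: `TinyReactor`, `go`, `run`, and the `:finished` return convention are all assumptions made for illustration.

```ruby
# Hypothetical toy reactor (NOT Goru's API): every routine is a block plus
# a state hash, and the block is called exactly once per tick.
class TinyReactor
  Routine = Struct.new(:state, :block, :done)

  def initialize
    @routines = []
  end

  # Register a routine with its initial state.
  def go(initial_state, &block)
    @routines << Routine.new(initial_state, block, false)
  end

  # Tick until every routine has returned :finished.
  def run
    until @routines.empty?
      @routines.each { |r| r.done = (r.block.call(r.state) == :finished) }
      @routines.reject!(&:done)
    end
  end
end

output = []
reactor = TinyReactor.new

# Routine 1: an explicit state machine, one action per tick.
reactor.go({ step: :greet, count: 0 }) do |state|
  case state[:step]
  when :greet
    output << "hello"
    state[:step] = :count
  when :count
    state[:count] += 1
    output << "count #{state[:count]}"
    state[:step] = :stop if state[:count] == 2
  when :stop
    output << "bye"
    :finished
  end
end

# Routine 2: runs for three ticks, interleaved with routine 1.
reactor.go({ ticks: 0 }) do |state|
  state[:ticks] += 1
  output << "tick #{state[:ticks]}"
  :finished if state[:ticks] == 3
end

reactor.run
puts output
```

Because each routine does one small step per tick and stores everything it needs in its state, no fiber or thread has to be suspended, which matches the lightweight and explicit goals described in the post.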
- Twitter (re)Releases Recommendation Algorithm on GitHub
- Simple MapReduce that melt my brain (yes, fibers there)
For those who are interested here is the question.
- How does Ruby handle parallel HTTP requests in separate threads?
- Two months into learning Ruby, it is the most beautiful language I ever learned
Welcome! Ruby isn't exactly "dying", but the hype/popularity is definitely fading. This is primarily because Ruby is no longer "new", most of Ruby's popularity came from Rails, and now Rails is no longer the "new hotness". However, Ruby still has lots of awesome features and lots of awesome other libraries and frameworks, such as the new fancy irb gem that uses reline, nokogiri, chunky_png, the async gems, Dragon Ruby, SciRuby, Ronin, and the new Hanami web framework.
- ruby has supported native async or not?
On GitHub, there is an Async gem (https://github.com/socketry/async).
- Efficient IO in Linux with io_uring [pdf]
What are some alternatives?
Oga - Oga is an XML/HTML parser written in Ruby.
Concurrent Ruby - Modern concurrency tools including agents, futures, promises, thread pools, supervisors, and more. Inspired by Erlang, Clojure, Scala, Go, Java, JavaScript, and classic concurrency patterns.
Ox - Ruby Optimized XML Parser
EventMachine - EventMachine: fast, simple event-processing library for Ruby programs
HTML::Pipeline - HTML processing filters and utilities
Polyphony - Fine-grained concurrency for Ruby
Oj - Optimized JSON
Celluloid - Actor-based concurrent object framework for Ruby
ROXML - ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.
Sequel - Sequel: The Database Toolkit for Ruby
HappyMapper - Object to XML mapping library, using Nokogiri (Fork from John Nunemaker's Happymapper)
net-ssh - Pure Ruby implementation of an SSH (protocol 2) client