rum vs ripgrep

| | rum | ripgrep |
|---|---|---|
| Mentions | 11 | 348 |
| Stars | 693 | 45,040 |
| Growth | 0.7% | - |
| Activity | 4.0 | 9.3 |
| Latest Commit | 4 months ago | 10 days ago |
| Language | C | Rust |
| License | GNU General Public License v3.0 or later | The Unlicense |

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

rum

Posts with mentions or reviews of rum. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-10.

Code Search Is Hard
13 projects | news.ycombinator.com | 10 Apr 2024
the rum index has worked well for us on roughly 1TB of pdfs. written by postgrespro, same folks who wrote core text search and json indexing. not sure why rum not in core. we have no problems.
```
   https://github.com/postgrespro/rum
```
Is it worth using Postgres' builtin full-text search or should I go straight to Elastic?
2 projects | /r/PostgreSQL | 25 Apr 2023

If you need ranking, and you have the possibility to install PostgreSQL extensions, then you can consider an extension providing RUM indexes: https://github.com/postgrespro/rum. Otherwise, you'll have to use an "external" FTS engine like ElasticSearch.
Features I'd Like in PostgreSQL
14 projects | news.ycombinator.com | 28 Jan 2023

>Reduce the memory usage of prepared queries
Yes query plan reuse like every other db, this still blows me away PG replans every time unless you explicitly prepare and that's still per connection.
Better full-text scoring is one for me that's missing in that list, TF/IDF or BM25 please see: https://github.com/postgrespro/rum
Ask HN: Books about full text search
3 projects | news.ycombinator.com | 24 Nov 2022
for postgres, i highly recommend the rum index over the core fts. rum is written by postgrespro, who also wrote core fts and json indexing in pg.
```
    https://github.com/postgrespro/rum
```
Postgres Full Text Search vs. the Rest
21 projects | news.ycombinator.com | 14 Oct 2022

My experience with Postgres FTS (did a comparison with Elastic a couple years back), is that filtering works fine and is speedy enough, but ranking crumbles when the resulting set is large.
If you have a large-ish data set with lots of similar data (4M addresses and location names was the test case), Postgres FTS just doesn't perform.
There is no index that helps scoring results. You would have to install an extension like RUM index (https://github.com/postgrespro/rum) to improve this, which may or may not be an option (often not if you use managed databases).
If you want the best of both worlds, you could investigate this extension (again, often not an option for managed databases): https://github.com/matthewfranglen/postgres-elasticsearch-fd...
Either way, writing something that indexes your postgres database into elastic/opensearch is a one time investment that usually pays off in the long run.
Postgres Full-Text Search: A Search Engine in a Database
3 projects | news.ycombinator.com | 11 Jul 2022

Mandatory mention of the RUM extension (https://github.com/postgrespro/rum) if this caught your eye. Lots of tutorials and conference presentations out there showcasing the advantages in terms of ranking, timestamps...

10 projects | news.ycombinator.com | 27 Jul 2021

You might be just fine adding an unindexed tsvector column, since you've already filtered down the results.
The GIN indexes for FTS don't really work in conjunction with other indices, which is why https://github.com/postgrespro/rum exists. Luckily, it sounds like you can use your existing indices to filter and let postgres scan for matches on the tsvector.
Postgrespro/rum: RUM access method – inverted index with additional information
1 project | news.ycombinator.com | 17 Dec 2021
Debugging random slow writes in PostgreSQL
1 project | news.ycombinator.com | 15 May 2021

We have been bitten by the same behavior. I gave a talk with a friend about this exact topic (diagnosing GIN pending list updates) at PGCon 2019 in Ottawa[1][2].
What you need to know is that the pending list will be merged with the main b-tree during several operations. Only one of them is truly critical for your insert performance: the merge that happens during the actual insert. Both vacuum and autovacuum (including autovacuum analyze but not direct analyze) will merge the pending list, so frequent autovacuums are the first thing you should tune. Merging on insert happens when you exceed the gin_pending_list_limit. In all cases it is also useful to know which memory parameter governs the index rebuild, as that impacts how long it will take: work_mem (when triggered on insert), autovacuum_work_mem (when triggered during autovacuum) and maintenance_work_mem (when triggered by a call to gin_clean_pending_list()) define how much memory can be used for the rebuild.
What you can do is:
- tune the size of the pending list (like you did)
- make sure vacuum runs frequently
- if you have a bulk-insert-heavy workload (i.e. nightly imports), drop the index and recreate it after inserting the rows (this does not always make sense business-wise; it depends on your app)
- disable fastupdate: you pay a higher cost per insert but remove the fluctuation when the merge needs to happen
The first thing was done in the article. However, I believe the author still relies on the list being merged on insert. If vacuums were tuned aggressively along with the limit (vacuums can be tuned per table), then the list would be merged out of band of ongoing inserts.
I also had the pleasure of speaking with one of the main authors of GIN indexes (Oleg Bartunov) during the mentioned PGCon. He gave probably the best solution and told me to "just use RUM indexes". RUM[3] indexes are like GIN indexes, without the pending list and with faster ranking, faster phrase searches and faster timestamp-based ordering. It is, however, outside the main PostgreSQL release, so it might be hard to get it running if you don't control the extensions that are loaded into your Postgres instance.
[1] - video https://www.youtube.com/watch?v=Brt41xnMZqo&t=1s
[2] - slides https://www.pgcon.org/2019/schedule/attachments/541_Let's%20...
[3] - https://github.com/postgrespro/rum
Show HN: Full text search Project Gutenberg (60m paragraphs)
5 projects | news.ycombinator.com | 24 Jan 2021

I suggest having a look at https://github.com/postgrespro/rum if you haven’t yet. It solves the issue of slow ranking in PostgreSQL FTS.

ripgrep

Posts with mentions or reviews of ripgrep. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-17.

Ask HN: What software sparks joy when using?
10 projects | news.ycombinator.com | 17 Apr 2024

ripgrep - https://github.com/BurntSushi/ripgrep
Code Search Is Hard
13 projects | news.ycombinator.com | 10 Apr 2024

Basic code searching seems like something new developers are never explicitly taught, but it is an absolutely crucial skill to build early on.
I guess the knowledge progression I would recommend would look something like this:
- Learning about Ctrl+F, which works basically everywhere.
- Transitioning to ripgrep https://github.com/BurntSushi/ripgrep - I wouldn't even call this optional, it's truly an incredible and very discoverable tool. Requires keeping a terminal open, but that's a good thing for a newbie!
- Optional, but highly recommended: Learning one of the powerhouse command line editors. Teenage me recommended Emacs; current me recommends vanilla vim, purely because some flavor of it is installed almost everywhere. This is so that you can grep around and edit in the same window.
- In the same vein, moving back from ripgrep and learning about good old fashioned grep, with a few flags rg uses by default: `grep -r` for recursive search, `grep -ri` for case insensitive recursive search, and `grep -ril` for case insensitive recursive "just show me which files this string is found in" search. Some others too, season to taste.
- Finally hitting the wall with what ripgrep can do for you and switching to an actual indexed, dedicated code search tool.
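
The three grep invocations mentioned in the list above can be tried out with a minimal sketch like the one below. The directory layout, file names, and search string are made up for illustration; only the `-r`, `-i`, and `-l` flags come from the post.

```shell
# Build a tiny throwaway tree to search (hypothetical file names and contents).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/src"
printf 'TODO: fix parser\n' > "$demo_dir/src/a.txt"
printf 'nothing here\n'     > "$demo_dir/src/b.txt"
printf 'todo: write docs\n' > "$demo_dir/notes.txt"

# -r: recurse into directories. The match is case sensitive,
# so only the upper-case TODO line is found.
recursive_hits="$(grep -r 'TODO' "$demo_dir" | wc -l | tr -d ' ')"

# -ri: recursive and case insensitive, so both TODO and todo match.
insensitive_hits="$(grep -ri 'todo' "$demo_dir" | wc -l | tr -d ' ')"

# -ril: same search, but print only the names of the files that match.
matching_files="$(grep -ril 'todo' "$demo_dir" | sort)"

echo "case-sensitive hits:   $recursive_hits"
echo "case-insensitive hits: $insensitive_hits"
echo "$matching_files"

# Remove the throwaway tree.
rm -rf "$demo_dir"
```

The `-l` form is useful as a first pass on a large tree: it answers "which files mention this at all" before you open any of them.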
Level Up Your Dev Workflow: Conquer Web Development with a Blazing Fast Neovim Setup (Part 1)
12 projects | dev.to | 16 Mar 2024

live grep: ripgrep
Ripgrep
1 project | news.ycombinator.com | 25 Feb 2024
Modern Java/JVM Build Practices
9 projects | news.ycombinator.com | 4 Jan 2024

The world has moved on though to opinionated tools, and Rust isn't even the furthest in that direction (That would be Go). The equivalent of those two lines in Cargo.toml would be this example of a basic configuration from the jacoco-maven-plugin: https://www.jacoco.org/jacoco/trunk/doc/examples/build/pom.x... - That's 40 lines in the section to do the "defaults".
Yes, you could add a load of config for files to include/exclude from coverage and so on, but the idea that that's a norm is way more common in Java projects than other languages. Like here's some example Cargo.toml files from complicated Rust projects:
Servo: https://github.com/servo/servo/blob/main/Cargo.toml
rust-gdext: https://github.com/godot-rust/gdext/blob/master/godot-core/C...
ripgrep: https://github.com/BurntSushi/ripgrep/blob/master/Cargo.toml
socketio: https://github.com/1c3t3a/rust-socketio/blob/main/socketio/C...
Ugrep – a more powerful, ultra fast, user-friendly, compatible grep
27 projects | news.ycombinator.com | 30 Dec 2023

I'm not clear on why you're seeing the results you are. It could be because your haystack is so small that you're mostly just measuring noise. ripgrep 14 did introduce some optimizations in workloads like this by reducing match overhead, but I don't think it's anything huge in this case. (And I just tried ripgrep 13 on the same commands above and the timings are similar if a tiny bit slower.)
[1]: https://github.com/radare/ired
[2]: https://github.com/BurntSushi/ripgrep/discussions/2597
Tell HN: My Favorite Tools
14 projects | news.ycombinator.com | 24 Dec 2023
Boosting Your Linux Experience: Meet the Rust Tools for Efficient Development
5 projects | dev.to | 12 Dec 2023

Explore Ripgrep in the official repository: https://github.com/BurntSushi/ripgrep
Scrybble is the ReMarkable highlights to Obsidian exporter I have been looking for
9 projects | /r/RemarkableTablet | 7 Dec 2023

🔎🗃️ ripgrep or ugrep (search fast, use regex patterns or fuzzy search, pipe output to bash/zsh shell for further processing / coloring)
RFC: Add ngram indexing support to ripgrep (2020)
2 projects | news.ycombinator.com | 30 Nov 2023

What are some alternatives?

When comparing rum and ripgrep you can also consider the following projects:

postgres-elasticsearch-fdw - Postgres to Elastic Search Foreign Data Wrapper

telescope-live-grep-args.nvim - Live grep with args

recoll - recoll with webui in a docker container

fd - A simple, fast and user-friendly alternative to 'find'

zombodb - Making Postgres and Elasticsearch work together like it's 2023

ugrep - ugrep 5.1: A more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more

pgvector - Open-source vector similarity search for Postgres

the_silver_searcher - A code-searching tool similar to ack, but faster.

pg_search - pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search

fzf - :cherry_blossom: A command-line fuzzy finder

pg_cjk_parser - Postgres CJK Parser pg_cjk_parser is a fts (full text search) parser derived from the default parser in PostgreSQL 11. When a postgres database uses utf-8 encoding, this parser supports all the features of the default parser while splitting CJK (Chinese, Japanese, Korean) characters into 2-gram tokens. If the database's encoding is not utf-8, the parser behaves just like the default parser.

alacritty - A cross-platform, OpenGL terminal emulator.
