milli vs sage

milli

Search engine library for Meilisearch ⚡️ (by meilisearch)

search-engine Lmdb

DISCONTINUED

sage

Proteomics search & quantification so fast that it feels like magic (by lazear)

Bioinformatics proteomics mass-spectrometry

Source Code

sage-docs.vercel.app

Project         milli               sage
Mentions        8                   5
Stars           462                 194
Growth          -                   -
Activity        9.0                 7.7
Latest Commit   about 1 year ago    5 days ago
Language        Rust                Rust
License         MIT License         MIT License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

milli

Posts with mentions or reviews of milli. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2022-11-05.

Proteomics search engine written in Rust
5 projects | /r/rust | 5 Nov 2022

Is this a posting list? There is a lot of bioinformatics in this post, but if I squint, some of the problems do look like general information retrieval problems. Even the discussion of ordering the arrays by mass sounds like search relevance scores and makes me wonder if it makes sense to try to get something off the shelf like meilisearch/milli or tantivy to support this use case.
Zettelkasten Options
13 projects | /r/emacs | 19 Feb 2022

I'm currently not using any tool, although I am playing around with binding Milli and the most up-to-date Mentat fork to emacs with emacs-module-rs.
Meilisearch, the Rust search engine, just raised $5M
7 projects | /r/rust | 27 Jan 2022

Yeah, we have already done that, the internal engine is called milli and could even be published on crates.io one day! The issue is with the design of the storage system itself, we use LMDB right now but maybe we can find another way to index faster and to be more oriented to distributed systems.
MeiliSearch: A Minimalist Full-Text Search Engine
8 projects | news.ycombinator.com | 15 Aug 2021

They have another prototype engine with more advanced features and performance too.
https://github.com/meilisearch/milli
MeiliSearch v0.21, the long-awaited update of our search engine in Rust is out!
2 projects | /r/rust | 2 Jul 2021

You can look at the milli repository; this is the library that we use and work on. MeiliSearch is the HTTP actix-web based server that serves the milli indices.
MeiliSearch needs your help, an undefined behavior can be the cause of a strange bug
1 project | /r/rust | 24 Jun 2021
What's everyone working on this week (17/2021)?
8 projects | /r/rust | 26 Apr 2021

This library is the main bottleneck of the new MeiliSearch search engine. We will soon release a beta version, keep watching!
What’s everyone working on this week (16/2021)?
10 projects | /r/rust | 19 Apr 2021

Working on the new MeiliSearch engine, reworked from scratch! There are already excellent external contributions 🎉

sage

Posts with mentions or reviews of sage. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-06-06.

Does anyone know a great guide/documentation explaining how to implement Percolator?
2 projects | /r/proteomics | 6 Jun 2023

If you want to implement LDA from scratch, you could check out how Sage is doing it.
What are some good examples of well-engineered bioinformatics pipelines?
8 projects | /r/bioinformatics | 5 Apr 2023

You could check out https://github.com/lazear/sage - it's a near-comprehensive program/pipeline for analyzing DDA/shotgun proteomics data. Most proteomics pipelines consist of running multiple, separate tools in sequence (search, spectrum rescoring, retention time prediction, quantification), but sage performs all of these. This cuts down on the need for disk space for storing intermediate results (none required), the need for IO (files are read once), and results in a proteomics pipeline that is 10-1000x faster than anything else, including commercial solutions.
Proteomics search engine written in Rust
5 projects | /r/rust | 5 Nov 2022

You can also check out the intro blog post if you're interested in learning more about the algorithm behind Sage. Beyond being fast, it also includes integrated machine learning (linear discriminant analysis, KDE) for rescoring spectral matches.
Opinions on AlphaPept
2 projects | /r/proteomics | 30 Oct 2022

You could try out Sage, if you're looking for speed - I don't think you'll find anything faster. https://github.com/lazear/sage

What are some alternatives?

When comparing milli and sage you can also consider the following projects:

Typesense - Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences

rnaseq - RNA sequencing analysis pipeline using STAR, RSEM, HISAT2 or Salmon with gene/isoform counts and extensive quality control.

tantivy - Tantivy is a full-text search engine library inspired by Apache Lucene and written in Rust

seqkit - A cross-platform and ultrafast toolkit for FASTA/Q file manipulation

pgroonga - PGroonga is a PostgreSQL extension to use Groonga as index. PGroonga makes PostgreSQL fast full text search platform for all languages!

fasten - Fasten toolkit, for streaming operations on fastq files

vespa - AI + Data, online. https://vespa.ai

mokapot - Fast and flexible semi-supervised learning for peptide detection in Python

prost - PROST! a Protocol Buffers implementation for the Rust Language

juicer - A One-Click System for Analyzing Loop-Resolution Hi-C Experiments

heed - A fully typed LMDB wrapper with minimum overhead 🐦

Rust-Bio - This library provides implementations of many algorithms and data structures that are useful for bioinformatics. All provided implementations are rigorously tested via continuous integration.