I followed the comments from https://github.com/meilisearch/MeiliSearch/discussions/1523, took a look at the linked test cases and hacked something together. I don’t have the code on me to share but it’s fairly close to the example test cases to spin up an embedded instance.
I am not sure which link you are talking about; this one about LMDB and its memory usage works. Could you please open an issue on our documentation?
The big issue with compiling milli (Meilisearch's Rust search engine library) is that it uses LMDB. I noticed some possible smaller issues as well, but those can hopefully be worked out easily in the future. TL;DR: LMDB won't compile to WASI in the next few years, if ever. You need a WASM-friendly replacement. I looked into other options here but none are really suitable. Thus, the only idea I could come up with is making an LMDB polyfill that uses IndexedDB under the hood for web support. See here: https://github.com/meilisearch/heed/issues/162. I plan on making a PR for it at some point, but I have no clue when since it is a decently large feat. Side note: if you want to help, I would be happy to have it!
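To make the polyfill idea a bit more concrete, here is a rough sketch of the kind of seam it would need: a small ordered key-value interface that the engine codes against, with the storage behind it swappable. Everything here is hypothetical (`OrderedKv`, `MemoryKv`, and the method names are made up for illustration and are not heed's or LMDB's API), and the in-memory `BTreeMap` is only a stand-in for what an IndexedDB-backed implementation might look like.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal interface for an LMDB-like ordered store.
/// (Illustrative only: this is not heed's actual API.)
trait OrderedKv {
    /// Returns a copy of the value stored under `key`, if any.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
    /// Inserts or overwrites the value stored under `key`.
    fn put(&mut self, key: &[u8], value: &[u8]);
    /// Returns all entries whose key starts with `prefix`, in key order.
    fn prefix_scan(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>;
}

/// In-memory stand-in. A web build would swap in an IndexedDB-backed
/// type implementing the same trait (not shown here).
#[derive(Default)]
struct MemoryKv {
    map: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl OrderedKv for MemoryKv {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        self.map.get(key).cloned()
    }

    fn put(&mut self, key: &[u8], value: &[u8]) {
        self.map.insert(key.to_vec(), value.to_vec());
    }

    fn prefix_scan(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
        // Start at the first key >= prefix and stop once keys no longer match.
        self.map
            .range(prefix.to_vec()..)
            .take_while(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}
```

The sketch leaves out the hard parts, transactions and memory mapping in particular, which is where most of the real work would presumably be.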
Fortunately, our team is mature enough and involved enough in open source to help improve them. For example, we work closely with the maintainer of the Japanese tokenizer library, and we also forked the analytics-rust library, which is now also used by non-Meilisearch users!
LMDB is much more sane in the sense that it supports real ACID transactions, whereas RocksDB only offers savepoints. RocksDB is also heavy and consumes a lot more memory for a lot less read throughput. However, RocksDB has a much better parallel and concurrent write story: you can merge entries with merge functions and therefore write from multiple CPUs.
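For anyone unfamiliar with the merge-function idea mentioned above, here is a minimal sketch of the concept using only the standard library. It is not the RocksDB API; `MergeStore`, `merge_batch`, and `union_sorted` are hypothetical names. The point is that if the merge function is associative and commutative (here, a set union of document ids), batches built independently on different CPUs can be folded into the store in any order and give the same result.

```rust
use std::collections::BTreeMap;

/// Merge function: combines two id lists into a sorted, deduplicated union.
fn union_sorted(a: &[u32], b: &[u32]) -> Vec<u32> {
    let mut out: Vec<u32> = a.iter().chain(b.iter()).copied().collect();
    out.sort_unstable();
    out.dedup();
    out
}

/// Hypothetical store that folds incoming values into existing ones
/// with a merge function instead of overwriting them.
struct MergeStore {
    entries: BTreeMap<String, Vec<u32>>,
}

impl MergeStore {
    fn new() -> Self {
        MergeStore { entries: BTreeMap::new() }
    }

    /// Folds one writer's batch into the store, key by key.
    fn merge_batch(&mut self, batch: BTreeMap<String, Vec<u32>>) {
        for (key, operand) in batch {
            let merged = match self.entries.get(&key) {
                Some(existing) => union_sorted(existing, &operand),
                // Normalize the first operand too (sort and dedup).
                None => union_sorted(&[], &operand),
            };
            self.entries.insert(key, merged);
        }
    }
}
```

Merging a batch `{"rust": [3, 1]}` and a batch `{"rust": [2, 3], "lmdb": [7]}` in either order leaves the store with `rust` mapped to `[1, 2, 3]` and `lmdb` mapped to `[7]`.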
Also, WASI has extremely rudimentary emulated memory mapping support but I would hardly call it working. You can see the current implementation here, it is pretty short: https://github.com/WebAssembly/wasi-libc/blob/main/libc-bottom-half/mman/mman.c
500kB sounds like it could just be shipped to the client lazily? https://github.com/tinysearch/tinysearch
An option there is https://pagefind.app/ — not as fast as a persistent server but solves some of the deployment and bandwidth issues.
There are issues and pull requests, but I advise you to look at the milli folder in the Meilisearch repository; it’s where all the logic lives. We extensively use RoaringBitmaps, heed (the LMDB wrapper), and grenad when indexing.
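As a rough illustration of why bitmaps of document ids are handy in a search engine, here is a toy inverted index. It is a sketch under loud assumptions: it uses `BTreeSet<u32>` from the standard library as a stand-in for a compressed bitmap, the `InvertedIndex` type and its methods are hypothetical, and none of this reflects how milli is actually structured. Each word maps to the set of documents containing it, and a multi-word query is answered by intersecting those sets.

```rust
use std::collections::{BTreeSet, HashMap};

/// Stand-in for a compressed bitmap of document ids.
type DocIds = BTreeSet<u32>;

/// Maps each word to the set of documents containing it.
#[derive(Default)]
struct InvertedIndex {
    postings: HashMap<String, DocIds>,
}

impl InvertedIndex {
    /// Indexes a document by splitting on whitespace and lowercasing each word.
    fn add_document(&mut self, id: u32, text: &str) {
        for word in text.split_whitespace() {
            self.postings
                .entry(word.to_lowercase())
                .or_default()
                .insert(id);
        }
    }

    /// Returns the documents that contain every query word (logical AND).
    /// An empty query returns an empty set.
    fn search(&self, query: &str) -> DocIds {
        let mut result: Option<DocIds> = None;
        for word in query.split_whitespace() {
            let docs = self
                .postings
                .get(&word.to_lowercase())
                .cloned()
                .unwrap_or_default();
            result = Some(match result {
                None => docs,
                Some(acc) => acc.intersection(&docs).copied().collect(),
            });
        }
        result.unwrap_or_default()
    }
}
```

With real bitmaps the intersection step is a fast bitwise operation over compressed containers rather than a tree walk, which is presumably a large part of the appeal.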