elasticsearch-py
meilisearch-js
Metric | elasticsearch-py | meilisearch-js
---|---|---
Mentions | 21 | 15
Stars | 4,121 | 664
Growth | 0.8% | 3.5%
Activity | 8.7 | 8.7
Last commit | 5 days ago | 17 days ago
Language | Python | TypeScript
License | Apache License 2.0 | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
elasticsearch-py
- An alternative to Elasticsearch that runs on a few MBs of RAM
- Elastic Open Sources Their Endpoint Security Protection YARA Ruleset
- OpenSearch – open-source search and analytics based on Apache 2.0 Elasticsearch
And my bet is it's the one most are going to be using from now on. I used to think this was a fairly black and white issue, but now two things have coloured it for me.
Firstly, dick moves like this: https://github.com/elastic/elasticsearch-py/pull/1623
Secondly, I don't buy the argument from Elastic any more. Yes, the ethical thing to do when you're making money from someone's work is at least contribute back. At the same time though, they're making money from packaging it up and selling it _as a service_. That "as a service" part is where they're making the bucks.
A bonus thirdly; OpenSearch really is Open Source, and ElasticSearch no longer is.
FD: I have a friend who works at Elastic, though he doesn't really colour my opinions of things.
> Firstly, dick moves like this: https://github.com/elastic/elasticsearch-py/pull/1623
I understand that this is unpopular, but you can make a very strong argument that it's to prevent weird errors in the future. I'm also guilty of littering my code with Asserts to ensure the universe is working fine.
The alternative is to allow it to work and then you end up with weird issues like when you connect mysql client to mariadb server (and vice-versa): https://stackoverflow.com/questions/50169576/mysql-8-0-11-er...
> Secondly, I don't buy the argument from Elastic any more. Yes, the ethical thing to do when you're making money from someone's work is at least contribute back. At the same time though, they're making money from packaging it up and selling it _as a service_. That "as a service" part is where they're making the bucks.
That's just an opinion. Yes, they have a service, and yes, it competes with Amazon. Is it cool for Amazon to take a body of work and sell it without supporting it? Is Amazon actually supporting it? Is it the same as Elastic using Lucene? (Not really, because Elastic submits the majority of fixes to Lucene, but you get it.)
It's kinda gray. I'm sure Amazon thinks they're the good guy, but it's hard for me to look at Elastic as the bad guy in all this.
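To make the "assert the universe is fine" argument above concrete, here is a minimal sketch of a client-side compatibility check. It is not the code from the linked pull request: the `verify_server` function, the `UnsupportedProductError` class, and the `product`/`version` keys of the info dictionary are all hypothetical names chosen for illustration.

```python
class UnsupportedProductError(Exception):
    """Raised when the server is not the product or version the client expects."""


def verify_server(info: dict, expected_product: str, min_version: tuple) -> None:
    """Fail fast instead of letting an incompatible server cause odd errors later.

    `info` stands in for whatever the server reports about itself; the
    "product" and "version" keys are assumptions made for this sketch.
    """
    product = info.get("product")
    if product != expected_product:
        raise UnsupportedProductError(
            f"expected {expected_product!r}, server reports {product!r}"
        )

    # Turn "7.14.0" into (7, 14, 0) so versions compare numerically.
    version = tuple(int(part) for part in info.get("version", "0").split("."))
    if version < min_version:
        raise UnsupportedProductError(
            f"server version {version} is older than required {min_version}"
        )
```

The trade-off is the one debated in the thread: a check like this turns subtle incompatibilities into one clear error at connection time, but it also rejects servers that might have worked fine.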
- I Don't Think Elasticsearch Is a Good Logging System
Oh man, https://github.com/elastic/elasticsearch-py/issues/1734 is a disappointing read. I know ES wants to save their business, but alienating users isn't exactly the path to success.
- Official Elasticsearch Python library no longer works with open-source forks
- Elasticsearch adding code to reject connections to OpenSearch clusters or to clusters running open source distributions of ES7
meilisearch-js
- Show HN: Podcastsaver.com – a search engine testbench dressed as a podcast site
If you remove the URLs from indexing, it'll generally save a ton of space and will be much, much faster to index. We are thinking about not indexing URLs by default; you can help us by explaining your use case here -> https://github.com/meilisearch/product/discussions/553
Just a detail: if you're running `du -sh` on your computer, the size on disk will stay unchanged because we are doing soft deletion ;). Don't worry. It will be physically deleted after a while, if you need the space in the future.
If you kept the default configuration of Meilisearch, the maximum size of the HTTP payload is 100 MB (for security). You can change it here -> https://docs.meilisearch.com/learn/configuration/instance_op...
addDocumentsInBatches() is just a helper to send your big JSON array in multiple parts; not absolutely sure you'll need it. (Code -> https://github.com/meilisearch/meilisearch-js/blob/807a6d827...)
Thanks! I removed the URLs and now the searchable attributes are only title, description and some author fields!
> Just a detail: if you're running `du -sh` on your computer, the size on disk will stay unchanged because we are doing soft deletion ;). Don't worry. It will be physically deleted after a while, if you need the space in the future.
Ah I was just wildly undershooting the size I gave the PVC! I just gave it much more and it's fine -- right now it's resting around 19Gi of usage, which is actually a bit of a problem considering the data set was only like 4GB or something like that originally. That said, disk is really not an issue so I'll just throw more at it, maybe leave it at 32GB and call it a day (it's around 1.6MM documents out of ~2MM), so shouldn't be too much more.
> If you kept the default configuration of Meilisearch, the maximum size of the HTTP payload is 100 MB (for security). You can change it here -> https://docs.meilisearch.com/learn/configuration/instance_op...
Thanks for this, I'll keep this in mind -- so I could actually pass off HUGE chunks to Meilisearch.
It seems like the larger the chunk the more efficient? There didn't seem to be much of a change in how much time it took to work through a chunk of documents, more just that having lots of smaller chunks would go slower. I started off with 10k in a batch, then went to 1k then back to 5k, maybe I should go to 100k docs in a batch and see the performance.
There's a blog post waiting to be written in here...
> addDocumentsInBatches() is just a helper to send your big JSON array in multiple parts; not absolutely sure you'll need it. (Code -> https://github.com/meilisearch/meilisearch-js/blob/807a6d827...)
Thanks! Was this something someone requested? Is there a tangible benefit (were there some customers that didn't want to split up the payloads themselves)? Because it seems like unnecessary cruft in the API otherwise.
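For readers wondering what "sending a big JSON array in multiple parts" amounts to, here is a rough sketch of the idea in Python. It is not the meilisearch-js implementation: `in_batches` and `payload_bytes` are hypothetical helpers, and the batch sizes are just the ones mentioned in the thread.

```python
import json
from typing import Iterator, List


def in_batches(documents: List[dict], batch_size: int = 1000) -> Iterator[List[dict]]:
    """Yield consecutive slices of `documents`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(documents), batch_size):
        yield documents[start:start + batch_size]


def payload_bytes(batch: List[dict]) -> int:
    """Size of the batch once serialized as UTF-8 JSON.

    Useful for checking a batch against a server-side payload limit
    before sending it.
    """
    return len(json.dumps(batch).encode("utf-8"))
```

With 2,500 documents and `batch_size=1000`, `in_batches` yields batches of 1,000, 1,000 and 500 documents. Comparing `payload_bytes(batch)` against the configured payload limit is one way to pick a batch size as large as the server will accept.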
- What do you use for e-commerce search?
You could use Meilisearch: https://www.meilisearch.com/
- Official /r/rust "Who's Hiring" thread for job-seekers and job-offerers [Rust 1.61]
COMPANY: Meilisearch, here is our website and Github repository.
- What are your Most Used Self Hosted Applications?
Meilisearch - Provides search for the main BookStack static site/docs/blog.
- 8 Open Source Projects for Your Ecommerce Stack
Meilisearch is an open source search engine that adds highly performant search to any website or app, including ecommerce stores.
- Review: Saleor vs Medusa Two Opensource Headless Ecommerce Platforms
Medusa allows you to integrate any search engine of your choice into the platform. It already integrates with search systems like Meilisearch or Algolia to provide an intuitive search experience for the customers.
- Build Your Own E-Commerce Keystone.js-Based System — Requirements and Architecture
Not so long ago I was working on a system based on Keystone.js CMS. But there it was used in a much more sophisticated way than just as a basic headless CMS. I was easily able to extend it with a search engine (Rust-based Meilisearch) and connect it to external APIs.
- OpenSearch – open-source search and analytics based on Apache 2.0 Elasticsearch
Only semi-related, but I've recently started using https://www.meilisearch.com/. It's relatively limited, but works great for small use cases. It's also pretty easy to operate. I'm hoping as it continues to grow it will support more features and use cases. I don't think the creators intend to address the same depth of complex features in ElasticSearch (and the like), but that's a desirable attribute in my opinion.
What are some alternatives?
searxng - SearXNG is a free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled.
quickwit - Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
Directus - The Modern Data Stack 🐰 — Directus is an instant REST+GraphQL API and intuitive no-code data collaboration app for any SQL database.
Vue Storefront - Alokai is a Frontend as a Service solution that simplifies composable commerce. It connects all the technologies needed to build and deploy fast & scalable ecommerce frontends. It guides merchants to deliver exceptional customer experiences quickly and easily.
helm-charts
evtx2es - A library for fast parse & import of Windows Eventlogs into Elasticsearch.
qryn - qryn is a polyglot, high-performance observability framework for ClickHouse. Ingest, store and analyze logs, metrics and telemetry traces from any agent supporting Loki, Prometheus, OTLP, Tempo, Elastic, InfluxDB and many more formats and query transparently using Grafana or any other compatible client.
zeek-clickhouse
git-imerge - Incremental merge for git
Saleor - Saleor Core: the high performance, composable, headless commerce API.
orama - 🌌 Fast, dependency-free, full-text and vector search engine with typo tolerance, filters, facets, stemming, and more. Works with any JavaScript runtime, browser, server, service!
google-api-nodejs-client - Google's officially supported Node.js client library for accessing Google APIs. Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.