jina
Weaviate
Metric | jina | Weaviate |
---|---|---|
Mentions | 126 | 76 |
Stars | 19,807 | 9,181 |
Growth | 1.0% | 5.5% |
Activity | 9.2 | 10.0 |
Last commit | 7 days ago | 7 days ago |
Language | Python | Go |
License | Apache License 2.0 | BSD 3-clause "New" or "Revised" License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
jina
- FLaNK Stack Weekly for 30 Oct 2023
-
Cross data type search that wasn’t supported well using Elasticsearch
Jina mainly because of their use of neural networks and AI.
-
I plan to build my own AI powered search engine for my portfolio. Do you know ones that are open-source?
Jina - It’s an open-source project where you can build search engines. Well, maybe not no-code, but it claims that you only need a few lines of code to create projects. The project supports semantic, text, image, audio, and video search. I’m also interested in their neural search and generative AI, and in the number of GitHub repos they have. I have this on my radar since this is also something I was interested in.
-
How can we match images in our database?
Do you guys have any ideas how we can match images in our database? We’re working on a project that is about matching images in our database. We were trying to use SIFT and some other similar methods, but for some reason, nothing seems to be working that well. Does anyone have any suggestions for the most effective way to do this? Maybe some open-source solutions like HuggingFace or Jina AI? We just want to make sure our image matching is correct, and that part’s been a bit of a struggle for us.
-
Any MLOps platform you use?
Jina AI - They offer a neural search solution that can help build smarter, more efficient search engines. They also have a list of cool GitHub repos that you can check out. Similar to Vertex AI, they have image classification tools, NLP tools, fine-tuners, etc.
-
This week(s) in DocArray
Well, it's not exactly a new feature, but we've been working on early support for DocArray v2 in Jina.
-
Multi-model serving options
Jina lets you serve all of your models through the same Gateway while deploying them as individual microservices. You can also tie your models together in a pipeline if needed. There are also some nice ML-focused features such as dynamic batching.
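To illustrate the dynamic batching idea mentioned above, here is a minimal sketch in plain Python. It does not use Jina's API: `DynamicBatcher`, `fake_model`, and `preferred_batch_size` are hypothetical names chosen for this example. The point is only that individual requests are buffered and the model is called once per batch rather than once per request.

```python
from typing import Callable, List


class DynamicBatcher:
    """Toy batcher: buffers single requests and runs the model once per batch."""

    def __init__(self, model: Callable[[List[str]], List[int]], preferred_batch_size: int):
        self.model = model
        self.preferred_batch_size = preferred_batch_size
        self.pending: List[str] = []   # requests waiting for a batch
        self.results: List[int] = []   # outputs in submission order
        self.model_calls = 0           # how many times the model actually ran

    def submit(self, item: str) -> None:
        """Queue one request; run the model when the batch is full."""
        self.pending.append(item)
        if len(self.pending) >= self.preferred_batch_size:
            self.flush()

    def flush(self) -> None:
        """Run the model on whatever is pending (no-op if nothing is queued)."""
        if not self.pending:
            return
        self.results.extend(self.model(self.pending))
        self.model_calls += 1
        self.pending = []


def fake_model(batch: List[str]) -> List[int]:
    # Hypothetical stand-in for a real model: returns the length of each input.
    return [len(text) for text in batch]


batcher = DynamicBatcher(fake_model, preferred_batch_size=4)
for text in ["a", "bb", "ccc", "dddd", "eeeee"]:
    batcher.submit(text)
batcher.flush()  # a real server would typically also flush after a timeout

print(batcher.results, batcher.model_calls)
```

Five requests are served with two model calls here: one full batch of four and one flushed remainder.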
-
Image matching within database? [P]
You should check out https://github.com/jina-ai/jina and https://github.com/jina-ai/finetuner
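One common approach behind suggestions like this is to compare images by their embedding vectors instead of hand-crafted features such as SIFT. The sketch below assumes the embeddings have already been produced by some model; the file names and three-dimensional vectors are made up for illustration, and `cosine_similarity` and `rank_matches` are hypothetical helpers, not functions from Jina or Finetuner.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_matches(
    query: List[float], database: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the top_k database entries most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Made-up embeddings; real ones would come from an image model and be much longer.
image_embeddings = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9],
}
query_embedding = [1.0, 0.0, 0.0]

matches = rank_matches(query_embedding, image_embeddings, top_k=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is fine for small collections; vector databases exist to make the same lookup fast on large ones.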
- Image Similarity Score using transfer learning
-
I want to dive into how to make search engines
What kinda thing do you want to search? Text I guess? But there are search engines for images, gifs, video, all kinds of stuff.
I'm working at an open-source project that builds an AI-powered search framework [0], and I've built some examples in very few lines of code (for searching fashion products via image or text [1], PDF text/images/tables search [2]) and one of our community members built a protein search engine [3].
A good place to start might be with a no-code solution like (shameless self-plug time) Jina NOW [4], which lets you build a search engine and GUI with just one CLI command.
Weaviate
-
pgvecto.rs alternatives - qdrant and Weaviate
3 projects | 13 Mar 2024
- FLaNK Stack 29 Jan 2024
- Qdrant, the Vector Search Database, raised $28M in a Series A round
-
How to use Weaviate to store and query vector embeddings
In this tutorial, I introduce Weaviate, an open-source vector database, with the thenlper/gte-base embedding model from Alibaba, through Hugging Face's transformers library.
-
Choosing vector database: a side-by-side comparison
This will be solved in Weaviate https://github.com/weaviate/weaviate/issues/2424
-
Who's hiring developer advocates? (October 2023)
-
Do we think about vector dbs wrong?
Hey @rvrs, I work on Weaviate and we are doing some improvements around increasing write throughput:
1. gRPC. Using gRPC to write vectors has had a really nice performance boost. It is released in Weaviate core, but there is still some work to do on the clients. Feel free to get in contact if you would like to try it out.
2. Parameter tuning. Lowering `efConstruction` can speed up imports.
3. We are also working on async indexing https://github.com/weaviate/weaviate/issues/3463 which will further speed things up.
In comparison with pgvector, Weaviate has more flexible query options such as hybrid search and quantization to save memory on larger datasets.
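As a rough illustration of the parameter tuning point above, the sketch below builds a class definition that sets `efConstruction` under the HNSW `vectorIndexConfig`. The class name `Article` and the value 64 are assumptions for this example, and `build_class_definition` is a hypothetical helper; check the Weaviate documentation for the exact schema fields and defaults in your version. A lower value generally makes index construction faster at some cost in recall.

```python
import json


def build_class_definition(class_name: str, ef_construction: int) -> dict:
    """Build a class definition dict with a custom HNSW efConstruction value."""
    if ef_construction <= 0:
        raise ValueError("ef_construction must be a positive integer")
    return {
        "class": class_name,       # hypothetical class name
        "vectorizer": "none",      # vectors are supplied by the client
        "vectorIndexType": "hnsw",
        "vectorIndexConfig": {
            # Illustrative value: lower means faster imports, possibly lower recall.
            "efConstruction": ef_construction,
        },
    }


definition = build_class_definition("Article", ef_construction=64)
payload = json.dumps(definition, indent=2)
print(payload)
```

The resulting JSON is the kind of body that would be sent when creating the class, before any data is imported.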
-
Pros and cons of vector search in elastic?
Highly opinionated as I'm working for Weaviate, so take my comment with a large portion of salt.
My highly opinionated view is that Elastic is not really open source, and the Lucene ecosystem's dependency on Java is a big disadvantage. As you already said, speed is an issue; they're getting better at this, but if you need to scale, this problem scales with you.
So if you already have the ELK stack and don't need to scale, sure, go for it. Otherwise, Weaviate offers real open source, so you can use it for free on your own infrastructure https://github.com/weaviate/weaviate
-
Lost on LangChain: Can someone help with the Question Answer concept?
If you do not wish to store your private data on Pinecone, you can use open-source alternatives like Weaviate, where you can spin up your own instance. Another option could be to use Agents. You'll need to find a suitable agent for your database that will allow LLMs to directly query data from your private database.
-
Questions about memory, tree-of-thought, planning
I tried chromadb but had terrible performance and could not pin down the cause (likely a problem on my end). Weaviate was easy to set up and had excellent performance; this is probably what I will use in the future. Next on my list is txtinstruct. Fine-tuning a model with data that does not change and using a vector db for everything else seems promising.
What are some alternatives?
Milvus - A cloud-native vector database, storage for next generation AI applications
faiss - A library for efficient similarity search and clustering of dense vectors.
pgvector - Open-source vector similarity search for Postgres
qdrant - Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
haystack - LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
dalle-flow - 🌊 A Human-in-the-Loop workflow for creating HD images from text
whoogle-search - A self-hosted, ad-free, privacy-respecting metasearch engine
vald - Vald. A Highly Scalable Distributed Vector Search Engine
ChatterBot - ChatterBot is a machine learning, conversational dialog engine for creating chat bots
es-clip-image-search - Sample implementation of natural language image search with OpenAI's CLIP and Elasticsearch or Opensearch.
marqo - Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
go - The Go programming language