finetuner
qdrant
Metric | finetuner | qdrant
---|---|---
Mentions | 36 | 122
Stars | 1,192 | 12,976
Growth | 4.9% | 8.1%
Activity | 0.0 | 9.6
Last commit | 2 months ago | 6 days ago
Language | Python | Rust
License | Apache License 2.0 | Apache License 2.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
finetuner
-
How can I create a dataset to refine Whisper AI from old videos with subtitles?
You can try creating your own dataset: collect the audio data you want, preprocess it, and build a custom dataset you can use to fine-tune. You could also use finetuners like these if you want.
-
A Guide to Using OpenTelemetry in Jina for Monitoring and Tracing Applications
We derived the dataset by pre-processing the deepfashion dataset using Finetuner. The image label generated by Finetuner is extracted and formatted to produce the text attribute of each product.
-
[D] Looking for an open source Downloadable model to run on my local device.
You can either use Hugging Face Transformers, which has a lot of pre-trained models that you can customize, or a finetuner like this one, which is a toolkit for fine-tuning multiple models.
-
Improving Search Quality for Non-English Queries with Fine-tuned Multilingual CLIP Models
Very recently, a few non-English and multilingual CLIP models have appeared, using various sources of training data. In this article, we’ll evaluate a multilingual CLIP model’s performance in a language other than English, and show how you can improve it even further using Jina AI’s Finetuner.
-
Classification using prompt or fine tuning?
You can try prompt-based classification or fine-tuning with a Finetuner. Prompts work well for simple tasks, but fine-tuning may give better results for complex ones, although it is going to need more resources. Try both and see what works best for you.
-
Asking questions about lengthy texts
If you've got a set of Q&A pairs for your 60-page lease or medical paper, you could use finetuners to help answer questions about the text. But if you don't have those pairs, fine-tuning might not be a good fit. Try summarizing the doc or extracting the info instead. And if you're hitting the token limit, try using a bigger model or breaking up the text into smaller pieces.
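The "breaking up the text into smaller pieces" suggestion can be sketched roughly as follows. This is a minimal illustration, not something from the original post: the function name `chunk_words`, the word-based splitting, and the `max_words` and `overlap` defaults are all assumptions, and a real pipeline would more likely count model tokens than whitespace-separated words.

```python
def chunk_words(text: str, max_words: int = 200, overlap: int = 20) -> list[str]:
    """Split text into chunks of at most max_words words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    Hypothetical helper for illustration only.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for piece in chunk_words(sample, max_words=4, overlap=1):
        print(piece)
```

Each chunk can then be sent to the model separately, with the per-chunk answers combined afterwards.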
-
What are the best Python libraries to learn for beginners?
Actually, further along in applying ML, I found Finetuner pretty handy for getting the last mile done.
-
Fine-tuning open source models to emulate ChatGPT for code explanation.
One option I’m considering is using fine tuners like the one from HuggingFace or Jina AI to fine-tune open source models like GPT-J or OPT to improve specific use-cases like code explanation. With the funding that we have, I wouldn’t want to cheap out on fine-tuning and expect something good.
-
Efficient way to tune a network by changing hyperparameters?
Off the top of my head, you can use Grid Search to test hyperparameter combinations, Random Search to sample hyperparameters at random, or Neural Search, which uses ML to optimize hyperparameter tuning. You can use finetuners for this as well.
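As a rough illustration of the first two options mentioned in that reply, here is a minimal sketch of grid search and random search using only the standard library. The objective `toy_score`, the parameter names `lr` and `batch_size`, and the candidate values are hypothetical stand-ins; in practice the score would come from training and validating a real model.

```python
import itertools
import random


def toy_score(lr: float, batch_size: int) -> float:
    """Hypothetical stand-in for a validation score (higher is better)."""
    return -((lr - 0.01) ** 2) * 1e4 - abs(batch_size - 32) / 32


def grid_search(grid: dict) -> tuple:
    """Evaluate every combination of the candidate values in the grid."""
    keys = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = toy_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(grid: dict, n_trials: int, seed: int = 0) -> tuple:
    """Evaluate n_trials combinations drawn at random from the grid."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {k: rng.choice(v) for k, v in grid.items()}
        score = toy_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    grid = {"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}
    print(grid_search(grid))
    print(random_search(grid, n_trials=4))
```

Grid search cost grows with the product of the candidate list sizes, while random search lets you fix the number of trials up front, at the price of possibly missing the best combination.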
-
Seeking advice on improving NLP search results
Back then, I came across some info about a self-supervised sentence embedding system that surpasses Sentence Transformers NLI models, but forgot where it was. You could use Jina’s Finetuner. It lets you boost your pre-trained models' performance, making them ready for production without having to spend a lot of time labeling or buying expensive hardware.
qdrant
-
Ask HN: Who is hiring? (September 2023)
Qdrant | REMOTE | Full-time | https://qdrant.tech | https://github.com/qdrant/qdrant
Qdrant is a leading open-source Vector Database provider.
We are looking for a Technical Writer, Integrations Engineer, Database Tester, and Developer Advocate(s).
All jobs https://qdrant.join.com
-
Show HN: Fast Vector Similarity Using Rust and Python
Awesome work!
At Qdrant we do this at scale: store billions of vectors in a cluster of any size. It is also written in Rust, which turned out to be an amazing choice, and it is fully open source. It uses various features to keep things performant, such as vectorization (multiple architectures), quantization (a form of compression), and more.
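To give a feel for what "quantization (a form of compression)" can mean for vectors, here is a minimal sketch of scalar quantization: each float is mapped to one of 256 integer codes, so it fits in a single byte. This is a generic illustration under stated assumptions, not Qdrant's implementation; the function names `quantize` and `dequantize` and the per-vector min/max scaling are choices made for this example.

```python
def quantize(vector: list[float]) -> tuple[list[int], float, float]:
    """Map each float to an integer code in 0..255 (one byte per value).

    Returns the codes plus the offset and scale needed to approximately
    reconstruct the original values. Illustrative sketch only.
    """
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale for constant vectors
    codes = [min(255, max(0, round((x - lo) / scale))) for x in vector]
    return codes, lo, scale


def dequantize(codes: list[int], lo: float, scale: float) -> list[float]:
    """Approximate the original floats from their integer codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    vec = [-1.0, 0.0, 0.5, 1.0]
    codes, lo, scale = quantize(vec)
    print(codes)
    print(dequantize(codes, lo, scale))
```

The reconstruction is lossy: each value can be off by roughly half a quantization step, which is the trade made for the smaller memory footprint.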
- Ask HN: Who is hiring? (August 2023)
-
Pros and cons of vector search in elastic?
It depends on your requirements. For a simple hybrid-search solution, Elastic and similar tools should be enough. As your data grows, and if you work with more than just text embeddings, you should try a dedicated solution like Qdrant. https://github.com/qdrant/qdrant
-
Serverless Semantic Search, Free tier only
Made with Qdrant Vector Database https://github.com/qdrant/qdrant
- Show HN: Danswer – open-source question answering across all your docs
-
I've changed my mind about Code Interpreter
As an open-source and self-hosted solution, developers can deploy their own version of the plugin and register it with ChatGPT. The plugin leverages OpenAI embeddings and allows developers to choose a vector database (Milvus, Pinecone, Qdrant, Redis, Weaviate or Zilliz) for indexing and searching documents. Information sources can be synchronized with the database using webhooks.
-
Dear VC firm, pay me to NOT work or I will steal your lunch
Qdrant
-
Tantivy 0.20 is released: Schemaless column store, Schemaless aggregations, Phrase prefix queries, Percentiles, and more...
Another example is Bloop, it is a code search engine built on top of tantivy and qdrant.
- FLaNK Stack Weekly for 12 June 2023
What are some alternatives?
Milvus - A cloud-native vector database, storage for next generation AI applications
Weaviate - Weaviate is an open source vector database that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.
pgvector - Open-source vector similarity search for Postgres
faiss - A library for efficient similarity search and clustering of dense vectors.
towhee - Towhee is a framework that is dedicated to making neural data processing pipelines simple and fast.
vespa - The open big data serving engine. https://vespa.ai
hnswlib - Header-only C++/python library for fast approximate nearest neighbors
Elasticsearch - Free and Open, Distributed, RESTful Search Engine
gpt_index - LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLM's with external data. [Moved to: https://github.com/jerryjliu/llama_index]
Jina AI examples - Jina examples and demos to help you get started
awesome-vector-search - Collections of vector search related libraries, service and research papers
google-research - Google Research