instructor-embedding vs openai-cookbook

instructor-embedding

[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings (by xlang-ai)

openai-cookbook

Examples and guides for using the OpenAI API (by openai)

openai chatgpt gpt-3 gpt-4 Docs gpt-35-turbo

cookbook.openai.com

Project          instructor-embedding   openai-cookbook
Mentions         4                      215
Stars            1,703                  55,954
Growth           3.1%                   1.0%
Activity         5.9                    9.5
Latest Commit    10 days ago            6 days ago
Language         Python                 MDX
License          Apache License 2.0     MIT License

Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity - a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is among the top 10% of the most actively developed projects that we are tracking.

instructor-embedding

Posts with mentions or reviews of instructor-embedding. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-07-10.

My experience on starting with fine tuning LLMs with custom data
7 projects | /r/LocalLLaMA | 10 Jul 2023

If you like embeddings and vector DBs, you should look into this: https://github.com/HKUNLP/instructor-embedding
Build Personal ChatGPT Using Your Data
14 projects | news.ycombinator.com | 8 Jul 2023

If you look at an embeddings leaderboard [1], one of the top competitors, called InstructorXL [2], is just a pip install away. It's neck and neck with Ada v2 except for a shorter input length and half the dimensions, with the added benefit that you'll always have the model available.
Most of the other options just work with the transformers library.
[1] https://huggingface.co/spaces/mteb/leaderboard
[2] https://github.com/HKUNLP/instructor-embedding
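
As a rough illustration of the "pip install away" point, the sketch below assumes the `InstructorEmbedding` package from the linked repository and its `INSTRUCTOR(...).encode(...)` interface, which takes a list of [instruction, text] pairs. The helper names, the default model name `hkunlp/instructor-xl`, and the example instruction are assumptions chosen for illustration. The import is deferred so the pairing helper can be run without installing the package or downloading a model.

```python
def build_instructor_inputs(instruction, texts):
    """Pair one task instruction with every text.

    Instructor-style models embed [instruction, text] pairs, so the same
    text can yield different vectors for different tasks.
    """
    return [[instruction, text] for text in texts]


def embed_with_instructor(instruction, texts, model_name="hkunlp/instructor-xl"):
    """Embed texts locally (hypothetical helper, not part of the library).

    Assumes `pip install InstructorEmbedding`; the model weights are
    downloaded on first use.
    """
    from InstructorEmbedding import INSTRUCTOR  # deferred: heavy dependency

    model = INSTRUCTOR(model_name)
    return model.encode(build_instructor_inputs(instruction, texts))


# Example instruction (an assumption; adapt the wording to your own task).
EXAMPLE_INSTRUCTION = "Represent the document for retrieval:"
EXAMPLE_PAIRS = build_instructor_inputs(
    EXAMPLE_INSTRUCTION, ["first note", "second note"]
)
```

Calling `embed_with_instructor(EXAMPLE_INSTRUCTION, ["first note", "second note"])` would return one vector per text, assuming the package and model are available locally.
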
I've made a customisable SMS personal assistant which has infinite and persistent semantic memory.
2 projects | /r/LocalLLaMA | 27 May 2023

Use instructor-embedding to make it 100% local, and maybe even for quick relationship lookup (embed relationship info with a sentiment-analysis instruction)
Whisper Transcription Formatting
1 project | /r/artificial | 3 Feb 2023

First, I believe having SRT subtitles as the Whisper result would be better. Essentially, you don't need just a list of words like YouTube does. You need something more structured. I don't remember what Whisper outputs, so I might be wrong. There is whisperx for that, as an example. And then maybe use gpt index over it, or something like the instructor model. That can work.

openai-cookbook

Posts with mentions or reviews of openai-cookbook. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-04-19.

Question-Answer System Architectures using LLMs
1 project | dev.to | 29 Apr 2024

A pretrained LLM is a closed-book system: it can only access information that it was trained on. With domain fine-tuning, the system manifests additional material. An early prototype of this technique was shown in this OpenAI cookbook: for the target domain, text was embedded using an API, and then, when using the LLM, embeddings were retrieved using semantic similarity search to formulate an answer. Although this approach evolved into retrieval-augmented generation, it's still a technique to adapt a Gen2 (2020) or Gen3 (2022) LLM into a question-answering system.
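
The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not the cookbook's code: the three-dimensional vectors are hand-written stand-ins for real embeddings, and the names `CORPUS`, `top_k`, and `build_prompt` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k(query_vec, corpus, k=2):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def build_prompt(question, passages):
    """Place the retrieved passages ahead of the question for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy corpus: in practice each vector would come from an embedding model.
CORPUS = [
    ("Refunds are issued within 14 days.", [0.9, 0.1, 0.0]),
    ("Our office is in Berlin.", [0.1, 0.9, 0.1]),
    ("Shipping takes 3 to 5 days.", [0.2, 0.1, 0.9]),
]

QUERY_VEC = [0.8, 0.2, 0.1]  # stand-in embedding of "How long do refunds take?"
HITS = top_k(QUERY_VEC, CORPUS, k=2)
PROMPT = build_prompt("How long do refunds take?", [text for text, _ in HITS])
```

The resulting `PROMPT` string is what would be sent to the LLM, so the model answers from retrieved text rather than from its training data alone.
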
Ask HN: High quality Python scripts or small libraries to learn from
12 projects | news.ycombinator.com | 19 Apr 2024

https://github.com/openai/openai-cookbook/blob/main/examples...
Collection of notebooks showcasing some fun and effective ways of using Claude
4 projects | news.ycombinator.com | 17 Apr 2024
OpenAI Cookbook: Techniques to improve reliability
1 project | news.ycombinator.com | 28 Feb 2024
OpenAI Cookbooks
1 project | news.ycombinator.com | 27 Dec 2023
How to fine tune vit/convnet to focus on the layout of the input room image and ignore other things?
1 project | /r/computervision | 9 Dec 2023

It sounds like you are trying to tweak embeddings for similarity search. Rather than fine-tune the model's layers, you may want to try training a linear transformation on the existing model's output embedding. OpenAI has a cookbook on how to do that. You will need some data though - but I think you can try it with ~20 pieces of synthetically generated data.
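
To make the idea of a learned linear transformation concrete, here is a minimal pure-Python sketch, assuming labeled pairs of embeddings with a target similarity. It is not the cookbook's implementation: it uses a plain dot product instead of cosine similarity to keep the gradient simple, and the toy two-dimensional data, where dimension 0 is noise and dimension 1 carries the signal, is invented for illustration.

```python
def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def similarity(W, a, b):
    """Dot-product similarity of the two embeddings after transforming by W."""
    return dot(matvec(W, a), matvec(W, b))


def mean_squared_error(W, pairs):
    return sum((similarity(W, a, b) - y) ** 2 for a, b, y in pairs) / len(pairs)


def train_linear_map(pairs, dim, lr=0.05, steps=200):
    """Full-batch gradient descent on W, starting from the identity matrix."""
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    for _ in range(steps):
        grad = [[0.0] * dim for _ in range(dim)]
        for a, b, y in pairs:
            Wa, Wb = matvec(W, a), matvec(W, b)
            err = dot(Wa, Wb) - y
            # d/dW_ij of (Wa).(Wb) is (Wb)_i * a_j + (Wa)_i * b_j
            for i in range(dim):
                for j in range(dim):
                    grad[i][j] += 2 * err * (Wb[i] * a[j] + Wa[i] * b[j]) / len(pairs)
        W = [[W[i][j] - lr * grad[i][j] for j in range(dim)] for i in range(dim)]
    return W


# Toy pairs (embedding_a, embedding_b, target similarity).
PAIRS = [
    ([1.0, 1.0], [-1.0, 1.0], 1.0),   # same signal, different noise: similar
    ([1.0, 1.0], [1.0, -1.0], -1.0),  # same noise, opposite signal: dissimilar
]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
LOSS_BEFORE = mean_squared_error(IDENTITY, PAIRS)
W_TRAINED = train_linear_map(PAIRS, dim=2)
LOSS_AFTER = mean_squared_error(W_TRAINED, PAIRS)
```

On this toy data the trained matrix shrinks the noise dimension and stretches the signal dimension, which is the effect one hopes for when only a small labeled set is available and the base model stays frozen.
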
Best base model 1B or 7B for full finetuning
1 project | /r/LocalLLaMA | 6 Dec 2023

Tutorial from OpenAI: https://github.com/openai/openai-cookbook/blob/main/examples/Question_answering_using_embeddings.ipynb
Resources to learn ChatGPT and the OpenAI API
6 projects | /r/OpenAI | 3 Dec 2023

OpenAI Cookbook
OpenAI Cookbook
1 project | news.ycombinator.com | 17 Nov 2023
Another Major Outage Across ChatGPT and API
4 projects | news.ycombinator.com | 8 Nov 2023

OpenAI community repo with lots of examples: https://github.com/openai/openai-cookbook

What are some alternatives?

When comparing instructor-embedding and openai-cookbook you can also consider the following projects:

h2ogpt - Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://codellama.h2o.ai/

langchain - ⚡ Building applications with LLMs through composability ⚡ [Moved to: https://github.com/langchain-ai/langchain]

Nuggt - An Autonomous LLM Agent that runs on Wizcoder-15B

gpt4-pdf-chatbot-langchain - GPT4 & LangChain Chatbot for large PDF docs

vlite - fast vector database made in numpy

chatgpt-retrieval-plugin - The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.

easydiffusion - Easiest 1-click way to create beautiful artwork on your PC using AI, with no tech knowledge. Provides a browser UI for generating images from text prompts and images. Just enter your text prompt, and see the generated image.

askai - Command Line Interface for OpenAi ChatGPT

lit-gpt - Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed. [Moved to: https://github.com/Lightning-AI/litgpt]

gpt_index - LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLM's with external data. [Moved to: https://github.com/jerryjliu/llama_index]

haystack - LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.

txtai - 💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
