Jupyter Notebook Embeddings

Open-source Jupyter Notebook projects categorized as Embeddings

Top 21 Jupyter Notebook Embedding Projects

  1. awesome-generative-ai

    A curated list of Generative AI tools, works, models, and references (by filipecalegario)

    Project mention: Top Courses and GitHub Repositories to Learn GenerativeAI Free | dev.to | 2024-08-17

    ✅Filipecalegario

  3. featureform

    The Virtual Feature Store. Turn your existing data infrastructure into a feature store.

    Project mention: 10 Open Source MLOps Projects You Didn’t Know About | dev.to | 2024-08-01

    Featureform: The success of a machine learning model relies on the quality of data and, hence, the features fed to the model. However, in large organizations, members of one team may not be aware of good features developed by other teams in the organization. A feature store helps eliminate this problem by providing a central repository of features that are accessible to all the teams and individuals within an organization.

  4. generative-ai-docs

    Documentation for Google's Gen AI site - including the Gemini API and Gemma

    Project mention: Generate audio clips with Gemini 2.0 Flash | dev.to | 2024-12-16

    Google AI

  5. what_are_embeddings

    A deep dive into embeddings starting from fundamentals

    Project mention: What Are Embeddings? | news.ycombinator.com | 2024-11-28
  6. superlinked

    Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.

    Project mention: Show HN: Superlinked – Vector Embeddings for Structured and Unstructured Data | news.ycombinator.com | 2024-12-02
  7. amazon-bedrock-samples

    This repository contains examples for customers to get started using the Amazon Bedrock Service. This contains examples for all available foundational models

    Project mention: A Journey of GenAI with AWS Bedrock based sample Images | dev.to | 2024-12-30

    AWS Samples contains pre-built examples to help customers get started with the Amazon Bedrock service.

  8. vectordb-recipes

    High quality resources & applications for LLMs, multi-modal models and VectorDBs

  9. Fast_Sentence_Embeddings

    Compute Sentence Embeddings Fast!

    Project mention: The Illustrated Word2Vec | news.ycombinator.com | 2024-04-19

    This is a great guide.

    Also - despite the fact that language model embeddings [1] are currently all the rage, good old embedding models are more than good enough for most tasks.

    With just a bit of tuning, they're generally as good at many sentence embedding tasks [2], and with good libraries [3] you're getting something like 400k sentences/sec on a laptop CPU versus ~4k-15k sentences/sec on a V100 for LM embeddings.

    When you should use language model embeddings:

    - Multilingual tasks. While some embedding models are multilingually aligned (e.g. MUSE [4]), you still need to route the sentence to the correct embedding model file (you need something like langdetect). It's also cumbersome, with one 400 MB file per language.

    For LM embedding models, many are multilingual aligned right away.

    - Tasks that are very context specific or require fine-tuning. For instance, if you're making a RAG system for medical documents, the embedding space is best when it creates larger deviations for the difference between seemingly-related medical words.

    This means models with more embedding dimensions, and heavily favors LM models over classic embedding models.

    1. sbert.net

    2. https://collaborate.princeton.edu/en/publications/a-simple-b...

    3. https://github.com/oborchers/Fast_Sentence_Embeddings

    4. https://github.com/facebookresearch/MUSE

  10. cleora

    Cleora AI is a general-purpose model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data.

  11. examples

    Analyze the unstructured data with Towhee, such as reverse image search, reverse video search, audio classification, question and answer systems, molecular search, etc. (by towhee-io)

    Project mention: BMF: Frame extraction acceleration- video similarity search with Pinecone | dev.to | 2024-05-10

    ! curl -L https://github.com/towhee-io/examples/releases/download/data/reverse_video_search.zip -O
    ! unzip -q -o reverse_video_search.zip

  12. kgtk

    Knowledge Graph Toolkit

  13. beyondllm

    Build, evaluate and observe LLM apps

    Project mention: FLaNK AI Weekly for 29 April 2024 | dev.to | 2024-04-29
  14. Research2Vec

    Representing research papers as vectors / latent representations.

  15. entity-embed

    PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

  16. embedding-encoder

    Scikit-Learn compatible transformer that turns categorical variables into dense entity embeddings.

  17. langchain-embeddings

    This repository demonstrates the construction of a state-of-the-art multimodal search engine, leveraging Amazon Titan Embeddings, Amazon Bedrock, and LangChain.

    Project mention: Deploying a Serverless Embeddings Application with AWS CDK, Lambda, and Amazon Aurora PostgreSQL | dev.to | 2024-09-18

    To generate embeddings for image/pdf with pgvector and Amazon Aurora.

  18. battle-of-the-semantics

    GraphRag vs Embeddings

    Project mention: Battle of the Semantics – GraphRag vs. Embeddings | news.ycombinator.com | 2024-07-10
  19. vector-search-azure-cosmos-db-postgresql

    This sample shows how to build vector similarity search on Azure Cosmos DB for PostgreSQL using the pgvector extension and the multi-modal embeddings APIs of Azure AI Vision.

    Project mention: Use HNSW index on Azure Cosmos DB for PostgreSQL for similarity search | dev.to | 2024-03-14

    In the Jupyter Notebook provided on my GitHub repository, you'll explore text-to-image and image-to-image search scenarios. You will use the same text prompts and reference images as in the Exact Nearest Neighbors search example, allowing for a comparison of the accuracy of the results.

  20. emotion-classifier

    An attention-based BiLSTM for emotion classification.

  21. ml

    Machine Learning, LLM and other Jupyter Notebooks and resources (by jankovicsandras)

  22. tax-retrieval-benchmark

    An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.

    Project mention: Integrating the French Taxation Embedding Benchmark Task (Beta) into the MTEB | news.ycombinator.com | 2024-05-26
NOTE: The open-source projects on this list are ordered by number of GitHub stars. The number of mentions indicates repo mentions in the last 12 months or since we started tracking (Dec 2020).
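
The Fast_Sentence_Embeddings mention in the list above argues that classic word-vector models are often good enough for sentence embedding tasks. The simplest form of that idea, averaging the word vectors of a sentence, can be sketched as follows. This is a minimal illustration, not code from that library: the toy 3-dimensional vocabulary `WORD_VECTORS` and the function names `average_embedding` and `cosine_similarity` are assumptions made for the example.

```python
import math

# Hypothetical toy word vectors. Real pretrained models use hundreds of
# dimensions and far larger vocabularies.
WORD_VECTORS = {
    "fast": [1.0, 0.0, 0.0],
    "sentence": [0.0, 1.0, 0.0],
    "embeddings": [0.0, 0.0, 1.0],
}


def average_embedding(sentence, vectors=WORD_VECTORS):
    """Return the element-wise mean of the known word vectors in `sentence`.

    Words missing from the vocabulary are skipped; a sentence with no
    known words maps to the zero vector.
    """
    dim = len(next(iter(vectors.values())))
    known = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension across all words.
    return [sum(column) / len(known) for column in zip(*known)]


def cosine_similarity(a, b):
    """Cosine similarity of two vectors, defined as 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Because the average ignores word order, `average_embedding("fast sentence")` and `average_embedding("sentence fast")` are identical. That is part of why this approach is cheap, and also why context-sensitive tasks tend to favor language model embeddings, as the mention notes.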

Jupyter Notebook Embeddings related posts

  • A Journey of GenAI with AWS Bedrock based sample Images

    1 project | dev.to | 30 Dec 2024
  • Road to becoming a GDE | The Google Developers Program

    1 project | dev.to | 9 Dec 2024
  • What Are Embeddings?

    1 project | news.ycombinator.com | 28 Nov 2024
  • Meet Stache Forcache, a Movember-themed AI created using Amazon PartyRock

    1 project | dev.to | 26 Nov 2024
  • AWS is killing customer AI apps without warning

    1 project | dev.to | 7 Nov 2024
  • Building an AI-Powered iOS Chat App with Amazon Bedrock and Swift

    3 projects | dev.to | 30 Oct 2024
  • Getting Started with Large Language Models: How Amazon Bedrock Can Enhance Your AI Journey

    1 project | dev.to | 6 Oct 2024

Index

What are some of the best open-source Embedding projects in Jupyter Notebook? This list will help you:

 #  Project                                   Stars
 1  awesome-generative-ai                     2,652
 2  featureform                               1,826
 3  generative-ai-docs                        1,806
 4  what_are_embeddings                         988
 5  superlinked                                 846
 6  amazon-bedrock-samples                      694
 7  vectordb-recipes                            667
 8  Fast_Sentence_Embeddings                    618
 9  cleora                                      487
10  examples                                    471
11  kgtk                                        368
12  beyondllm                                   270
13  Research2Vec                                198
14  entity-embed                                147
15  embedding-encoder                            41
16  langchain-embeddings                         22
17  battle-of-the-semantics                      13
18  vector-search-azure-cosmos-db-postgresql     10
19  emotion-classifier                            6
20  ml                                            2
21  tax-retrieval-benchmark                       1

