Exploring Methods to Improve Text Chunking in RAG Models (and other things...)

vectorboard

Mentions: 2 | Stars: 43 | Activity: 5.5 | Language: Python

Open-source embeddings optimisation and evaluation framework for RAG/LLM applications. Documentation at https://docs.vectorboard.ai/introduction

Hi, about chunking: if the text is structured (Markdown or HTML), you can take headings and paragraphs as the chunking units. The result also depends on the embeddings you apply, so those can be evaluated separately first, using standard chunking methods with different chunk lengths, for example with this tool: https://github.com/VectorBoard/vectorboard.
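As a rough illustration of the two approaches mentioned in the comment, here is a minimal sketch in plain Python. It is not vectorboard's API: the function names `chunk_markdown` and `chunk_fixed` are hypothetical, and the heading detection assumes simple `#`-style Markdown headings only. The first function treats each paragraph as a chunk and tags it with the nearest preceding heading; the second is a basic fixed-length word chunker that could be run with several chunk lengths for comparison.

```python
import re

# Assumption: headings use the "#"-prefixed Markdown style (one to six "#").
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text):
    """Return one chunk per paragraph, tagged with the nearest preceding heading."""
    chunks = []
    heading = ""
    paragraph = []

    def flush():
        # Emit the paragraph collected so far, if any.
        if paragraph:
            chunks.append({"heading": heading, "text": " ".join(paragraph)})
            paragraph.clear()

    for line in text.splitlines():
        stripped = line.strip()
        match = HEADING_RE.match(stripped)
        if match:
            flush()  # close the paragraph under the previous heading
            heading = match.group(2).strip()
        elif not stripped:
            flush()  # a blank line ends the current paragraph
        else:
            paragraph.append(stripped)
    flush()
    return chunks


def chunk_fixed(text, size):
    """Split text into consecutive chunks of at most `size` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


if __name__ == "__main__":
    sample = "# Intro\n\nFirst paragraph.\n\n## Details\n\nSecond paragraph."
    for chunk in chunk_markdown(sample):
        print(chunk)
    # Try several chunk lengths, as the comment suggests.
    for size in (2, 4):
        print(size, chunk_fixed("First paragraph. Second paragraph.", size))
```

Each set of chunks could then be embedded and scored separately, so that the effect of the chunking strategy is not confused with the effect of the embedding model.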

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

AI Grant Traction in OSS Startups
5 projects | dev.to | 1 Feb 2024

Vector Databases: A Technical Primer [pdf]
7 projects | news.ycombinator.com | 12 Jan 2024

Chroma – the open-source embedding database
1 project | news.ycombinator.com | 11 Jan 2024

Show HN: Embeddings Solution for Personal Journal
2 projects | news.ycombinator.com | 1 Nov 2023

Chroma DB Random Seg Faults
1 project | news.ycombinator.com | 5 Sep 2023

This page summarizes the projects mentioned and recommended in the original post on /r/GPT3
embedding-evaluation Embeddings eval hyperparameter-optimization llms
Post date: 22 Oct 2023
