code-indexer-loop
Code Indexer Loop is a Python library for indexing and retrieving source code files through an integrated vector database that's continuously and efficiently updated.
Sweep is credited in multiple places: (a) https://github.com/definitive-io/code-indexer-loop#attributi... (b) https://github.com/definitive-io/code-indexer-loop/blob/fd9d...
The difference is that this is packaged as a consumable PyPI package that can easily be used in a project. Sweep's own write-up even calls for splitting this out into a standalone project, noting that they lack the time to do so: https://docs.sweep.dev/blogs/chunking-2m-files#future-
In addition, we expand and fix the implementation: for example, it now supports limiting chunks by token count instead of character count, and we fix some whitespace inconsistencies in parsing and chunk reconstruction.
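To illustrate the idea of token-based limits with lossless reconstruction, here is a minimal sketch. It is not the library's actual implementation: `count_tokens` and `chunk_by_tokens` are hypothetical names, and the whitespace-splitting tokenizer is a stand-in assumption for a real model tokenizer. The sketch groups whole lines into chunks whose token count stays within a budget, and keeps every newline and indent so that joining the chunks gives back the original source exactly.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): whitespace-separated words.
    # A real setup would plug in the embedding model's tokenizer here.
    return len(text.split())


def chunk_by_tokens(
    source: str,
    max_tokens: int,
    counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Group whole lines into chunks of at most max_tokens tokens.

    Lines keep their original line endings and indentation, so
    "".join(chunks) == source. A single line that alone exceeds
    max_tokens is emitted as its own oversized chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for line in source.splitlines(keepends=True):
        n = counter(line)
        # Close the current chunk if adding this line would exceed the budget.
        if current and current_tokens + n > max_tokens:
            chunks.append("".join(current))
            current, current_tokens = [], 0
        current.append(line)
        current_tokens += n
    if current:
        chunks.append("".join(current))
    return chunks
```

Limiting by characters would split the same code at different points depending on identifier length and indentation, while a token budget tracks what an embedding model actually consumes. The real library chunks along syntax-tree boundaries per the Sweep write-up; this sketch uses line boundaries only to keep the example short.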
Queries on https://github.com/pypa/flit/tree/main/flit_core/flit_core (omitted tests/)
(Pdb) print(indexer.query("def normalize_dist_name(name: str, version: str) -> str:"))