telekinesis
DBoW2
Metric | telekinesis | DBoW2
---|---|---
Mentions | 12 | 2
Stars | 16 | 824
Growth | - | -
Activity | 5.6 | 0.0
Last commit | 28 days ago | over 2 years ago
Language | Python | C++
License | MIT License | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
telekinesis
-
Show HN: Sort and Filter Ask HN Who's Hiring by LLM-Embedding Proximity
https://payperrun.com/%3E/search?displayParams={%22q%22:%22S...
(There are quite a few, you might want to filter by date!)
-
Ask HN: Who is hiring? (November 2023)
Hey everyone, I just made this thread easier to search through here:
https://payperrun.com/%3E/search?displayParams={%22q%22:%22D...
It uses LLM embeddings to sort posts by semantic proximity, but you can also filter out posts with comma-separated values like this:
-
Ask HN: What do you regret doing or not doing in your 30s?
https://news.ycombinator.com/item?id=33118584
[Shameless plug: I found all these on the LLM-embedding-based search engine I launched today: https://payperrun.com/%3E/search?displayParams={%22q%22:%22A...
It's much better than HN's default search: https://hn.algolia.com/?q=Ask+HN%3A+What+do+you+regret+doing... ]
-
My thoughts on starting an online business as someone who's never done it before
https://payperrun.com/%3E/search?displayParams={%22q%22:%22A...
-
We should promote more personal indexing, rather than algorithmic indexing
There have been a few attempts at a crowdsourced-rank search engine (which is similar to what you're suggesting: people indexing the content), but it seems to be a tough nut to crack. Most of the examples of similar ideas I could find on Product Hunt or Show HN seem dead:
https://payperrun.com/%3E/search?displayParams={%22q%22:%22c...
(btw, I just launched this LLM-embedding-based search service, which lets you check whether a startup idea has already been tried or has failed.)
I don't know if this idea has a higher death rate than the baseline, but my guess is Google/PageRank is good enough for most use-cases, and then if you want quality sources, you can just follow them on YouTube, Twitter, Instagram, etc. Wait, maybe I shouldn't try to compete with Google?
-
Show HN: An Embedding-Based Search Service over ShowHN, AskHN, GitHub, More
I like the section on how it works: https://payperrun.com/%3E/search?display=How%20this%20servic...
The vector search uses https://lancedb.com/ and OpenAI embeddings.
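To make "sorting by embedding proximity" concrete, here is a minimal sketch in plain Python. It is not the service's code: the comment above says the real system uses LanceDB and OpenAI embeddings, whereas this sketch swaps in hand-made 3-dimensional vectors and a brute-force cosine-similarity scan. The names `cosine_similarity`, `rank_by_proximity`, `POSTS`, and `QUERY` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_proximity(query_vec, posts):
    """Return post titles ordered from most to least similar to the query vector.

    `posts` is a list of (title, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in posts]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


# Assumption: toy vectors standing in for real embeddings of each post.
POSTS = [
    ("Remote Python backend role", [0.9, 0.1, 0.0]),
    ("On-site C++ robotics role", [0.1, 0.9, 0.2]),
    ("Python data engineer", [0.8, 0.2, 0.1]),
]
QUERY = [1.0, 0.0, 0.0]  # stands in for the embedding of the search text

RANKED = rank_by_proximity(QUERY, POSTS)
print(RANKED)
```

A brute-force scan like this only suits small collections; a vector database is presumably used in the real service to avoid comparing the query against every stored vector.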
-
Embeddings: What they are and why they matter
Behaves as I expected now!
I went here looking for more info about payperrun https://payperrun.com/%3E/welcome, clicked on the "Spotlight" section, and saw 4 popups blocked. I never see popups anywhere these days, and I have to admit that sends me away pretty quickly.
- Show HN: Payperrun.com – A New Way to Monetize Your Code
- telekinesis: Just-in-time SDKs
- Show HN: Just-in-Time SDKs
DBoW2
-
Embeddings: What they are and why they matter
Not quite the same application, but in computer vision and visual SLAM algorithms (which construct a map of your surroundings using a camera), embeddings have become the de facto approach to place recognition! And it's very similar to this article. It is called "bag-of-words place recognition" and it has really become the standard, used by nearly every open-source SLAM library nowadays.
The core idea is that each image is passed through a feature extractor and descriptor pipeline and is 'embedded' in a vector containing the top N features. While the camera moves, a database of images (called keyframes) is built, with the images stored as much lower-dimensional vectors. As the camera keeps moving, every image is used to query the database, and something like cosine similarity is used to retrieve the best match from the vector database. If a match is found, a stereo constraint can be computed between the query image and the match, and the software is able to update the map.
[1] is the original paper and here's the most famous implementation: https://github.com/dorian3d/DBoW2
[1]: https://www.google.com/search?client=firefox-b-d&q=Bags+of+B...
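The keyframe-database loop described in this comment can be sketched roughly as follows. This is a simplified Python illustration, not DBoW2's API (DBoW2 is a C++ library). It assumes each image's features have already been quantized into integer "visual word" IDs, and the class name `KeyframeDatabase`, the function `bow_vector`, and the `min_score` threshold are all hypothetical.

```python
import math
from collections import Counter


def bow_vector(word_ids, vocab_size):
    """Turn a list of visual-word IDs into an L2-normalized histogram."""
    counts = Counter(word_ids)
    vec = [float(counts.get(i, 0)) for i in range(vocab_size)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


class KeyframeDatabase:
    """Stores one bag-of-words vector per keyframe and answers similarity queries."""

    def __init__(self, vocab_size):
        self.vocab_size = vocab_size
        self.entries = []  # list of (keyframe_id, normalized vector)

    def add(self, keyframe_id, word_ids):
        self.entries.append((keyframe_id, bow_vector(word_ids, self.vocab_size)))

    def query(self, word_ids, min_score=0.8):
        """Return (best_keyframe_id, score), or (None, score) if below min_score."""
        q = bow_vector(word_ids, self.vocab_size)
        best_id, best_score = None, 0.0
        for kf_id, vec in self.entries:
            # Vectors are unit length, so the dot product is the cosine similarity.
            score = sum(a * b for a, b in zip(q, vec))
            if score > best_score:
                best_id, best_score = kf_id, score
        if best_score >= min_score:
            return best_id, best_score
        return None, best_score


# Assumption: a toy vocabulary of 6 visual words and two stored keyframes.
db = KeyframeDatabase(vocab_size=6)
db.add("kf0", [0, 0, 1, 2])
db.add("kf1", [3, 4, 4, 5])

match, score = db.query([0, 1, 2, 2])        # shares most words with kf0
no_match, low_score = db.query([3, 0, 1, 5])  # weak overlap with both keyframes
print(match, round(score, 3), no_match, round(low_score, 3))
```

In a real SLAM system, an accepted match would then feed the geometric check mentioned above (the stereo constraint between the query image and the match); that step is outside this sketch.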
-
[D] Fastest SIFT Descriptors Matching with Database of SIFT Descriptors
This library is the most widely used bag-of-words implementation, which is the standard for feature retrieval. There might be more advanced methods, but you'd have to implement them yourself. It can also be used with non-SIFT descriptors. https://github.com/dorian3d/DBoW2
What are some alternatives?
chasr-server - End-To-End Encrypted GPS Tracking Service
faiss - A library for efficient similarity search and clustering of dense vectors.
terra.py - Python SDK for Terra
supabase - The open source Firebase alternative.
pyxet - Python SDK for XetHub
vectordb - A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search.
bert - TensorFlow code and pre-trained models for BERT
roadmap - This is the public roadmap for Salesforce Heroku services.
marqo - Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
llm-cluster - LLM plugin for clustering embeddings