Metric | PyAV | fast_vector_similarity
---|---|---
Mentions | 3 | 7
Stars | 2,279 | 324
Growth | 2.2% | -
Activity | 9.2 | 7.2
Last commit | 8 days ago | 9 months ago
Language | Cython | Rust
License | BSD 3-clause "New" or "Revised" License | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
PyAV
- Some Reasons to Avoid Cython
- need advice on what frameworks to use w ffmpeg
Personally I would recommend using the PyAV bindings directly if you want any sort of control over different codecs, profiles, pixel formats, etc.
- [P] DeFFcode: A High-performance FFmpeg based Video-Decoder Python Library for fast and low-overhead decoding of a wide range of video streams into 3D NumPy frames.
There are a good number of FFmpeg wrappers out there; decord, PyAV, and MoviePy are probably the most popular. I'm sure all of these are fine, but they seem best suited for something like a web backend for a startup that's getting off the ground, or anything else where latency isn't a huge issue.
fast_vector_similarity
- SentenceTransformers: Python framework for sentence, text and image embeddings
Yes, check out my library for vector similarity that has various other measures which are more discriminative:
https://github.com/Dicklesworthstone/fast_vector_similarity
pip install fast_vector_similarity
- Show HN: Neum AI – Open-source large-scale RAG framework
Got it. I'd encourage you to expose more of that functionality at the level of your application if possible. I think there is a lot of potential in using more than just cosine similarity, especially when there are lots of candidates and you really want to sharpen up the top few recommendations to the best ones. You might find this open-source library I made recently useful for that:
https://github.com/Dicklesworthstone/fast_vector_similarity
I've had good results from starting with cosine similarity (using FAISS) and then "enriching" the top results from that with more sophisticated measures of similarity from my library to get the final ranking.
- Some Reasons to Avoid Cython
You can see how I did something similar in my library here:
https://github.com/Dicklesworthstone/fast_vector_similarity/...
Basically you use ndarray instead of numpy, try to vectorize anything you can, and for the for loops that can’t be vectorized, you can use rayon to do them in parallel.
- FLaNK Stack Weekly 28 August 2023
- Fast Vector Similarity Library, Useful for Working With Llama2 Embedding Vectors
- Show HN: Fast Vector Similarity Using Rust and Python
Yeah, like the other commenter said, everything is in this file here:
https://github.com/Dicklesworthstone/fast_vector_similarity/...
If you also make your project using Rust and Maturin, you can literally just copy and paste that into your project because it's totally generic, and if the repo is public, GitHub will just run it all for you for free.
The only thing is you need to create an account on PyPI (pip) and add two-factor authentication so you can generate an API token. Then you go into the repo settings, go to secrets, and create a GitHub Actions secret with the name PYPI_API_TOKEN and make the value your PyPI token. That's it! It will not only compile all the wheels for you but even upload the project to PyPI for you using the settings found in your pyproject.toml file, like this:
https://github.com/Dicklesworthstone/fast_vector_similarity/...
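The two-stage ranking described in the Neum AI comment above (cosine similarity first, then a more discriminative measure on the top results) can be sketched roughly as follows. This is a minimal pure-Python illustration and not the fast_vector_similarity API: Spearman's rank correlation stands in for the library's measures, a brute-force scan stands in for FAISS, and the names `two_stage_rank` and `shortlist_size` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant vector has no defined rank correlation
    return cov / math.sqrt(var_a * var_b)


def two_stage_rank(query, candidates, shortlist_size=3):
    """Shortlist by cosine similarity, then reorder the shortlist by Spearman's rho."""
    by_cosine = sorted(candidates,
                       key=lambda name: cosine(query, candidates[name]),
                       reverse=True)
    shortlist = by_cosine[:shortlist_size]
    return sorted(shortlist,
                  key=lambda name: spearman(query, candidates[name]),
                  reverse=True)
```

For instance, with the query `[1, 2, 3, 4]`, a candidate `[1, 2, 4, 3]` has the higher cosine similarity, but `[1, 1.5, 2, 10]` preserves the query's ordering exactly, so the second stage moves it to the top of the shortlist.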
What are some alternatives?
decord - An efficient video loader for deep learning with smart shuffling that's super easy to digest
simsimd
moviepy - Video editing with Python
swiss_army_llama - A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for various file types through textract.
deffcode - A cross-platform High-performance FFmpeg based Real-time Video Frames Decoder in Pure Python 🎞️⚡
np-sims - numpy ufuncs for vector similarity
uvloop - Ultra fast asyncio event loop.
QTVR - Tools for QTVR 1 files
ta-lib-python - Python wrapper for TA-Lib (http://ta-lib.org/).
llama_embeddings_fastap
tenforce - Type enforcement for Python
DoctorGPT - 💻📚💡 DoctorGPT provides advanced LLM prompting for PDFs and webpages.