murmurhash
py-spy
Our great sponsors
murmurhash | py-spy | |
---|---|---|
2 | 25 | |
42 | 11,850 | |
- | - | |
5.0 | 6.4 | |
6 months ago | 19 days ago | |
C++ | Rust | |
MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
murmurhash
-
Is anyone using PyPy for real work?
If you have very large dicts, you might find this hash table I wrote for spaCy helpful: https://github.com/explosion/preshed . You need to key the data with 64-bit keys. We use this wrapper around murmurhash for it: https://github.com/explosion/murmurhash
There's no docs so obviously this might not be for you. But the software does work, and is efficient. It's been executed many many millions of times now.
-
Data Ingestion - Build Your Own "Map Reduce"?
Some notes: We don't need Sha256 and not evey base64; nothing will happen if keys will not distribute very equally. we could take MMH3; googling "python murmurhash" gives 2 interesting results; and since both use the same cpp code, let's take the one with most stars Other options would be to simply do (% NUM_SHARDS) or even shift right (however must have shards count == power of 2).
py-spy
- Minha jornada de otimizaĆ§Ć£o de uma aplicaĆ§Ć£o django
- Graphical Python Profiler
-
Grasshopper ā An Open Source Python Library for Load Testing
For CPU cycles, py-spy[0] is getting more and more used. For RAM, I would like to known too...
[0] -- https://github.com/benfred/py-spy
-
Debugging a Mixed Python and C Language Stack
Theres also Py Spy, a profiling tool that can generate flame charts containing a mix of python and C (or C++) calls.
https://github.com/benfred/py-spy
It's worked really well for my needs
-
python to rust migration
You should profile your consumer to check the bottlenecks. You can use the excellent py-spy(written in Rust). IMO a few usage of Numba there and there should solve your performance issues.
-
Has anyone switched from numpy to Rust?
So as a first step you'll want to profile your program to figure out where it's slow, and hopefully that'll also tell you why it's slow. I'm the (biased) author of the Sciagraph profiler which is designed for this sort of application (https://sciagraph.com) but you can also try py-spy, which isn't as well designed for data processing/analysis applications (e.g. it won't visualize parallelism at all) but can still be informative (https://github.com/benfred/py-spy). Both are written in Rust ;)
-
Trace your Python process line by line with minimal overhead!
Any advantages/disadvantages compared to py-spy [1]?
[1]: https://github.com/benfred/py-spy
-
Python 3.11 delivers.
Python profiling is enabled primarily through cprofile, and can be visualized with help of tools like snakeviz (output flame graph can look like this). There are also memory profilers like memray which does in-depth traces, or sampling profilers like py-spy.
-
Tales of serving ML models with low-latency
A good profiler would be https://github.com/benfred/py-spy . If you run your app/benchmark with it, it should be able to draw a flamegraph telling you where the majority of time is spent. The info here is quite fine grained so it would already tell you where the bottleneck is. Without a full-fledged profiler you can also measure the timings in various parts of the code to understand where the bottleneck is.
-
Profiling a Python library written in Rust (Maturin)
Might be worth raising an issue on py-spy (a python profiler written in rust which "supports profiling native python extensions written in languages like C/C++ or Cython" to see if that can close the loop.
What are some alternatives?
mmh3 - Python extension for MurmurHash (MurmurHash3), a set of fast and robust hash functions.
pyflame
mrjob - Run MapReduce jobs on Hadoop or Amazon Web Services
pyinstrument - š“Ā Call stack profiler for Python. Shows you why your code is slow!
preshed - š„ Cython hash tables that assume keys are pre-hashed
python-uncompyle6 - A cross-version Python bytecode decompiler
python-mysql-replication - Pure Python Implementation of MySQL replication protocol build on top of PyMYSQL
memory_profiler - Monitor Memory usage of Python code
sparc-curation - code and files for SPARC curation workflows
icecream - š¦ Never use print() to debug again.
psycopg2cffi - Port to cffi with some speed improvements
line_profiler