murmurhash
preshed
Metric | murmurhash | preshed
---|---|---
Mentions | 2 | 1
Stars | 42 | 78
Growth | - | -
Activity | 5.0 | 4.1
Last commit | 6 months ago | 6 months ago
Language | C++ | Cython
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
murmurhash
- Is anyone using PyPy for real work?
If you have very large dicts, you might find this hash table I wrote for spaCy helpful: https://github.com/explosion/preshed . You need to key the data with 64-bit keys. We use this wrapper around murmurhash for it: https://github.com/explosion/murmurhash
There's no docs so obviously this might not be for you. But the software does work, and is efficient. It's been executed many many millions of times now.
- Data Ingestion - Build Your Own "Map Reduce"?
Some notes: We don't need SHA-256, and not even base64; nothing bad happens if the keys are not distributed perfectly evenly. We could take MMH3: googling "python murmurhash" gives 2 interesting results, and since both use the same C++ code, let's take the one with the most stars. Other options would be to simply take the value modulo the shard count (% NUM_SHARDS) or even shift right (which requires the shard count to be a power of 2).
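The two shard-selection options mentioned in the post above (modulo and right shift) might look roughly like the sketch below. It is only an illustration under assumptions: `NUM_SHARDS = 8`, `hash32`, `shard_by_modulo`, and `shard_by_shift` are hypothetical names that do not come from the post, and `zlib.crc32` is used as a stand-in when the third-party `mmh3` package is not installed.

```python
import zlib

try:
    import mmh3  # third-party MurmurHash3 binding; optional for this sketch

    def hash32(key: str) -> int:
        # Unsigned 32-bit MurmurHash3 value of the key.
        return mmh3.hash(key, signed=False)

except ImportError:

    def hash32(key: str) -> int:
        # Stand-in (assumption): CRC32 is not MurmurHash, but it also
        # yields an unsigned 32-bit value, which is all the sketch needs.
        return zlib.crc32(key.encode("utf-8"))


NUM_SHARDS = 8  # hypothetical shard count; a power of 2 so the shift variant works


def shard_by_modulo(key: str) -> int:
    # Works for any positive shard count.
    return hash32(key) % NUM_SHARDS


def shard_by_shift(key: str) -> int:
    # Keeps only the top log2(NUM_SHARDS) bits of the 32-bit hash,
    # so NUM_SHARDS must be a power of 2.
    bits = NUM_SHARDS.bit_length() - 1
    return hash32(key) >> (32 - bits)


if __name__ == "__main__":
    for key in ["user:1", "user:2", "order:42"]:
        print(key, shard_by_modulo(key), shard_by_shift(key))
```

Both functions map the same key to the same shard on every call, which is the property the ingestion pipeline relies on; the two variants will generally assign a given key to different shards, so one of them has to be chosen and used consistently.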
preshed
- Is anyone using PyPy for real work?
If you have very large dicts, you might find this hash table I wrote for spaCy helpful: https://github.com/explosion/preshed . You need to key the data with 64-bit keys. We use this wrapper around murmurhash for it: https://github.com/explosion/murmurhash
There's no docs so obviously this might not be for you. But the software does work, and is efficient. It's been executed many many millions of times now.
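To illustrate what "key the data with 64-bit keys" means in the quote above, here is a minimal sketch that uses only the Python standard library. It does not use the preshed or murmurhash APIs: `hashlib.blake2b` stands in for MurmurHash, a plain `dict` stands in for the hash table, and `key64` is a hypothetical helper name.

```python
import hashlib


def key64(text: str) -> int:
    # Derive an unsigned 64-bit integer key from a string.
    # Assumption: blake2b with an 8-byte digest replaces MurmurHash here.
    digest = hashlib.blake2b(text.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


# A plain dict stands in for a table that accepts only 64-bit integer keys.
table = {}
for word in ["spaCy", "PyPy", "Cython"]:
    table[key64(word)] = len(word)

# Look-ups hash the string the same way and then use the integer key.
print(table[key64("spaCy")])
```

The point of the pattern is that the table never stores the original strings, only fixed-width integer keys, which is what lets a specialised table stay compact for very large collections.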
What are some alternatives?
mmh3 - Python extension for MurmurHash (MurmurHash3), a set of fast and robust hash functions.
python-mysql-replication - Pure Python Implementation of MySQL replication protocol build on top of PyMYSQL
mrjob - Run MapReduce jobs on Hadoop or Amazon Web Services
pymssql - Official home for the pymssql source code.
sparc-curation - code and files for SPARC curation workflows
legion - The Legion Parallel Programming System
psycopg2cffi - Port to cffi with some speed improvements
MurMurHash - This little tool is to calculate a MurmurHash value of a favicon to hunt phishing websites on the Shodan platform.
python-mysql-replicati