apultra
PyFastPFor
apultra | PyFastPFor |
---|---|
1 | 2 |
96 | 56 |
- | - |
3.3 | 4.6 |
12 months ago | 6 months ago |
C | C++ |
GNU General Public License v3.0 or later | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
- Time-Series Compression Algorithms
One notable omission from this piece is a technique to compress integer time series with both positive and negative values.
If you naively apply bit-packing using the Simple8b algorithm, you'll find that negative integers are not compressed. This is due to how signed integers are represented in modern computers: negative integers will have their most significant bit set [1].
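To make the problem concrete, here is a minimal Python sketch. The helper name `bits_needed_unsigned` is made up for illustration and is not part of any library mentioned here; it assumes 64-bit two's-complement storage and shows how many bits a value occupies once it is reinterpreted as unsigned, which is what a bit-packer looks at.

```python
def bits_needed_unsigned(value: int, width: int = 64) -> int:
    """Bits needed for `value` once reinterpreted as an unsigned `width`-bit integer."""
    # Masking to `width` bits reproduces the two's-complement bit pattern.
    as_unsigned = value & ((1 << width) - 1)
    # Report at least one bit so that zero still occupies a slot.
    return max(1, as_unsigned.bit_length())


for v in (3, 100, -1, -3):
    print(v, bits_needed_unsigned(v))
```

Small positive values need only a handful of bits, while even -1 needs the full 64, so a packer that sizes its slots by the highest set bit has nothing to squeeze out.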
Zigzag encoding is a neat transform that circumvents this issue. It works by mapping signed integers to unsigned integers so that numbers with a small absolute value can be encoded using a small number of bits. Put another way, it encodes negative numbers using the least significant bit for sign. [2]
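The mapping can be sketched in a few lines of Python. This is an illustrative version of the usual shift-and-xor formulation, assuming 64-bit signed inputs; the function names are my own and are not taken from FastPFor or any of the linked repositories.

```python
def zigzag_encode(n: int, width: int = 64) -> int:
    """Map a signed integer to an unsigned one: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    mask = (1 << width) - 1
    # n >> (width - 1) is 0 for non-negative n and -1 (all ones) for negative n,
    # so the xor flips every bit of the shifted value only when n is negative.
    return ((n << 1) ^ (n >> (width - 1))) & mask


def zigzag_decode(z: int) -> int:
    """Invert zigzag_encode: the low bit carries the sign, the rest the magnitude."""
    return (z >> 1) ^ -(z & 1)


for n in (0, -1, 1, -2, 2, -1000):
    z = zigzag_encode(n)
    assert zigzag_decode(z) == n
    print(n, "->", z)
```

After the transform, a value such as -1000 becomes 1999 and fits in 11 bits instead of occupying all 64.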
If you're looking for a quick way to experiment with various time series compression algorithms, I highly recommend Daniel Lemire's FastPFor repository [3] (as linked in the article). I've used the Python bindings [4] to quickly evaluate various compression algorithms with great success.
Finally, I'd like to humbly mention my own tiny contribution [5], an adaptation of Lemire's C++ Simple8b implementation (including basic methods for delta and zigzag encoding/decoding).
I used C++ templates to make the encoding and decoding routines generic over integer bit-width, which expands support up to 64-bit integers and allows efficient use with smaller integers (e.g. 16-bit). I made a couple of other minor tweaks, including support for arrays up to 2^64 in length and changes to the API/method signatures so they can be used in a more functional style. This implementation is slightly simpler to invoke via FFI, and I intend to add examples showing how to compile it for use from JS (WebAssembly), Python, and C#. I threw my code up quickly in order to share it with you all; hopefully someone finds it useful. I intend to expand the usage examples, test cases, etc., and am looking forward to any comments or contributions.
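For readers unfamiliar with the delta step mentioned above, here is a rough Python sketch of the idea. It is not the code from the repository in [5]; the function names and the sample readings are invented for illustration.

```python
from itertools import accumulate


def delta_encode(values):
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    return list(accumulate(deltas))


readings = [1000, 1002, 1001, 1001, 998, 1003]  # made-up sample series
deltas = delta_encode(readings)
print(deltas)  # [1000, 2, -1, 0, -3, 5]
assert delta_decode(deltas) == readings
```

The deltas of a slowly changing series are small but can be negative, which is exactly the case where a zigzag pass is needed before bit-packing.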
[1] https://en.wikipedia.org/wiki/Signed_number_representation
[2] https://en.wikipedia.org/wiki/Variable-length_quantity#Zigza...
[3] https://github.com/lemire/FastPFor
[4] https://github.com/searchivarius/PyFastPFor
[5] https://github.com/naturalplasmoid/simple8b-timeseries-compr...
- The big-load anti-pattern
What are some alternatives?
pistorm - 68k Hardware Emulator
simple8b-timeseries-compression
unzx0_68000 - Free, zlib licensed ZX0 decompressor for the 68000
lib7zip - c++ library wrapper of 7zip
lzsa - Byte-aligned, efficient lossless packer that is optimized for fast decompression on 8-bit micros
simple8b-timeseries-compr
64tass - 64tass - cross assembler for 6502 etc. microprocessors - by soci/singular - [git clone from the original sourceforge repo]
banyan
CROSS-LIB - CROSS LIB - A universal 8-bit library and some games built with it
pretty6502 - A pretty printer for 6502, Z80, CP1610, TMS9900, and 8088 assembler code
atari64 - Commodore 64 OS running on Atari 8-bit hardware
salvador - A free, open-source compressor for the ZX0 format