lz4_flex
squashfs-tools-ng
| | lz4_flex | squashfs-tools-ng |
|---|---|---|
| Mentions | 13 | 7 |
| Stars | 411 | 187 |
| Growth | - | - |
| Activity | 6.5 | 8.0 |
| Last commit | 21 days ago | about 1 month ago |
| Language | Rust | C |
| License | MIT License | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
lz4_flex
-
Hetzner ARM cax11 vs Intel cx11 Benchmark
I ran a benchmark based on my LZ4 implementation (de/compressor) on cax11 and cx11, since the instances cost the same. The code doesn't use any SIMD.
-
lz4_flex 0.11: Gainzzzzz Unleashed!: Performance Improvements Detailed in Blogpost (LZ4 De/compression)
By the way, the PR removing bounds checks from extend_from_within_overlapping is up: https://github.com/PSeitz/lz4_flex/pull/141
- lz4_flex (fast LZ4 de/compression) 0.10 released, now with legacy frame support ~ also 1Mio downloads 🎉
-
ZAP: a VERY fast zip alternative, written in rust!
Zap uses LZ4 as its primary compression algorithm, although I might add ZSTD as an option too. The LZ4 crate I'm using goes over the compression better than I can in the span of a comment. Check it out here.
-
lz4_flex 0.9 released
I wrote something here on how HC could work: https://github.com/PSeitz/lz4_flex/issues/21
-
lz4_flex 0.8 released with support for frame format and major performance improvements
Yes, the updated benchmarks are here: https://github.com/pseitz/lz4_flex#results-v080-17-05-2021-safe-decode-and-safe-encode-off
- Lz4_flex – fast LZ4 implementation in Rust
-
LZ4, an Extremely Fast Compression Algorithm
I ported the block format to Rust matching the C implementation in performance and ratio.
https://github.com/pseitz/lz4_flex
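For readers unfamiliar with the block format mentioned above, the following is a minimal decoder sketch in C. It is not code from lz4_flex or from the reference implementation; the function names `read_length` and `lz4_block_decode_sketch` are hypothetical, and the sketch favours bounds checks over speed. It assumes the usual layout of a sequence: a token byte, optional literal-length bytes, the literals, a 2-byte little-endian offset, and optional match-length bytes, with a minimum match length of 4.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the extension bytes of a length field: keep adding bytes
 * until one of them is smaller than 255. Returns -1 on truncated input. */
static int read_length(const uint8_t *src, size_t src_len, size_t *ip, size_t *len)
{
    uint8_t b;
    do {
        if (*ip >= src_len)
            return -1;
        b = src[(*ip)++];
        *len += b;
    } while (b == 255);
    return 0;
}

/* Hypothetical helper: decode one LZ4 block from src into dst.
 * Returns the number of bytes written, or -1 if the input is malformed
 * or dst is too small. */
long lz4_block_decode_sketch(const uint8_t *src, size_t src_len,
                             uint8_t *dst, size_t dst_cap)
{
    size_t ip = 0, op = 0;

    while (ip < src_len) {
        uint8_t token = src[ip++];

        /* High nibble: literal length (15 means "more bytes follow"). */
        size_t lit = token >> 4;
        if (lit == 15 && read_length(src, src_len, &ip, &lit) != 0)
            return -1;
        if (lit > src_len - ip || lit > dst_cap - op)
            return -1;
        memcpy(dst + op, src + ip, lit);
        ip += lit;
        op += lit;

        /* The last sequence of a block consists of literals only. */
        if (ip == src_len)
            break;

        /* 2-byte little-endian offset back into the decoded output. */
        if (src_len - ip < 2)
            return -1;
        size_t offset = (size_t)src[ip] | ((size_t)src[ip + 1] << 8);
        ip += 2;
        if (offset == 0 || offset > op)
            return -1;

        /* Low nibble: match length minus 4 (15 means "more bytes follow"). */
        size_t mlen = token & 0x0F;
        if (mlen == 15 && read_length(src, src_len, &ip, &mlen) != 0)
            return -1;
        mlen += 4;
        if (mlen > dst_cap - op)
            return -1;

        /* Copy byte by byte: source and destination may overlap. */
        for (size_t i = 0; i < mlen; i++, op++)
            dst[op] = dst[op - offset];
    }
    return (long)op;
}
```

The byte-wise match copy is deliberate: when the offset is smaller than the match length, the copy reads bytes it has just written, which is how the format encodes runs.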
-
lz4_flex 0.7.2 reaches parity with cpp reference implementation on speed and ratio
Following this change count_same_bytes is unsound - it offsets the pointer by the value of candidate without any bounds checks, which may result in out-of-bounds access.
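To illustrate the kind of check the comment above is asking for, here is a small C sketch of a bounds-checked match counter. It is not the lz4_flex function (which is written in Rust and has a different signature); the name `count_same_bytes_checked` and its parameters are hypothetical. The idea is simply to validate the candidate position before using it and to stop counting at the end of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count how many bytes starting at buf[cur] equal the
 * bytes starting at buf[candidate], never reading past buf[len - 1].
 * The candidate must lie strictly before cur, as for an LZ77-style match. */
size_t count_same_bytes_checked(const uint8_t *buf, size_t len,
                                size_t cur, size_t candidate)
{
    /* Reject positions that would make any later access out of bounds. */
    if (cur > len || candidate >= cur)
        return 0;

    size_t n = 0;
    /* candidate < cur, so candidate + n < cur + n < len holds in the loop. */
    while (cur + n < len && buf[cur + n] == buf[candidate + n])
        n++;
    return n;
}
```

A fast implementation would compare a machine word at a time, but the same two preconditions still have to hold before the first load.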
- lz4_flex 0.7 supports no_std (thanks @coolreader18), 32bit and is dependency-free
squashfs-tools-ng
-
C Strings and my slow descent to madness
... except that that is also subtly broken.
It works if you write multiple UTF-8 code-units in one go, but breaks if you send them in several writes, or if you use the ANSI API (with the A suffix). Guess what the Windows implementation of stdio (printf and friends) does.
I already had some fun with this: https://github.com/AgentD/squashfs-tools-ng/issues/96#issuec...
And we didn't even discuss command line argument passing yet :-)
I tried to test it with the only other two languages I know besides English: German and Mandarin. Specifically also because the latter requires multi-byte characters to work. Getting this to work at all in a Windows terminal on an existing German Windows 7 installation was an adventure on its own.
Turns out, trying to write language agnostic command line applications on Windows is a PITA.
-
Getting the maximum of your C compiler, for security
IIRC -fanalyzer is a fairly recent addition to gcc. Has it become reasonably usable yet?
I recall getting a bit excited when I first read about it, but the results I got were rather bizarre (e.g. every single function that allocated memory and returned a pointer to it was labeled as leaking memory).
I did the fun exercise myself once of riffling through the gcc manpage, cobbling together warning flags and massaging them into autoconf[1][2].
There is a very handy m4 script in the util-linux source for testing supported warning flags[3].
[1] https://git.infradead.org/mtd-utils.git/blob/HEAD:/configure...
[2] https://github.com/AgentD/squashfs-tools-ng/blob/master/conf...
[3] https://github.com/karelzak/util-linux/blob/master/m4/compil...
-
Squashfs turning 20, Squashfs tools 4.5 released
> Honestly I think you could be a little more respectful of the project that inspired yours.
I do. I had a lot of great "Huh? That's clever!" moments while reverse engineering the format and formed a mental image of a clearly brilliant programmer who managed to squeeze the last bits out of some data structures using really clever tricks that I myself probably wouldn't have come up with. During that time I gained a lot of respect for the project and the author.
Also, please don't forget: the whole project is the filesystem, the tools are just a part of that. I care about this project, which is why I decided to start this effort in the first place. Which I explicitly did not advertise as a replacement, but an augmentation (see [2]).
> I'd be angry too ... Definitely understandable.
Yes, I agree! And I can understand why in the heat of the moment you might write something angry and threatening. But certainly not if you've had a few weeks time to calm down and think things over.
> And you plagiarized part of his readme.
https://github.com/plougher/squashfs-tools/blob/master/RELEA...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
https://github.com/AgentD/squashfs-tools-ng/blob/master/READ...
Oh yes? Which part?
> ... calling it spaghetti code (which isn't immediately verifiable)
Here you go, have fun: https://github.com/plougher/squashfs-tools/blob/master/squas...
However, I cannot blame anyone here, I totally get how those things happen and have witnessed it myself in action:
You write a simple tool supporting a larger project. It's written by the seat of your pants without much planning, since it's not big and does one simple job. Then it gets used in production, eventually requirements change, other people pile on patches, but try to keep the diff small, so it's reviewable and it receives maybe a little less care than the actual project it supports. Nobody bothers to overhaul it or write documentation because, hey, it works, and any large changes might risk breaking things.
Even if nobody is to blame for it, the end result is still the same: an undocumented mess that is hard to wrap your head around if you aren't the original author, who is the only one with the bigger picture.
I tried for roughly a week to pull the code (there are some more files than this and some of the inter dependencies are nasty) apart into stacked utility libraries and a pure command line parsing front end, with the hopes to maybe get this upstream once it is done. I gave up and decided that at this point I understood enough about the format to start afresh and not touch what I believed to be an unmaintained mess.
-
The Byte Order Fiasco
FWIW there is a <sys/endian.h> on various BSDs that contains "beXXtoh", "leXXtoh", "htobeXX", "htoleXX" where XX is a number of bits (16, 32, 64).
That header is also available on Linux, but glibc (and compatible libraries) named it <endian.h> instead.
See: man 3 endian (https://linux.die.net/man/3/endian)
Of course it gets a bit hairier if the code is also supposed to run on other systems.
MacOS has OSSwapHostToLittleIntXX, OSSwapLittleToHostIntXX, OSSwapHostToBigIntXX and OSSwapBigToHostIntXX in <libkern/OSByteOrder.h>.
I'm not sure if Windows has something similar, or if it even supports running on big endian machines (if you know, please tell).
My solution for achieving some portability currently entails cobbling together a "compat.h" header that defines macros for the MacOS functions and including the right headers. Something like this:
https://github.com/AgentD/squashfs-tools-ng/blob/master/incl...
This is usually my go-to solution for working with low-level on-disk or on-the-wire binary data structures that demand a specific endianness. In C I use "load/store" style functions that memcpy the data from a buffer into a struct instance and do the endian swapping. The copying is also necessary because the struct in the buffer may not have proper alignment.
In C++ code, all of this can of course be neatly stowed away in a special class with overloaded operators that transparently takes care of everything and "decays" into a single integer and exactly the above code after compilation, but is IMO somewhat cleaner to read and adds much needed type safety.
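The "load" half of the load/store approach described above might look roughly like the following C sketch. The struct `example_header_t`, its fields, and the function names are made up for illustration and are not taken from squashfs-tools-ng. To keep the example free of platform headers, the byte-order helpers are written with shifts rather than the leXXtoh functions; the assumed on-disk byte order is little-endian.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header; every field is stored little-endian. */
typedef struct {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
} example_header_t;

/* Reinterpret the in-memory bytes of v as little-endian. This is a no-op
 * on little-endian hosts and a byte swap on big-endian hosts. */
static uint32_t le32_to_host(uint32_t v)
{
    uint8_t b[4];
    memcpy(b, &v, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

static uint16_t le16_to_host(uint16_t v)
{
    uint8_t b[2];
    memcpy(b, &v, sizeof(b));
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Load: copy the raw bytes out of the buffer (which may be unaligned),
 * then convert each field to host byte order. Returns -1 if the buffer
 * is too short, 0 on success. */
int example_header_load(example_header_t *out, const uint8_t *buf, size_t len)
{
    if (len < sizeof(*out))
        return -1;

    memcpy(out, buf, sizeof(*out));
    out->magic = le32_to_host(out->magic);
    out->version = le16_to_host(out->version);
    out->flags = le16_to_host(out->flags);
    return 0;
}
```

Copying the whole struct in one memcpy only works here because the chosen fields leave no padding; a header with padding would need to be loaded field by field.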
-
Tar is an ill-specified format
I once foolishly thought, I'll write a tar parser because, "how hard can it be" [1].
I simply tried to follow the tar(5) man page[2], and got a reference test set from another website posted previously on HN[3].
Along the way I discovered that NetBSD pax apparently cannot handle the PAX format[3] and my parser inadvertently uncovered that git-archive was doing the checksums wrong, but nobody noticed because other tar parsers were more lax about it[4].
As the article describes (as does the man page), tar is actually a really simple format, but there are just so many variants to choose from.
Turns out, if you strive for maximum compatibility, it's easiest to stick to what GNU tar does. If you think about it, IMO in many ways the GNU project ended up doing "embrace, extend, extinguish" with Unix.
[1] https://github.com/AgentD/squashfs-tools-ng/tree/master/lib/...
[2] https://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5
[3] https://mgorny.pl/articles/portability-of-tar-features.html
[4] https://www.spinics.net/lists/git/msg363049.html
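Since header checksums come up in the comment above, here is a short C sketch of how the checksum of a 512-byte tar header is commonly computed, following the description in tar(5): all header bytes are summed as unsigned values, with the 8-byte checksum field at offset 148 counted as if it contained spaces. The function name and the constants' names are hypothetical, and the sketch does not attempt to reproduce the specific git-archive issue.

```c
#include <stddef.h>

#define TAR_HEADER_SIZE   512u
#define TAR_CHKSUM_OFFSET 148u
#define TAR_CHKSUM_SIZE   8u

/* Hypothetical helper: compute the checksum of one tar header block.
 * hdr must point to TAR_HEADER_SIZE bytes. */
unsigned int tar_header_checksum(const unsigned char *hdr)
{
    unsigned int sum = 0;

    for (size_t i = 0; i < TAR_HEADER_SIZE; i++) {
        int in_chksum = i >= TAR_CHKSUM_OFFSET &&
                        i < TAR_CHKSUM_OFFSET + TAR_CHKSUM_SIZE;

        /* The checksum field itself is treated as eight spaces. */
        sum += in_chksum ? (unsigned int)' ' : hdr[i];
    }
    return sum;
}
```

A reader would compare this value against the octal number stored in the checksum field; how strictly the field's padding and terminator are validated is where parsers tend to differ.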
-
LZ4, an Extremely Fast Compression Algorithm
A while ago I did some simplistic SquashFS pack/unpack benchmarks[1][2]. I was primarily interested in looking at the behavior of my thread-pool based packer, but as a side effect I got a comparison of compressor speed & ratios over the various available compressors for my Debian test image.
I must say that LZ4 definitely stands out for both compression and decompression speed, while still being able to cut the data size in half, making it probably quite suitable for live filesystems and network protocols. Particularly interesting was also comparing Zstd and LZ4[3], the former being substantially slower, but at the same time achieving a compression ratio somewhere between zlib and xz, while beating both in time (in my benchmark at least).
[1] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...
[2] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...
[3] https://github.com/AgentD/squashfs-tools-ng/blob/master/doc/...
What are some alternatives?
zfs - OpenZFS on Linux and FreeBSD
squashfs-tools - tools to create and extract Squashfs filesystems
LZ4 - Extremely Fast Compression algorithm
7-Zip-zstd - 7-Zip with support for Brotli, Fast-LZMA2, Lizard, LZ4, LZ5 and Zstandard
density - Superfast compression library
dracut - dracut the event driven initramfs infrastructure
zstd - Zstandard - Fast real-time compression algorithm
genext2fs - genext2fs - ext2 filesystem generator for embedded systems
zstd-rs - zstd-decoder in pure rust