lzbench 2.0 - Textual Benchmarking of 6 corpora (2MB, 4MB, 15MB, 20MB, 42MB, 320MB) #187
Comments
It's interesting that lzav -2 is on par with or outperforms zlib -9 on larger files. lzav is pure LZ77, and its compression algorithms were tuned for speed, so its stream format can probably be utilized even better. Hopefully there are no issues with the benchmarking. lzav -2 decompressing faster than lzav -1 is a bit suspicious to me. As for Oodle and LZTurbo, we'll probably never know whether they are actually faster than lzsse2 at decompression. They are commercial algorithms, and advertised benchmarks may not be reliable. I do not think they'll ever be open-sourced.
@Sanmayce: as far as I know, Nakamichi is your codec. Would you mind adding a simple statement of what the licence of the software is?
There is nothing 'suspicious' or surprising about it. Exactly the same relation applies to lz4hc, zlib, lzma, lizard (for the most part), ucl, tornado, yappy, and practically all LZ-scheme codecs.
Well, as the author of lzav, I have rarely seen -2 being faster than -1 at decompression. It may be faster by maybe 2%, but not by 10%. The contradiction in your statement is exactly that having "more matches" increases the amount of "branching" per byte in the stream. Realistically, memory copy in LZ77 decompression is a very minor factor; it's all about branching.
Well, I'm probably incorrect and there can be edge cases.
Brachylogy. You can always test what the reason for that is, but you can't deny the observation that virtually all LZ-scheme codecs behave this way. You can make one variant of your compressor use only short matches and a second one use only long matches; if you keep the same compression ratio, you can then test the hypothesis that match length matters. It's not always the case. On ARM lzav is rather slow (relatively), especially on ARM32 [1], and `-2' is much slower than `-1'. On ARM64 [2] it is almost as fast as LZ4, but the levels behave the same.
What matters is both the number of matches and the average match length. Many short matches yield slower decompression; that's what I usually observe with the silesia dataset. Thanks for the links to the benchmarks. However, they are not too useful, as there is no machine info. What's the difference between Cortex-A53 and ARM64?
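To make the point about match count and average match length concrete, here is a minimal sketch that summarizes a parsed token stream. The `(literal_len, match_len)` token layout and the function name `match_stats` are assumptions made for illustration only; they are not lzav's or any other codec's actual stream format.

```python
def match_stats(tokens):
    """Summarize a hypothetical LZ77 token stream.

    tokens: list of (literal_len, match_len) pairs (an assumed layout,
    not the format of any real codec).
    Returns (match_count, average_match_length, matches_per_1000_output_bytes).
    """
    matches = [m for _, m in tokens if m > 0]
    total_out = sum(lit + m for lit, m in tokens)
    count = len(matches)
    avg_len = sum(matches) / count if count else 0.0
    per_1000 = 1000 * count / total_out if total_out else 0.0
    return count, avg_len, per_1000


# Two made-up streams that both decode to 600 bytes.
short_matches = [(2, 4)] * 100   # many short matches
long_matches = [(2, 58)] * 10    # few long matches

print(match_stats(short_matches))
print(match_stats(long_matches))
```

The stream with many short matches has far more tokens per output byte, so under the branching argument above its decoder takes more match-handling branches for the same amount of output.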
It would be a good idea for someone to combine the Silesia and Manzini datasets. Computing has evolved, and benchmarking larger datasets should no longer be an issue.
You're welcome. Here are all of them.
That's the problem with ARM: they don't have a human-readable CPUID string, as if it were a big problem to add a few transistors and one instruction.
ARM32/64 come from the CI tests; Cortex-A53 is my mobile.
If you mean CI tests then you are wrong - just check the time of these tests: 40 min for an 8 MB data set. I don't know the Manzini corpus/data set, but Silesia is 200 MB, 25 times more than lzbench.exe. It also means 25 times more time to test it; that's 1000 minutes, about 17 hours. In reality even more. If you pay for it, sure, they won't care. But it's still a looong time. Of course you can test it with your codec only, in your repo; that's much more sensible and manageable.
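The estimate above is a simple linear scaling, which can be checked with a few lines. This sketch assumes run time grows proportionally with input size, which is only a rough approximation; the function name `scaled_minutes` is made up for illustration.

```python
def scaled_minutes(base_minutes, base_mb, target_mb):
    """Scale a measured run time linearly with input size (a rough assumption)."""
    return base_minutes * target_mb / base_mb


# Figures quoted in the comment: 40 min for 8 MB, Silesia at about 200 MB.
minutes = scaled_minutes(40, 8, 200)
print(f"{minutes:.0f} minutes, about {minutes / 60:.1f} hours")
```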
Why make it so aggressive, eh? CI tests use 8 MB files. Of course, I'm not talking about CI tests.
?
7, to be more precise. It uses its own executable as input (
I don't know then.
Test machine: ThinkPad L490, i7-8565U, DDR4 2400 MHz
OS: Linux Fedora 39
Mode: Performance Mode, as superuser
Compiler: gcc version 13.2.1 20231205 (Red Hat 13.2.1-6) (GCC)
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
lizard 2.1 -39
lzsse2 2019-04-18 -17
'Wolfeye' is 1766/1091=1.61x faster-but-weaker than zstd.
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
nakamichi okamigan
lzsse2 2019-04-18 -17
'Wolfeye' is 1968/1150=1.71x faster-but-weaker than zstd.
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
nakamichi okamigan
lzsse2 2019-04-18 -17
'Wolfeye' is 1267/650=1.94x faster-but-weaker than zstd.
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
nakamichi okamigan
lzsse2 2019-04-18 -17
'Wolfeye' is 2532/1481=1.70x faster-but-weaker than zstd.
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
nakamichi okamigan
lzsse2 2019-04-18 -17
'Wolfeye' is 1347/877=1.53x faster-but-weaker than zstd.
TOP 3 Pareto frontiers (compressors for which no other compressors BOTH compress smaller and decompress faster):
zstd 1.5.6 -22
nakamichi okamigan
lzsse2 2019-04-18 -17
'Wolfeye' is 2925/2235=1.30x faster-but-weaker than zstd.
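The Pareto criterion used in the lists above (no other compressor both compresses smaller and decompresses faster) can be sketched as follows. The codec names and numbers in the sample are invented placeholders, not results from this benchmark, and `pareto_frontier` is a hypothetical helper rather than part of lzbench.

```python
def pareto_frontier(results):
    """Return names of entries that no other entry dominates.

    results: list of (name, compressed_size, decompress_speed).
    An entry is dominated when some other entry has BOTH a strictly
    smaller compressed size and a strictly higher decompression speed.
    """
    frontier = []
    for name, size, speed in results:
        dominated = any(
            other_size < size and other_speed > speed
            for other_name, other_size, other_speed in results
            if other_name != name
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Made-up sample: (name, compressed size in bytes, decompression MB/s).
sample = [
    ("codec_a", 100, 1000),
    ("codec_b", 120, 2000),
    ("codec_c", 130, 1500),  # dominated by codec_b (smaller and faster)
    ("codec_d", 150, 3000),
]
print(pareto_frontier(sample))
```

With this definition a codec stays on the frontier as long as it is the best in at least one of the two dimensions among everything that beats it in the other.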
The brutally fast Oodle and LZTurbo are missing; if they are open-sourced one day, they will redefine the roster.