Open
Description
Although asymptotically BWT is the most expensive operation during compression, only 40% is spent during BWT. There are some hot spots which are potentially easy to fix:
-
ribzip2::lib::block::zle::zle_transform
spends 15% inHashSet
andHashMap
operations. Both can be replaced by operations which finish in constant time. Also vector operations might be improved up by pre-allocating sufficient memory. -
ribzip2::lib::block::block_data::generate_block_data
spends 15% of the time on the final encoding step, most of the time during inefficient (linear) access to the coding table. - run
ribzip2::lib::block::huffman::package_merge::package_merge
only if code length exceeds 20 bits - saves 1,5% of runtime