Replace RingBuffer internals with VecDeque #78
This is a new attempt at replacing the internals of `RingBuffer` with `VecDeque`.

EDIT: I've also pushed the commit removing unsafe code, since most of the regression comes from the `VecDeque` and not from removing unsafe. It may make sense to let the implementation be chosen via a feature gate.

This is still WIP: safety comments need to be added, and a lot of documentation is needed for the many hacks we're doing to implement `extend_from_within` in a performant way despite `VecDeque` not exposing its own internals.

Compared to the previous attempts, I think LLVM has made many advancements; combined with the `VecDeque::extend` specializations, better handling of `MaybeUninit`, and the idea of using `copy_bytes_overshooting`, this should make the performance penalty much lower.

It should also be possible, through a feature gate, to replace all use of unsafe code and `MaybeUninit` with zero-initialized memory, for a smaller performance penalty than going back to the original `Vec` implementation from years ago.

Running benchmarks on dedicated hardware (a Ryzen 5900X), the performance penalty of the current draft implementation is about 8% on the builtin bench.
Using a 150MB file that decompressed to 1 GB yielded this instead: