Friday, May 8, 2020

Arrays vs. Ring Buffers: A Performance Comparison in Rust

The VecDeque collection in Rust implements a double-ended queue with O(1) operations to push and pop on either end. (Other language libraries have similar data structures, such as ArrayDeque in Java.) It also provides O(1) random access to individual elements.

These ring buffers are implemented behind-the-scenes with an array and two index variables: a start and an end. For the sake of simplicity, imagine that the deque is initialized with a starting capacity of 10. Both the start and end variables could then be initialized to point at index 5. Any push to the front decreases the starting index, and any push to the back increases the ending index.

When accessing an individual element, the index is added to the starting index to determine the final array position to employ. If this sum exceeds the size of the array, it "wraps around" to the lower part of the array.

What is not obvious is the degree to which this extra addition to find the index represents a performance penalty. Documentation for the Rust Collections library states that whenever Vec and VecDeque have the same asymptotic time bounds for an operation, that the Vec will generally be faster, but no further details are provided. 

I became interested in this question because of how the constant-time double-ended push and pop operations for a VecDeque incur no asymptotic time penalty for direct random access. If the penalty is, in fact, insignificant, then maybe there is no need to use a regular Vec at all. So to satisfy my curiosity, I constructed an experiment

In this experiment, we create a Vec with a specified number of elements. (I used 100000 to produce the data below.) We then randomly shuffle these elements. At this point, the experiment begins. We copy these values into another Vec and then a VecDeque, and then sort them using the selection sort algorithm. While this is not a very good sorting algorithm in general, for this purpose it is ideal, as it requires a lot of random accesses that are widely spaced across the sequence being sorted. By using the same sequence of shuffled values for each data structure, the test should also be fair.

Here are the results. For each algorithm, we performed 30 trials with 100000 elements. Each trial featured a different permutation of those values.

 Data StructureMean completion time (ms)   Standard Deviation
 Vec       3238.4        205.1558                   
 VecDeque 4813.2 213.019

It is pretty clear, then, that for random access the Vec is consistently faster than the VecDeque. In exchange for constant-time front-end pushes and pops, the VecDeque takes about 50% longer to complete each random access. It seems to me to be well worth the cost when that functionality is pertinent, but when it is not the plain Vec is strongly preferable.

Please feel free to download and experiment with my source code if you are interested.

No comments:

Post a Comment