Wednesday, 24 June 2015

Traversing LinkedList considered (somewhat) harmful

I compare traversal of LinkedList, ArrayList and an array. First things first, given the mechanics of linked lists, iterating with a for loop, e.g:
        for (int i = 0; i < N; i++) {
            linkedList.get(i));
        }
is expected to be comparatively slow since the list has to be scanned every time to get() to the i-th element. While sub-optimal, it may still happen in the wild, especially when a call returns a List and the client programmer is not aware of the actual implementation.
 The benchmark measures performance of "for-style" traversal of LinkedList, ArrayList and array as well as "iterator-style" traversal of LinkedList and ArrayList e.g:
        Iterator iter = list.iterator();
        while (iter.hasNext()) {
            iter.next();
        } 

The code is available here. The output from JMH is
Benchmark                           Mode  Cnt       Score      Error  Units        
ListsBenchmark.forArray            thrpt   10  423016.591 ± 2745.121  ops/s        
ListsBenchmark.forArrayList        thrpt   10  345255.508 ± 1367.974  ops/s        
ListsBenchmark.forLinkedList       thrpt   10    2559.053 ±   14.089  ops/s        
ListsBenchmark.iteratorArrayList   thrpt   10  299156.440 ± 8655.560  ops/s        
ListsBenchmark.iteratorLinkedList  thrpt   10  212684.595 ± 1036.017  ops/s        
The score is number of operations per second (so higher is better) where an operation is a single traversal of a list/array of size 1000

Comments
  • As expected, traversal of a LinkedList with a for loop performs badly - about n * n/2 on the average
  • Although ArrayList is backed by an array, it performs significantly worse then a plain array. 
  • Iterating over LinkedList is significantly worse then iterating over ArrayList. This is most likely due to the fact that to move from one node to the next in a LinkedList,  the next node if fetched from the heap and then the contained value is fetched from the heap again. In the case of the ArrayList the references to the values are loaded into CPU caches and so only one visit to the heap is needed

4 comments:

  1. Similar in C++, "Bjarne Stroustrup: Why you should avoid Linked Lists": https://www.youtube.com/watch?v=YQs6IC-vgmo

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. This was on Oracle 1.8.0_x. The benchmark code is linked from the post so you can run it for yourself.

    ReplyDelete