Stream/Lambda Function Performance

While Java 8 was released quite some time ago, and Java 9 is getting very close now, I still had no experience with the new Java 8 features due to a strict requirement of Java 7 compatibility. Since I recently changed jobs, I am free of this restriction: the new company is in full control of the JVM and already uses Java 8 features in many places.

Looking at a lambda function for the first time, I was very confused. I was just not able to understand what it all meant or how to figure out the types of all the variables. By now I have a lot more experience and can use lambda functions and streams as intended. I do, however, still tend to use normal for loops, mainly out of habit I think. I do realize that streams and lambda functions allow you to write more compact code, but that is not always a good thing: I have seen code consisting of nothing but lambda functions that was very hard to understand. Whether lambda functions should be used, and in which cases, is not something I will discuss here.

Whenever I do use a stream with a lambda function, I am intrigued by how the performance of such a construction compares to that of a simple for loop. So I put it to the test and created a small benchmark comparing a couple of use cases I have seen throughout the code.

The test consists of a list of 10,000,000 integers (called array in the snippets below) which will be processed by a stream, a parallel stream or a for loop to get results. Three types of processing will be done:

  1. Creating a new list of the original objects.
  2. Creating a new list with new objects of the original objects.
  3. Filter out approximately half of the elements.

The first case is not a very useful one, but it gives us an idea of the raw performance of each of the approaches. The second case is actually used quite often in our code-base to create an immutable copy of an existing object, but it could also be used to create a different kind of object based on the information of another. The third one is more useful as well, since filtering is something you will do more often than just copying all objects. Moreover, these three test cases allow us to use the map, filter and collect methods of the Java stream API.

To get reliable results I ran all tests 100 times, every time with a newly generated random list as input for all the test cases. The large size was required to get measurable results, but it forced me to trigger explicit garbage collection between runs to prevent the GC from kicking in during a test and causing outliers of a couple of seconds.
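
The exact harness is not part of this post, but a minimal sketch of the setup described above could look like the following. The names (StreamBenchmark, SIZE, RUNS) are my own, and I assume the random values lie in [0, Integer.MAX_VALUE) so that the filter case below keeps roughly half of them; the original may differ.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

public class StreamBenchmark {

    private static final int SIZE = 10_000_000; // elements per run
    private static final int RUNS = 100;        // repetitions per test case

    public static void main(String[] args) {
        final Random random = new Random();
        long totalNanos = 0;
        for (int run = 0; run < RUNS; ++run) {
            // A fresh random list for every run; values in
            // [0, Integer.MAX_VALUE) so the filter keeps about half.
            final List<Integer> array = new ArrayList<>(SIZE);
            for (int i = 0; i < SIZE; ++i) {
                array.add(random.nextInt(Integer.MAX_VALUE));
            }
            // Explicit GC between runs so a collection pause does not
            // land in the middle of a measurement.
            System.gc();
            final long start = System.nanoTime();
            final List<Integer> copy = array.stream()
                .collect(Collectors.toList());
            totalNanos += System.nanoTime() - start;
            // Use the result so the JIT cannot discard the work.
            if (copy.size() != SIZE) {
                throw new IllegalStateException();
            }
        }
        System.out.printf("average: %.2f ms%n",
            totalNanos / (double) RUNS / 1_000_000.0);
    }
}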

Creating a new list of original objects

// Stream
final List<Integer> copy = array.stream()
    .collect(Collectors.toList());

// Parallel stream
final List<Integer> copy = array.parallelStream()
    .collect(Collectors.toList());

// Plain for loop
final List<Integer> copy = new ArrayList<Integer>(array.size());
for (int j = 0; j < array.size(); ++j) {
    copy.add(array.get(j));
}

Copy Benchmark

The for loop is a clear winner; the stream and parallel stream are equally fast (or slow). This might seem a bit odd at first, as you would assume that running in parallel gives you a performance boost. It does not, because of the collection step: the objects are collected in encounter order, which is essentially a sequential operation, nullifying the possible gains from processing the elements in parallel.
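
As a side note (my own remark, not something I measured in this benchmark): if you do not care about the order of the resulting list, you can mark the stream as unordered, which gives the implementation more freedom when combining partial results. Whether this actually helps depends on the terminal operation and the JVM.

// Hint that encounter order is irrelevant; this may or may not
// speed up the collection step, depending on the collector.
final List<Integer> copy = array.parallelStream()
    .unordered()
    .collect(Collectors.toList());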

Creating a new list of new objects

// Stream
final List<Integer> copy = array.stream()
    .map(n -> new Integer(n))
    .collect(Collectors.toList());

// Parallel stream
final List<Integer> copy = array.parallelStream()
    .map(n -> new Integer(n))
    .collect(Collectors.toList());

// Plain for loop
final List<Integer> copy = new ArrayList<Integer>(array.size());
for (int j = 0; j < array.size(); ++j) {
    copy.add(new Integer(array.get(j)));
}

New Benchmark

Just like in the previous test, the for loop wins. This is no surprise, because the extra cost of creating a new object is paid equally by each approach. The comparison between the stream and the parallel stream still holds here as well.

Filtering half of the objects

// Stream
final List<Integer> copy = array.stream()
    .filter(n -> n > (Integer.MAX_VALUE / 2))
    .collect(Collectors.toList());

// Parallel stream
final List<Integer> copy = array.parallelStream()
    .filter(n -> n > (Integer.MAX_VALUE / 2))
    .collect(Collectors.toList());

// Plain for loop
final List<Integer> copy = new ArrayList<Integer>(array.size());
for (int j = 0; j < array.size(); ++j) {
    if (array.get(j) > (Integer.MAX_VALUE / 2)) {
        copy.add(array.get(j));
    }
}

Filter Benchmark

This time we have a different result: the for loop is no longer the fastest and the parallel stream wins. This is really a test where parallelism can benefit you, as no longer do all elements need to be added to the resulting list. That slightly relieves the bottleneck of the collection step, and the elements can genuinely be compared in parallel.
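
How large that win is depends on the machine: parallel streams run on the common fork/join pool by default, whose size is derived from the number of available processors. A quick way to inspect this (my own addition, not part of the benchmark) is:

import java.util.concurrent.ForkJoinPool;

public class PoolInfo {
    public static void main(String[] args) {
        // Parallel streams use the common fork/join pool by default,
        // so the parallelism available differs per machine.
        System.out.println("available processors: "
            + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: "
            + ForkJoinPool.commonPool().getParallelism());
    }
}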

These benchmarks were performed on my MacBook Pro; running the same tests on a different machine with a different JVM could give different results, so they should not be generalized in any way. Yet they do show an interesting aspect of Java streams which makes me curious about their internal structure, something I will look into in the future. For now the conclusion is that the simple for loop is often still the best choice, but highly parallelizable operations, such as filtering, can be done with a parallel stream to get better results.
