Iterating A Map

A Map is a very common data structure to use, and often you find yourself wanting to traverse over all elements in it. For this we have multiple methods:

  • keyset()
  • values()
  • entrySet()

If you are interested in both the keys and the values, the valueSet does not fit your needs as it is impossible to get the key based on the value. But which one of the remaining two ways performs the best? I did the test and found some interesting results.

The test simply iterates over all elements in the map (String to Integer) and gets both the key and the value of it. When using the keySet() method, an extra get(key) is performed to get the actual value. For the entrySet(), both the key and the value are available in the entry. The code snippets of both approaches are shown below.

for (final String str : strToInt.keySet()) {
final String key = str;
final Integer value = strToInt.get(str);
}

for (final Entry entry : strToInt.entrySet()) {
final String key = entry.getKey();
final Integer value = entry.getValue();
}

This test was executed with an increasing size of the map, starting from 8 and ending at 1048576 with steps that double the size. For every size iterating over the map was done 500 times, for which an average duration was taken. This test was performed for a hashmap at 100% occupation, a hashmap at 50% occupation and a treemap. The results are shown in the graph below, note that both axes are shown in logarithmic scale.

MapCompare

Up until around 1000 elements, there is a lot of noise and results vary quite a lot. All values are very close to each other and in the margin of error. As with many things it doesn’t really matter what you do if it’s only for a small number of elements, as the size begins to increase however we can see some very clear differences. The detailed results of the bigger sizes are shown in the table and graph below.

HashMap Full HashMap Halve TreeMap
Size KeySet (ms) EntrySet (ms) KeySet (ms) EntrySet (ms) KeySet (ms) EntrySet (ms)
262,144 7.370 4.960 7.234 5.036 52.088 8.474
524,288 13.856 10.364 11.738 8.300 103.840 11.624
1,048,576 30.626 22.344 24.718 18.184 221.996 34.904

MapZoom

For all different maps, the entry set scores the best although the difference in case of a hashmap is neglectable. This is obviously caused by the fact that a lookup in a hashmap is O(1), and thus adds only a constant extra lookup time. The amount of hash clashes may however play an important role, as can be seen when comparing the full and halve hashmap. It is however a bit weird that both the keyset and the entryset take a hit because of this. As could be predicted the treemap is much slower as it traverses the elements in order. The cost of an additional lookup is because of this also much higher than with a hashmap.

Even though it is acceptable to use the keyset approach when using a hashmap, it is clearly better to always use the entrymap if you need both the key and value of the elements. This will also prevent you from taking a huge performance hit if you ever change the map implementation.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s