As CPUs have more and more cores available, big performance gains can be achieved by having your application take advantage of running tasks in parallel. An increased level of parallelism however introduces more complexity. An intinsic problem when trying to run things in parallel is deciding what to run in parallel. The amount of tasks than can run completely in parallel are seldom and some type of synchronisation will be required most of the time, already decreasing the possible gain you can make.
Another problem you face is to determine how much you want to run in parallel. Do you want to start a thread for every single small task you can think of? Or do you want to have a couple of bigger threads? Since you know that creating threads comes with a cost, but can you avoid this cost by using a thread pool instead? There are so many questions to be answered when starting to work with parallelism, but today I will be focussed on determining how much threads you should have running given a certain capacity of the CPU.
We all know that with a hashmap, the performance depends on the fill ratio of the actual map. A hashmap that is too full would result in bad performance due to hash clashes and the whole lookup would result in a lineair lookup.
However, it is important to note that there are different ways to implement a hashmap. One consists of having a single map and if the location where to store the element (depending on the hash value) is already occupied, you select the next available one. Another keeps a list of all elements that are present on a single location. This means that even if elements have the same hash value, there is no expensive procedure of finding the next free location and the same holds for the lookup.
But how exactly does the load factor influence the performance of the hashmap with the Java implementation? In the previous blog post I already did examine how the load capacity comes into play, this time we will extend it to also use the load capacity.
Related to the previous blog post, where I examined the effect of specifying the initial capacity of an ArrayList, I will now do the same for a HashMap. Compared to an ArrayList a HashMaps works different and it takes an additional parameter, the load factor. This parameter is required as a HashMaps performance will depend on how full it is. Because of the difference in adding elements to a HashMap and retrieving them again I decided to split this up into two separate blog posts. In this part I will examine adding elements to a HashMap.
When creating an ArrayList you have the option to specify the initial capacity. Most of the time however you don’t really know how big the list should be. If you would know the exact size of the list then it is often better to go for a regular array as this is much more memory efficient. For this reason I often just don’t specify an initial capacity, even though I know that this will lead to costly copying of the array when the list becomes too small. But what do you have to do when you don’t know the size? I guess you can always estimate it, and make some kind of guess. To get a better understanding of how the initial capacity affects throughput, I did some tests with it.
Recently I took a CUDA course and one of the things they mentioned to keep an out for was the usage of double precision. Double precision operations are slower and the added precision wasn’t worth it, so they say. This made me wonder whether this is also true for regular programming with languages such as C++ and Java. Especially because both of these languages have a double precision as the default floating point number. So how bad is it to use doubles instead of regular float, and what about that precision. I have investigated this in Java.
A Map is a very common data structure to use, and often you find yourself wanting to traverse over all elements in it. For this we have multiple methods:
If you are interested in both the keys and the values, the valueSet does not fit your needs as it is impossible to get the key based on the value. But which one of the remaining two ways performs the best? I did the test and found some interesting results.
When it comes to writing high performance software there are different approaches we can take. These different paths, match with how important performance is considered to be.
The first does not consider performance as anything important, the program is written without any attention towards how it performs. Only after users start complaining about the performance, actions are taken to fix it. Users will only complain if it is a killer for them, it isn’t tolerable and thus should be considered a bug.
Another way could be to write software the same as before, but checking the performance from time to time. When the performance of the software falls below some level, action must be taken. This prevents users from complaining, as the performance is boosted before it becomes critical.
A final way is that performance is taken into account when writing software, and is part of the design. This means constant monitoring of the performance, evaluating the choices that are made and making sure it stays above the required level. This level of performance that is desired is often higher than in the previous case.
But what are the consequences of each of these approaches? Is there a single best way to tackle performance?