Building A Hadoop Cluster With Docker – Part 1

One of the things I started in 2017 and wanted to get more knowledge about in 2018 was Hadoop. While it is possible and very easy to run Hadoop on a single machine, this is not what it is designed for. Using it in such a way would even make your program run slower because of the overhead that is involved. So since Hadoop is designed for big data, and distributes that data over a cluster of machines it makes sense to also run it like this. While it is possible to have many nodes running on the same machine, you would need to setup quite a lot of configuration to prevent port collisions and such.  On the other hand I don’t have a cluster available to do some experimenting with. Although I know it is possible to get some free tier machines at cloud providers such as Amazon, this is not something I want to do for a minor experiment like this. Instead I have chosen to create a virtual cluster by using Docker.

Continue reading

Advertisements

Float vs Double

Recently I took a CUDA course and one of the things they mentioned to keep an out for was the usage of double precision. Double precision operations are slower and the added precision wasn’t worth it, so they say. This made me wonder whether this is also true for regular programming with languages such as C++ and Java. Especially because both of these languages have a double precision as the default floating point number. So how bad is it to use doubles instead of regular float, and what about that precision. I have investigated this in Java.

Continue reading

Java Interface Default Implementations

An interface is often just an implicit definition as every class defines an Interface. An interface just defines the contract of the class by specifying the methods that can be invoked. Java however also has an explicit interface type which until recently adhered to this idea, but since Java 8 an interface can have a ‘default’ implementation. This idea breaks completely with what an interface is supposed to be and fades the line between an interface and an abstract class. So what is still the point of defining an interface and using default implementations?

Continue reading

Web vs Native Applications

We are looking into replacing some operations tool and we have been questioning whether this should be a web application (as it is now) or a native application. For this I did some investigation and made a clear comparison between the advantages and disadvantages of each solution. The application we have could actually be split up in two parts, one for the operations and a more advanced for diagnostics. This was key in my investigation as both serve a different audience.

My analysis is of course focused on this purpose and on the situation in which this application has to run. This may not be applicable for your situation, but I hope that my explanation and arguments are clear enough so you can filter out which ones are relevant for you.

Continue reading

C++ Unit Testing

In the previous post, I have been examining the C++ build tools that are available. As a good Software Engineer, the next step of course is to choose a testing framework. From my experience with Java, I have a couple of requirements on what the testing framework should allow. Most of these requirements come from using jUnit and Mockito, which I think are both outstanding frameworks that allow you to easily test your code.

For my tests I examine the following libraries: gTest, CppUTest, Boost, CxxTest and FakeIt. Some important features I am looking for is easy testing/mocking, readable tests, and the limitations of the framework.

Continue reading

C++ Build Process

I remember my time at the university, when working on C++ code, all of the command line commands I executed to get the my to compile and link. For the small projects I did back then, that was more than enough. We did get some small introduction into make, but we didn’t go into detail about how it all worked. When we started to work with Java, we had to use Maven, and again we didn’t get any real introduction. In my short career I have discovered the world of Maven and I am now more aware of how it all works, making me proficient enough to do what I need to manage bigger projects.

My love for C++ however never really faded, and I have been looking into doing some more C++ coding again. To get me started I was investigating build process tools for C++, in my search I found a couple of solutions: CMake, Maven and Gradle. The goal I set out for all of these tools, was to get a simple project and some tests for it to build.

Continue reading