Recently we ran into a problem: an application running at a customer site was not able to talk to multiple instances of our application. We had plenty of reasons to want separate instances of our application, even though we could have combined them into one. The main reason was the actual physical differences between the things they had to manage. Changing the customer's software would be too costly, so instead we investigated whether it would be possible to introduce a kind of router to handle this.
This is definitely not the best solution: it is a bit of a hack, we lost some functionality, and the setup is more fragile. But these are all things we can live with, which makes this option the best one we have.
The idea behind the application is the same as that of a network router: based on certain criteria, a message needs to be delivered to one instance of the application or the other. The tricky part is that the communication uses a custom protocol, which means that the router has to speak that protocol as well. Moreover, we don't want to make any changes to the customer's software. The router therefore takes the place of our application, with identical communication settings. The router in turn communicates with both instances using the same protocol, but with different settings (such as ports).
To decide which instance a message goes to, the router has to understand the message content and therefore has to parse the message itself. Only after reading the content does it forward the message to the correct instance, based on a field in the message. In the other direction, all messages from both instances can simply be passed on to the customer's software.
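The routing decision described above can be sketched roughly as follows. This is only an illustration: the real protocol is custom and not shown here, so the `key=value;key=value` message format, the field name `unit`, the `ROUTES` table, and the host names and ports are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


# Hypothetical routing table: value of the routing field -> instance to use.
ROUTES = {
    "A": Endpoint("instance-a.example", 5001),
    "B": Endpoint("instance-b.example", 5002),
}


def parse_routing_field(raw: bytes) -> str:
    """Return the routing field of a message.

    Assumes a made-up text format of 'key=value' pairs separated by ';'.
    """
    pairs = (part.split("=", 1) for part in raw.decode("ascii").split(";") if "=" in part)
    fields = dict(pairs)
    if "unit" not in fields:
        raise ValueError("message has no routing field")
    return fields["unit"]


def route(raw: bytes) -> Endpoint:
    """Pick the instance a message must be forwarded to."""
    key = parse_routing_field(raw)
    if key not in ROUTES:
        raise ValueError(f"no instance configured for unit {key!r}")
    return ROUTES[key]
```

With this sketch, `route(b"type=order;unit=A")` picks the first instance, while a message with an unknown or missing `unit` raises a `ValueError` instead of being forwarded blindly.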
A problem arises with the special messages that request or report the status of the software. A status request should be sent to both instances, but the resulting status reports cannot simply be forwarded to the customer's software: it would receive two of them, with different data, which would confuse it. Luckily, this customer did not use these messages, which allowed me to simply disregard those types of messages.
A final remark is that the router will increase latency. Moreover, to guarantee the same level of robustness that the communication protocol provides (by using ACKs and NACKs), we have to be careful with acknowledgements. Our router is not allowed to acknowledge a message when it receives it, not even right after it has passed it on. It is critical that the router ONLY acknowledges a message after it has received an acknowledgement itself. This, combined with the way the library works, means we are forced to use synchronous communication.
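The acknowledgement rule can be illustrated with a small sketch. It is not the actual library API: `Reply`, `handle_incoming` and the `forward` callback are hypothetical names, and `forward` stands in for whatever blocking call sends a message to the chosen instance and waits for its answer.

```python
from enum import Enum
from typing import Callable


class Reply(Enum):
    ACK = "ACK"
    NACK = "NACK"


def handle_incoming(raw: bytes, forward: Callable[[bytes], Reply]) -> Reply:
    """Return the reply the router should send back to the customer's software.

    The router never acknowledges on its own: it blocks until the instance
    has answered and then relays that answer.
    """
    try:
        # Synchronous on purpose: nothing is sent back before the
        # instance has replied.
        return forward(raw)
    except (ConnectionError, TimeoutError):
        # Instance unreachable: reject, so the sender keeps the message
        # and can re-send it later.
        return Reply.NACK
```

The point of the design is that the customer's software only ever sees an ACK that originated from one of the instances, so a message that was never delivered is never reported as delivered.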
After some testing, however, we found that this did not cause too many problems, as the library creates separate threads for different messages. If this were not the case, latency would increase considerably. Finally, we tested how the whole setup behaves if one of the two instances goes down. This was handled nicely as well: messages meant for the running instance were delivered, whereas those for the stopped one were rejected. Even if the router itself goes offline, no messages are lost, because we never acknowledged them to the customer's software. It will simply re-send them, and as soon as we can deliver and acknowledge them, everything is fine.