Christian Walter is managing director and editor-in-chief of swiss made software. Until the end of 2010 he worked as a specialist journalist for the ICT magazine Netzwoche, and most recently he also published in Swiss IT Magazin, Computerworld, and inside-it.
Real-time monitoring of large data volumes is becoming increasingly important.
This also applies to SIX, where a new tool has made it possible to monitor seven million payments every day in real time.
All payments between Swiss financial institutions have been processed using the Interbank Clearing System (SIC) since 1987. It is operated by the SIX International Clearing Group, under the supervision of the Swiss National Bank. Despite its steady evolution, the system has been running close to its limits for some time and is now being updated, in part to accommodate regulatory changes. The European harmonization of payments in connection with SEPA only adds to the update’s importance.
SIC4 is the name of this large project, which has been under way for a few years. It is subdivided into different components, such as booking, transmission, and monitoring. The monitoring component is being implemented by Bern-based mimacom. The aim was to create a user-friendly system that can monitor seven million payments every day in real time. Previously, problems were not reported automatically. Rather, they had to be identified either through a targeted manual search or in response to external information provided by telephone. Users were also only able to communicate with the system through special transaction applications, which required expert knowledge.
Solving a problem was almost as difficult as identifying it, as there was often a lack of context. For example, once a payment backlog had been identified, it was not possible to automatically identify which bank was involved, as transactions were not stored under real names but under numbers. The details attached to that identification number were stored together with the contact details in a separate system. The user had to switch back and forth between different systems in order to get a complete picture of the situation. “Systems that grow over time eventually become more trouble than they are worth. That is when it is time for something new,” says Agim Emruli of mimacom. The new system needed to allow end-to-end mapping of processes and seamless collaboration. And then there was SIX’s core requirement: that any event should be visible within just two seconds.
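The missing context described above, with numeric identifiers in one system and names and contact details in another, can be pictured as a simple lookup performed when an event arrives. The following is only an illustrative sketch: the field names, the `DIRECTORY` contents, and the `enrich` function are all hypothetical and are not taken from the SIC4 project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    name: str
    contact: str


# Hypothetical directory: identification number -> bank details (invented values).
DIRECTORY = {
    "100001": Participant("Example Bank A", "ops-a@example.com"),
    "100002": Participant("Example Bank B", "ops-b@example.com"),
}


def enrich(event: dict, directory: dict) -> dict:
    """Return a copy of the event with the bank's name and contact attached."""
    participant = directory.get(event["participant_id"])
    enriched = dict(event)  # leave the raw event untouched
    enriched["participant_name"] = participant.name if participant else "unknown"
    enriched["participant_contact"] = participant.contact if participant else None
    return enriched


raw_event = {"payment_id": "p-1", "participant_id": "100001", "stage": "booking"}
enriched_event = enrich(raw_event, DIRECTORY)
```

With the name and contact stored next to each event, a user looking at a backlog would no longer need to consult a second system to find out whom to call.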
Just like Netflix
SIC4 is going far beyond monitoring: it will replace an entire system. mimacom therefore had the privilege of building something almost entirely new from scratch. The new system is based on Elasticsearch, an open source product. It has only been commercially available since 2012, but it builds on Apache Lucene, an open source search library with a considerably longer history. In recent years it has appeared in an increasing number of places where large data volumes have to be monitored in real time. One prominent client is Netflix, which has more than 700 servers running in parallel to support real-time log analysis. “The system is very scalable,” Emruli confirms.
This is not an ‘out of the box’ solution. Rather, Elasticsearch is just the motor that powers a new vehicle: in order to run properly, it needs other components, such as bodywork, brakes, and a steering wheel. Aside from the actual integration into the peripheral systems, usability plays a huge part. The goal was not just to access data in real time, but also to visualize it in a clear and appealing way. The data is presented in a freely configurable dashboard with a traffic light system. The users decide for themselves which values they wish to monitor and which limits will set off an alarm. Far more variables come into play here than just the number of payments.
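To make the traffic light idea concrete, here is a minimal sketch of how user-defined limits could map a measured value to a color. The metric names, the limit values, and the two-threshold scheme (`warn` and `alarm`) are assumptions made for illustration; the article does not describe how the actual dashboard evaluates its limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    warn: float   # at or above this value the light turns yellow
    alarm: float  # at or above this value the light turns red


def traffic_light(value: float, limits: Limits) -> str:
    """Map a measured value to a traffic light color using user-defined limits."""
    if value >= limits.alarm:
        return "red"
    if value >= limits.warn:
        return "yellow"
    return "green"


# Hypothetical user configuration: which values to watch and where the limits lie.
user_limits = {
    "open_payments": Limits(warn=5_000, alarm=20_000),
    "seconds_since_last_event": Limits(warn=2, alarm=10),
}

# Hypothetical current readings.
readings = {"open_payments": 7_200, "seconds_since_last_event": 0.4}

dashboard = {
    name: traffic_light(readings[name], limits)
    for name, limits in user_limits.items()
}
```

Keeping the limits in plain configuration data rather than in code is what would let each user choose their own values without a new release.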
1,700 events every second
Every payment generates multiple log events, which add up to around 50 million per day. These log events represent the individual stages in the payment process (message input, verification, booking of payment, message output, etc.). Depending on where a problem occurs, different strategies can be applied to solve it, even though the immediate consequences are very similar. Problems in payments often manifest themselves in the form of backlogs, i.e. a number of uncompleted payments. If there is a snag at some point, this number can quickly explode. No surprise with an average of 1,700 log events every second! This is why it is important to be able to immediately see where the payments are getting stuck.
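One way to picture "seeing where the payments are getting stuck" is to track the furthest stage each payment has reached and count the payments that have not yet reached the final stage. The sketch below assumes a simplified, strictly ordered list of four stages based on the examples named above and a hypothetical event format of `(payment_id, stage)` pairs; the real event structure is not described in the article.

```python
from collections import Counter

# Simplified stage sequence, taken from the examples named in the text.
STAGES = ["message_input", "verification", "booking", "message_output"]
STAGE_ORDER = {stage: index for index, stage in enumerate(STAGES)}


def backlog_by_stage(events):
    """Count uncompleted payments, grouped by the furthest stage they reached."""
    furthest = {}
    for payment_id, stage in events:
        current = furthest.get(payment_id)
        if current is None or STAGE_ORDER[stage] > STAGE_ORDER[current]:
            furthest[payment_id] = stage
    # A payment is complete once it has reached the last stage.
    return Counter(stage for stage in furthest.values() if stage != STAGES[-1])


# Hypothetical log events: p1 is complete, p2 and p3 stop after verification,
# p4 has only been received.
events = [
    ("p1", "message_input"), ("p1", "verification"),
    ("p1", "booking"), ("p1", "message_output"),
    ("p2", "message_input"), ("p2", "verification"),
    ("p3", "message_input"), ("p3", "verification"),
    ("p4", "message_input"),
]

backlog = backlog_by_stage(events)
```

In this toy data, two payments have passed verification but have not been booked, which would point the user toward the booking step rather than toward message input.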
But even identifying the relevant events is just the first step. Numerous other variables later come into play. This also comes as no surprise, given that there are literally hundreds of different banking systems involved. Is it down to the infrastructure, perhaps? Is there a geographically isolated network outage that has taken down part of the system along with it? Are banks sending incorrect payment data? Is the data completely wrong, or perhaps only in part? The former would indicate a failed update, whereas the latter makes a different type of transmission problem more probable.
The values to be monitored can be freely defined using the dashboard. The user can then click through KPI readouts to look deeper into the system. This, too, is a key element of the new system. The monitoring is not static, and access is not limited to the predefined KPIs. On the contrary, these are just starting points. The new system therefore has an exploratory element, something that has only been available as a standard solution for a few years. “Monitoring solutions used to think in predefined patterns. A red light meant that a previously defined problem had been identified. One had to disregard the fact that interaction between complex systems can also produce unforeseeable problems. With our system, the user can go searching through unstructured data,” Emruli explains.
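As an illustration of what such a drill-down could look like in Elasticsearch terms, the sketch below builds a search request body in the Elasticsearch Query DSL: it filters recent events at one stage and groups them by participant. The `bool`, `term`, and `range` queries and the `terms` aggregation are standard Query DSL constructs, but the field names (`stage`, `participant_id`, `@timestamp`) and the function itself are assumptions, since the article does not describe the actual index schema or queries.

```python
import json


def events_by_participant_query(stage: str, since: str = "now-5m", top_n: int = 10) -> dict:
    """Build a search body: recent events at one stage, grouped by participant."""
    return {
        "size": 0,  # only the aggregation is of interest, not individual hits
        "query": {
            "bool": {
                "filter": [
                    {"term": {"stage": stage}},                # hypothetical field
                    {"range": {"@timestamp": {"gte": since}}},  # date math, e.g. "now-5m"
                ]
            }
        },
        "aggs": {
            "by_participant": {
                "terms": {"field": "participant_id", "size": top_n}  # hypothetical field
            }
        },
    }


query_body = events_by_participant_query("verification")
query_json = json.dumps(query_body, indent=2)
```

The resulting JSON could be sent to a search endpoint. Changing the stage, the time window, or the grouping field is what makes the exploration open-ended rather than tied to predefined KPIs.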
Agile, with a clear focus on the objective
This project was carried out using agile development methods. “SIX knew exactly what they wanted. But we had a lot of freedom in how we went about achieving that objective,” says Emruli. “The core requirement, i.e. being able to show every event within two seconds, was the focus of the first sprint. There was a new release every two weeks. After six sprints, we had a minimum viable product.” This strategy made it easy to establish early on whether the core requirement could be successfully implemented. Feedback was also obtained from the business side with each additional sprint to ensure that the project was on the right track and to identify where things might need adjustment.
This is how mimacom managed to build an entirely new monitoring system in twelve weeks, one that fits seamlessly into a large project in which seven other teams were involved in addition to mimacom. The first components of the new payment system went live in April 2015 in the form of euro payments. Payments in Swiss francs are set to follow at the start of 2016.