25. April 2014 by Markus
This new release comes with some long-awaited ergonomic improvements, but most of the effort went into making the core processing engine even more reliable and efficient.
As has already been highlighted in previous posts, Retrospective is not trying to compete with full-blown SIEM (Security Information and Event Management) products. SIEMs are of course powerful, but their provisioning, configuration and maintenance are tedious and time-consuming. Retrospective can be configured in minutes, and no agent-like software has to be installed on the servers where log files are to be processed. Moreover, besides avoiding any agent-like components, Retrospective guarantees that not a single file (even in the user home directory or /tmp) will be modified on the server. We can therefore say without hesitation that Retrospective is a very lightweight solution, which allows the user to peek into remote servers in an entirely transparent manner.
So we know that Retrospective is definitely lightweight, but what about its power? Providing powerful features under the constraints just described is certainly not easy. Let’s go over the technical challenges that had to be overcome:
The Retrospective logic for splitting data into log entries, parsing date/time information and applying filters should be performed on the server side because: 1. Transferring whole log files to Retrospective would consume too much time and bandwidth (especially when connecting to servers over the Internet). 2. It is more efficient to process log files using the available server-side resources.
No special tooling should be required on the server side. Only the flagship *NIX tools such as grep and awk can be used. Additionally, the following should be noted: 1. Flagship *NIX tools on different operating systems such as Linux, Solaris, AIX, HP-UX, FreeBSD or Mac OS X can work slightly differently in terms of both functionality and performance. For example, grep on FreeBSD is slow as hell :P 2. Invoking these tools requires scripting logic. The possible shells (sh, dash, bash, ksh, zsh, csh) have different features, and the script syntax sometimes differs quite significantly.
No temporary state can be written to a file on the server side. Everything has to be placed in a pipeline which processes the data, as it goes, in the most functional and efficient manner possible.
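To make the three constraints above more concrete, here is a minimal sketch in Python of how a client could assemble a single server-side pipeline that uses only grep and awk and writes nothing to disk. This is not the script Retrospective actually runs; the function name build_remote_search, the host name and the exact grep/awk options are assumptions chosen for illustration.

```python
import shlex


def build_remote_search(host, log_path, pattern):
    """Return an argv list for ssh that runs a grep | awk pipeline remotely.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    # awk program: turn grep's "N:line" output into "N<TAB>line".
    awk_program = '{ print $1 "\\t" substr($0, length($1) + 2) }'

    # Quote every value so the remote shell treats it as data, not syntax.
    # Everything stays in one pipeline; no temporary file is created.
    remote = "grep -n -e {pat} -- {path} | awk -F: {prog}".format(
        pat=shlex.quote(pattern),
        path=shlex.quote(log_path),
        prog=shlex.quote(awk_program),
    )
    return ["ssh", host, remote]


if __name__ == "__main__":
    # Example host and path are made up.
    print(build_remote_search("logs.example.com", "/var/log/app.log", "ERROR"))
```

The resulting list could be handed to subprocess.run, so that only the matching lines travel back over the SSH connection. Differences between shells and tool flavors, which the real scripts have to handle, are deliberately left out of this sketch.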
In order to overcome the above-mentioned challenges, we have come up with the following architectural setup which presents the logic related to searching.
As can be seen, the processing of log data is divided among separate worker threads, and each worker is concerned with a single file. Processing of local log data differs from server-side processing, which is performed by a specially crafted SSH command pipeline implemented by means of advanced POSIX shell scripting. The whole approach guarantees the following benefits:
Resources (CPU, memory) of the node on which Retrospective is launched are not consumed.
Different OSes, shells and tool flavors (POSIX, GNU, Solaris, other) are supported.
When a file is searched with given search criteria, only the data matching those criteria is actually transferred over the network to Retrospective.
The above is also true for date/time filtering. In this case, a sophisticated polymorphic script executed as SSH commands is able to interpret and correctly filter dates in all possible formats and locales.
In many cases, the Retrospective scripts used for searching are faster than the regular *NIX tools (awk, grep) thanks to special optimizations.
For profiles with many data sources and files, Retrospective can, thanks to parallel file processing, be compared to multiple instances of grep and awk executed simultaneously. This results in a significant performance boost.
By taking an adaptive approach, Retrospective exploits the resources of the remote server in an optimal manner. If servers respond quickly, more simultaneous search/monitor SSH commands are allowed. If servers respond slowly, the number of simultaneous SSH commands is reduced. In the end, we get the results as quickly as the servers allow.
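The adaptive behaviour described in the last point can be illustrated with a small Python sketch. It uses a generic additive-increase / multiplicative-decrease rule, which is an assumption on our side for the sake of the example and not necessarily the rule Retrospective applies; the class name AdaptiveLimit and the one-second threshold are likewise hypothetical.

```python
class AdaptiveLimit:
    """Tracks how many SSH commands may run at the same time.

    Illustrative only: fast responses raise the limit by one,
    slow responses halve it, within fixed bounds.
    """

    def __init__(self, start=2, minimum=1, maximum=16, slow_after=1.0):
        self.limit = start
        self.minimum = minimum
        self.maximum = maximum
        self.slow_after = slow_after  # seconds; assumed threshold

    def record(self, response_seconds):
        """Update the limit from one observed response time and return it."""
        if response_seconds > self.slow_after:
            # Server is struggling: back off quickly.
            self.limit = max(self.minimum, self.limit // 2)
        else:
            # Server is responsive: probe for more parallelism.
            self.limit = min(self.maximum, self.limit + 1)
        return self.limit


if __name__ == "__main__":
    limit = AdaptiveLimit(start=2, maximum=4)
    for seconds in (0.1, 0.1, 0.1, 5.0, 5.0):
        print(seconds, "->", limit.record(seconds))
```

A scheduler could consult the current limit before starting the next per-file worker, so that a slow server is never asked to run more commands than it can keep up with.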
By meeting these challenges with a solid piece of software engineering, we have ensured that the latest Retrospective release is still lightweight but definitely a more powerful tool, whose functionality and performance can easily compete with some of the sophisticated SIEM components.