Created on 04-09-2016 04:02 AM - edited 08-17-2019 12:52 PM
Apache Metron vs. OpenSoc
Apache Metron inherits the advantages of OpenSoc, which
enables fast processing of events from a variety of sources. One of its aims is to
overcome the shortcomings of OpenSoc. The main challenges of OpenSoc are:
Does not take full advantage of parallelism. For
each topic, the enrichment topology contains a set of bolts that run
serially. This architecture does not exploit the parallelism that
Storm can provide, as can be seen in the OpenSoc architecture in Figure 1.
Hard to extend. Adding a new data source requires
creating a new topology for it.
Hard to maintain. As the number of data sources
increases, so does the number of “redundant” topologies. Much of the bolt logic in the enrichment
topologies is similar; however, due to the architecture of OpenSoc, each topic
has to maintain its own copy of that code. If logic shared across topics
needs to change, it has to be modified in multiple places, which increases
the cost and complexity of maintenance.
Lack of testing. There is no unit or
integration testing within OpenSoc, so the quality of the code has not been fully
tested.
With the new Metron architecture, as shown in Figure 2, the
intent of Metron is to achieve better extensibility, better maintainability, and
better performance.
Metron can add new data source
parsers without writing code, by using the Grok Framework
Parser. With the Grok Framework Parser, a pattern file
describes the new data source, so Metron can easily be extended to accept new data
sources.
By introducing an additional normalizing topology layer, as shown in Figure 2, Metron
abstracts the common features of data source ingestion, and thus removes the need
to implement an individual topology for each data source.
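The idea of describing a new data source with a pattern rather than with new parser code can be sketched in Python. This is not Metron's Grok Framework Parser: the pattern names, the `compile_grok` helper, and the sample firewall line are all assumptions made for illustration, and a real Grok pattern library is much larger.

```python
import re

# Hypothetical pattern library: reusable named regex fragments,
# similar in spirit to the entries of a Grok pattern file.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "INT": r"\d+",
    "WORD": r"\w+",
}

def compile_grok(expression):
    """Expand %{NAME:field} references into named regex groups."""
    def expand(match):
        name, field = match.group(1), match.group(2)
        return "(?P<%s>%s)" % (field, PATTERNS[name])
    return re.compile(re.sub(r"%\{(\w+):(\w+)\}", expand, expression))

def parse(compiled, line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    m = compiled.match(line)
    return m.groupdict() if m else None

# A new (made-up) data source is described by one pattern string.
FIREWALL = compile_grok(
    r"%{IP:src_ip} -> %{IP:dst_ip} port %{INT:dst_port} %{WORD:action}"
)

print(parse(FIREWALL, "10.0.0.1 -> 192.168.1.5 port 443 ALLOW"))
```

In this sketch, supporting another source means adding another pattern string, while `compile_grok` and `parse` stay unchanged.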
Metron converts all Storm topologies to use Flux configuration (a declarative way to
wire topologies together).
Introducing unit and integration testing frameworks will reduce the cost of maintenance,
improve the quality of the code, and reduce errors at runtime.
By moving most configuration information into ZooKeeper, Metron removes the need
to stop the topologies in most cases. When the configuration of a
topology changes, there is no need to restart it, so the running topology is not impacted.
As shown in Figure 2, by introducing the splitter/joiner pattern in Storm, Metron is
able to run multiple enrichment bolts and threat intel cross-reference bolts
in parallel, which is expected to improve performance.
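The splitter/joiner idea can be illustrated outside of Storm with a small Python sketch. It uses a thread pool instead of Storm bolts, and the three enrichment functions, their field names, and the lookup sets are hypothetical stand-ins, not Metron code.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical enrichments; each returns extra fields for the same message.
def zone_enrichment(message):
    zone = "internal" if message["src_ip"].startswith("10.") else "external"
    return {"zone": zone}

def host_enrichment(message):
    return {"host_known": message["src_ip"] in {"10.0.0.1"}}

def threat_intel(message):
    return {"is_alert": message["dst_ip"] in {"203.0.113.7"}}

ENRICHMENTS = [zone_enrichment, host_enrichment, threat_intel]

def split_join(message, enrichments=ENRICHMENTS):
    """Split: run every enrichment on the message concurrently.
    Join: merge the partial results into one enriched message."""
    with ThreadPoolExecutor(max_workers=len(enrichments)) as pool:
        partials = list(pool.map(lambda fn: fn(message), enrichments))
    enriched = dict(message)
    for partial in partials:
        enriched.update(partial)
    return enriched

print(split_join({"src_ip": "10.0.0.1", "dst_ip": "203.0.113.7"}))
```

Because the enrichments do not depend on each other, the time per message is bounded by the slowest enrichment rather than by the sum of all of them, which is the benefit the serial OpenSoc layout misses.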
By introducing a cache mechanism, Metron is able to keep the most-used cross-reference
information for enrichment and threat intel in memory, which in turn
reduces the cost of accessing data from HBase and MySQL.
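The effect of such a cache can be sketched in Python with `functools.lru_cache`. The in-memory dictionary stands in for a slow HBase or MySQL lookup, and least-recently-used eviction is an assumption chosen for the example; the article does not say which eviction policy Metron uses.

```python
from functools import lru_cache

# Counts how often the (simulated) backing store is actually queried.
BACKEND_CALLS = {"count": 0}

# Stand-in for an external store such as HBase or MySQL (hypothetical data).
FAKE_STORE = {"10.0.0.1": "web-server-01", "10.0.0.2": "db-server-01"}

def slow_lookup(ip):
    BACKEND_CALLS["count"] += 1
    return FAKE_STORE.get(ip, "unknown")

@lru_cache(maxsize=1024)
def cached_lookup(ip):
    """Serve repeated keys from memory; only cache misses reach the store."""
    return slow_lookup(ip)

for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]:
    cached_lookup(ip)

# Four lookups were made, but the store was queried once per distinct key.
print(BACKEND_CALLS["count"])
print(cached_lookup.cache_info())
```

The `maxsize` bound keeps memory use fixed, so only recently used keys stay in memory while rarely used ones fall back to the store.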
The new architecture that Metron has adopted will significantly
reduce the cost of adding new sources, while maintaining the high performance that
OpenSoc has. With the support of the open source community, Metron is evolving quickly
and is adding more features at a fast pace.