Created on 04-06-2016 01:33 AM
One of the largest hurdles we have heard about from the community and customers working with the original OpenSOC code base was that it was nearly impossible to get the application up and running. To address this, our engineering team collaborated with the community to provide a scripted, automated install of Metron on AWS.
The install requires only the user’s AWS credentials, a set of Ansible scripts/playbooks, Ambari Blueprints and APIs, and AWS APIs to deploy the full end-to-end Metron application. The table below summarizes the steps that occur during the automated install.
|Step 1||Spin up EC2 instances where HDP and Metron will be installed and deployed|
|Step 2||Spin up an AWS VPC|
|Step 3||Install Ambari Server and Agents via Ansible Scripts|
|Step 4||Using Ambari Blueprints and APIs, install a 7-node HDP 2.3 cluster with the following services: HDFS, YARN, ZooKeeper, Storm, HBase, and Kafka. The blueprint used to deploy the HDP cluster can be found here: Metron Small Cluster Ambari BluePrint|
|Step 5||Install a 2-node Elasticsearch cluster|
|Step 6||Install and start the following data source probes: Bro, Snort, PCAP probe, YAF (netflow). This entails the following:|
|Step 7||Deployment of 5 Metron Storm Topologies:|
|Step 8||Configuration of Kafka Topics and HBase Tables|
|Step 9||Install MySQL to store GeoIP enrichment data. The MySQL DB will be populated with GeoIP information from MaxMind GeoLite|
|Step 10||Installation of a Metron UI for the SOC Analyst and Investigator persona.|
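Step 4 can be pictured as two REST calls: register a blueprint, then create a cluster from it. The following is a minimal Python sketch of the payloads, assuming Ambari's `POST /api/v1/blueprints/<name>` and `POST /api/v1/clusters/<name>` endpoints. The server URL, blueprint name, host names, and the two-group layout are hypothetical and much smaller than the actual Metron Small Cluster blueprint. Nothing is sent over the network.

```python
import json

AMBARI_URL = "http://ambari.example.com:8080"  # hypothetical Ambari server


def build_blueprint(stack_version="2.3"):
    """Build a toy blueprint with one master and one worker host group."""
    return {
        "Blueprints": {"stack_name": "HDP", "stack_version": stack_version},
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                "components": [
                    {"name": "NAMENODE"},
                    {"name": "RESOURCEMANAGER"},
                    {"name": "ZOOKEEPER_SERVER"},
                ],
            },
            {
                "name": "worker",
                "cardinality": "1+",
                "components": [
                    {"name": "DATANODE"},
                    {"name": "NODEMANAGER"},
                    {"name": "KAFKA_BROKER"},
                ],
            },
        ],
    }


def build_cluster_template(blueprint_name, hosts_by_group):
    """Map concrete hosts onto the blueprint's host groups."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": group, "hosts": [{"fqdn": host} for host in hosts]}
            for group, hosts in hosts_by_group.items()
        ],
    }


def planned_requests(blueprint_name, cluster_name, hosts_by_group):
    """Return (method, url, body) tuples describing the two calls.

    A real client would also send Ambari's X-Requested-By header
    and credentials with each request.
    """
    template = build_cluster_template(blueprint_name, hosts_by_group)
    return [
        ("POST", f"{AMBARI_URL}/api/v1/blueprints/{blueprint_name}",
         json.dumps(build_blueprint())),
        ("POST", f"{AMBARI_URL}/api/v1/clusters/{cluster_name}",
         json.dumps(template)),
    ]
```

Keeping the payload construction separate from the HTTP transport makes it easy to inspect or version the blueprint before anything is deployed.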
The installer will take about 60-90 minutes to execute fully. However, it could vary drastically based on how AWS is feeling during the execution. After the installer finishes, the deployment architecture of the app will look like the following.
Another area of focus for Metron TP1 was addressing the following challenges with the old OpenSOC topology architecture:
The key re-architecture and refactoring work done in TP1 to address these challenges was the following:
In the old OpenSOC architecture, some key limitations were the following:
The below diagram illustrates the old architecture.
With the new Metron Architecture, the key changes are:
The below diagram illustrates the new architecture.
PCAP represents the most granular data collected in Metron, consisting of individual packets and frames. Metron uses DPDK (Data Plane Development Kit), which provides a set of libraries and drivers for fast packet collection and processing.
See the following for more details: Metron Packet Capture Probe Design
Netflow data represents PCAP data rolled up to the flow/session level: a summary of the sequence of packets between two machines up to the layer 4 protocol. If one doesn’t want to ingest PCAP due to space constraints and the load exerted on infrastructure, then netflow is recommended. Metron uses YAF (Yet Another Flowmeter) to generate IPFIX (Netflow) data from Metron’s PCAP probe. Hence the output of the YAF probe is IPFIX instead of raw packets.
See the following for more details: Metron YAF Capture Design
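To make the "roll up" from packets to flows concrete, here is a minimal Python sketch that groups packets by their 5-tuple and summarizes each group. It is an illustration of the idea only, not how YAF is implemented: real flow meters also handle bidirectional flows and idle/active timeouts, and the `Packet` fields and output keys here are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    ts: float       # capture timestamp in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str   # layer 4 protocol, e.g. "TCP" or "UDP"
    length: int     # bytes on the wire


def rollup_to_flows(packets):
    """Group packets by 5-tuple and summarize each group as one flow record."""
    groups = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        groups[key].append(p)

    flows = []
    for (src_ip, dst_ip, src_port, dst_port, protocol), pkts in groups.items():
        flows.append({
            "src_ip": src_ip,
            "dst_ip": dst_ip,
            "src_port": src_port,
            "dst_port": dst_port,
            "protocol": protocol,
            "packets": len(pkts),
            "bytes": sum(p.length for p in pkts),
            "start": min(p.ts for p in pkts),
            "end": max(p.ts for p in pkts),
        })
    return flows
```

The space saving comes from the fact that a flow record has a fixed size regardless of how many packets it summarizes.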
Bro is an IDS (Intrusion Detection System), but Metron uses Bro primarily as a Deep Packet Inspection (DPI) metadata generator. The metadata consists of network activity details up to layer 7, the application-level protocol (DNS, HTTP, FTP, SSH, SSL). Extracting DPI metadata (layer 7 visibility) is expensive and is therefore performed only on selected protocols; the recommendation is to turn on DPI for the HTTP and DNS protocols. So while the PCAP probe records every single packet it sees on the wire, the DPI metadata is extracted only for a subset of these packets. This metadata is among the most valuable network data for analytics.
See the following for more details: Metron Bro Capture Design
Snort is a popular Network Intrusion Prevention System (NIPS). Snort monitors network traffic and produces alerts based on signatures from community rules. Metron plays the output of the packet capture probe to Snort, and whenever Snort alerts are triggered, Metron uses Apache Flume to pipe these alerts to a Kafka topic.
See the following for more details: Metron Snort Capture Design
A common question is why we focused first on this initial set of network telemetry data sources. Keep in mind that the end vision of Apache Metron is to be an analytics platform. These 4 network telemetry data sources are among the key inputs required for the next-generation ML, MLP and statistical models that we are planning to build in future releases. The table below describes some of these models and their data input requirements.
|Analytics Pack||Analytics Pack Description||Telemetry Data Source|
|Domain Pack||A collection of Machine Learning models that identify anomalies for incoming and outgoing connections made to a specific domain that appear to be malicious|
|UEBA Pack||A collection of Machine Learning models that monitor assets and users known to be legitimate to identify anomalies from their normal behavior.|
|Relevancy/Correlation Engine Pack||A collection of Machine Learning models that identify alerts that are related within the massive volumes of alerts being processed by the cyber solutions.|
|Protocol Anomaly Pack||A collection of Machine Learning models that identify whether there is anything unusual about network traffic monitored via deep packet inspection (PCAP)|
The system is configurable so that one can enable only the data sources of interest.
In future Metron tech previews, we will be adding support for these types of security data sources:
The below diagram illustrates the Enrichment framework that was built in Metron TP1. The key components of the framework are:
The specific enrichments supported in Metron TP1 are listed below.
|Enrichment||Description||Enrichment Source, Store, Loader Type, Refresh Rate||Metron Message Field Name that will be Enriched|
|GeoIP||Tags GeoIP information (lat-lon coordinates plus city/state/country) onto any external IP address. This can be applied both to alerts and to metadata telemetries to map them to a geo location.|
|Host||Enriches IP with Host details||dest_ip|
More details can be found here: Metron Enrichment Services
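The enrichment idea in the table above (look up a message field in a store and attach the result to the message) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Metron's implementation: the in-memory `HOST_STORE`, the sample host details, and the `enrichments.<name>.<field>.<key>` output naming are all hypothetical. Only the `dest_ip` field name comes from the table.

```python
# Hypothetical in-memory stand-in for the host enrichment store.
HOST_STORE = {
    "10.0.0.5": {"hostname": "web01", "asset_value": "important"},
}


def enrich(message, field, store, name):
    """Return a copy of `message` with details for message[field] attached.

    Messages whose field is missing, or whose value is not in the store,
    are returned unchanged (as a copy).
    """
    enriched = dict(message)
    details = store.get(message.get(field))
    if details:
        for key, value in details.items():
            enriched[f"enrichments.{name}.{field}.{key}"] = value
    return enriched


if __name__ == "__main__":
    msg = {"src_ip": "192.168.1.20", "dest_ip": "10.0.0.5", "protocol": "HTTP"}
    print(enrich(msg, "dest_ip", HOST_STORE, "host"))
```

Returning a copy rather than mutating the input keeps each enrichment step independent, so several enrichments can be chained or run in any order.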
The Threat Intel framework is very similar to the Enrichment framework. See the architecture diagram below.
The specific threat intel services supported in TP1 are listed below.
|Threat Feed||Feed Description||Feed Format||Refresh Rate|
|Soltra||Threat Intel Aggregator||Stix/Taxii||Poll every 5 minutes|
|Hail a TAXII||Repository of open source cyber threat intelligence feeds in STIX format.|| ||Poll every 5 minutes|
More details can be found here: Metron Threat Intel Services
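As a rough illustration of how a periodically polled feed can be used to tag telemetry, the Python sketch below caches a set of indicators, refreshes it once the 5-minute poll interval has elapsed, and flags messages whose IP fields match. Every name here is hypothetical (`ThreatFeedCache`, `tag_threats`, the `is_alert` and `threat_intel_hits` keys), and the `fetch` callable stands in for a real STIX/TAXII client, which is not shown.

```python
import time


class ThreatFeedCache:
    """Cache feed indicators and re-fetch them after `refresh_seconds`."""

    def __init__(self, fetch, refresh_seconds=300, clock=time.monotonic):
        self._fetch = fetch              # callable returning an iterable of indicators
        self._refresh = refresh_seconds  # 300 s matches "poll every 5 minutes"
        self._clock = clock              # injectable so the cache can be tested
        self._indicators = set()
        self._last_fetch = None

    def indicators(self):
        now = self._clock()
        if self._last_fetch is None or now - self._last_fetch >= self._refresh:
            self._indicators = set(self._fetch())
            self._last_fetch = now
        return self._indicators


def tag_threats(message, cache, fields=("src_ip", "dest_ip")):
    """Return a copy of `message`, flagged if any field matches an indicator."""
    known_bad = cache.indicators()
    hits = [f for f in fields if message.get(f) in known_bad]
    tagged = dict(message)
    if hits:
        tagged["is_alert"] = True
        tagged["threat_intel_hits"] = hits
    return tagged
```

Polling lazily on lookup, rather than on a background timer, keeps the sketch single-threaded; a production loader would more likely refresh the store out of band.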