Member since
08-31-2015
81
Posts
115
Kudos Received
17
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
3013 | 03-22-2017 03:51 PM | |
1838 | 05-04-2016 09:34 AM | |
1429 | 03-24-2016 03:07 PM | |
1585 | 03-24-2016 02:54 PM | |
1511 | 03-24-2016 02:47 PM |
04-27-2016
09:22 AM
Good question @Matt McKnight. We will have support for Solr indexing services in Metron TP2, which is slated for the end of May. However, in TP2 we will still only support the Metron UI that is based on Kibana (which is based on Elastic). This will change in subsequent releases. So net net, by the middle/end of May we will support Solr indexing, but you would have to write the UI that calls the Solr APIs for search queries. Further down the line, we will provide a custom UI (away from Kibana) that uses Solr to do search. Make sense?
04-13-2016
02:07 PM
Good feedback @Hakan Akansel. I updated the article to be more clear on where the event gets persisted.
04-13-2016
02:01 PM
2 Kudos
@nbalaji-elangovan. This error would indicate that you might not have built all the projects via Maven. Can you make sure you ran mvn package -DskipTests from the incubator-metron-Metron_0.1BETA_rc7/metron-streaming directory?
04-13-2016
01:57 PM
@rmckissick Please send the full ansible.log file located in incubator-metron-Metron_0.1BETA_rc7/deployment/amazon-ec2. When you send the ansible.log, please sanitize any EC2 instance names; you don't want to publish those to the entire community.
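One way to do that sanitizing before posting is a small script along these lines. This is only a sketch: the two regular expressions (public EC2 DNS names and bare IPv4 addresses), the placeholder strings, and the function names are assumptions for illustration, not part of the Metron tooling, so review the output by hand before sharing it.

```python
import re

# Assumed patterns: public EC2 DNS names such as
# ec2-52-1-2-3.us-west-2.compute.amazonaws.com, and bare IPv4 addresses.
EC2_HOST = re.compile(r"\bec2-\d{1,3}(?:-\d{1,3}){3}\.[A-Za-z0-9.-]*amazonaws\.com\b")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize_line(line: str) -> str:
    """Replace EC2 host names and IPv4 addresses with placeholders."""
    line = EC2_HOST.sub("[obfuscated_host]", line)
    # Note: this also masks dotted version strings such as 2.1.2.1,
    # which is over-cautious but harmless for a shared log.
    return IPV4.sub("[obfuscated_ip]", line)


def sanitize_file(src_path: str, dst_path: str) -> None:
    """Write a sanitized copy of src_path to dst_path, line by line."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(sanitize_line(line))
```

For example, `sanitize_file("ansible.log", "ansible.sanitized.log")` would leave the original log untouched and produce a copy that is safer to attach.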
04-12-2016
01:16 PM
I ran into the following error when following these instructions: 2016-04-12 05:42:59,328 p=2472 u=gvetticaden | fatal: [obfuscated_ip]: UNREACHABLE! => {"changed": false, "msg": "SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue", "unreachable": true} To fix this issue, see the following thread: https://community.hortonworks.com/questions/24344/aws-unreachable-error-when-executing-metron-instal.html
04-06-2016
01:33 AM
7 Kudos
Platform Theme Key Features

Fully Automated Scripted Install of Metron on AWS

One of the largest hurdles we have heard about from the community and customers working with the original OpenSoc code base was that it was nearly impossible to get the application up and running. Hence, our engineering team collaborated with the community to provide a scripted, automated install of Metron on AWS. The install only requires the user’s AWS credentials, a set of Ansible scripts/playbooks, Ambari Blueprints / APIs, and AWS APIs to deploy the full end-to-end Metron application. The table below summarizes the steps that occur during the automated install.

Step | Description | Components Deployed |
---|---|---|
Step 1 | Spin up EC2 instances where HDP and Metron will be installed and deployed | 10 m4.xlarge instances |
Step 2 | Spin up an AWS VPC | 1 AWS VPC |
Step 3 | Install Ambari Server and Agents via Ansible scripts | Ambari Server 2.1.2.1 on master node; Ambari Agents on slave nodes |
Step 4 | Using Ambari Blueprints and APIs, install a 7 node HDP 2.3 cluster with the following services: HDFS, YARN, Zookeeper, Storm, HBase, and Kafka. The blueprint used to deploy the HDP cluster can be found here: Metron Small Cluster Ambari BluePrint | 7 node HDP cluster; HDP services: HDFS, YARN, Zookeeper, Storm, HBase & Kafka |
Step 5 | Install 2 node Elasticsearch cluster | 2 node ES 1.7 cluster |
Step 6 | Install and start the following data source probes: Bro, Snort, PCAP probe, YAF (netflow). This entails the following: install and start the C++ PCAP probe that captures PCAP data and pushes it into a Kafka topic; install and start the YAF probe to capture netflow data; install Bro and the Kafka Bro plugin and start these services; install and start Snort with community Snort rules configured | C++ PCAP probe; YAF/Netflow probe; Bro server and Bro Kafka plugin; Snort server |
Step 7 | Deploy 5 Metron Storm topologies: 4 parser topologies, one for each data source supported (PCAP, Bro, YAF, Snort), and 1 common enrichment topology | Install and deployment of 5 Storm topologies |
Step 8 | Configure Kafka topics and HBase tables | |
Step 9 | Install MySQL to store GeoIP enrichment data. The MySQL DB will be populated with GeoIP information from Maxmind Geolite | Install of MySQL with GeoIP information |
Step 10 | Install a Metron UI for the SOC Analyst and Investigator personas | Metron UI (Kibana Dashboard) |

Deployment Architecture After Install

The installer will take about 60-90 minutes to execute fully. However, it could vary drastically based on how AWS is feeling during the execution. After the installer finishes, the deployment architecture of the app will look like the following.

Metron Storm Topology Refactor / Re-Architecture

Another area of focus for Metron TP1 was to address the following challenges with the old OpenSoc topology architecture:
- Code was extremely brittle
- Storm topologies were designed without taking advantage of full parallelism
- Numerous “redundant” topologies
- Management of the app was difficult due to a number of complex topologies
- Very complex to add new data sources to the platform
- Very little unit and integration testing

Some key re-architecture and refactor work done in TP1 to address these challenges was the following:
- Made the Metron code base simpler and easier to maintain by converting all Storm topologies to use Flux configuration (a declarative way to wire topologies together).
- Ability to add new data source parsers without writing code, using the Grok Framework parser.
- Enrichment, model and threat intel cross reference are now done in parallel as opposed to sequentially in the Storm configuration.
- Minimized the incremental costs of adding new topologies by having one common enrichment topology for all data sources.
- All app configuration is stored in Zookeeper, allowing one to manage app config at runtime without stopping the topology.
- Improved code with new unit and integration test harness utilities.

Old OpenSoc Architecture

In the old OpenSoc architecture, some key limitations were the following:
- For every new data source, a new complex Storm topology had to be added
- Each enrichment, threat intel reference and model execution was done sequentially
- No in-memory caching for enrichments or threat intel checks
- No loader frameworks to load Enrichment or Threat Intel stores

The below diagram illustrates the old architecture.

New Metron Architecture

With the new Metron architecture, the key changes are:
- Adding a new data source means simply adding a new normalizing/parser topology
- 1 common enrichment topology can be used for all data sources
- Using the Splitter/Joiner pattern, enrichments/models/threat intel execution is done in parallel
- Loader frameworks have been added to load the Enrichment and Threat Intel Stores
- Fast Cache has been added for enrichment and threat intel look ups

The below diagram illustrates the new architecture.

Telemetry Data Source Theme Key Features

PCAP - Packet Capture

PCAP represents the most granular data collected in Metron, consisting of individual packets and frames. Metron uses DPDK, which provides a set of libraries and drivers for fast packet collection and processing. See the following for more details: Metron Packet Capture Probe Design

YAF/Netflow

Netflow data represents PCAP data rolled up to the flow/session level: a summary of the sequence of packets between two machines up to the layer 4 protocol. If one doesn’t want to ingest PCAP due to space constraints and the load exerted on infrastructure, then netflow is recommended. Metron uses YAF (Yet Another Flowmeter) to generate IPFIX (Netflow) data from Metron's PCAP probe. Hence the output of the YAF probe is IPFIX instead of the raw packets. See the following for more details: Metron YAF Capture Design

Bro

Bro is an IDS (Intrusion Detection System), but Metron uses Bro primarily as a Deep Packet Inspection (DPI) metadata generator. The metadata consists of network activity details up to layer 7, which is the application level protocol (DNS, HTTP, FTP, SSH, SSL). Extracting DPI metadata (layer 7 visibility) is expensive and thus is performed only on selected protocols. Hence, the recommendation is to turn on DPI for the HTTP and DNS protocols. So while the PCAP probe records every single packet it sees on the wire, the DPI metadata is extracted only for a subset of these packets. This metadata is some of the most valuable network data for analytics. See the following for more details: Metron Bro Capture Design

Snort

Snort is a popular Network Intrusion Prevention System (NIPS). Snort monitors network traffic and produces alerts that are generated based on signatures from community rules. Metron plays the output of the packet capture probe to Snort, and whenever Snort alerts are triggered Metron uses Apache Flume to pipe these alerts to a Kafka topic. See the following for more details: Metron Snort Capture Design

Why are these Network Telemetry Sources Important?

A common question is why we focused first on this initial set of network telemetry data sources. Keep in mind that the end vision of Apache Metron is to be an analytics platform. These 4 network telemetry data sources are some of the key data sources required for some of the next generation ML, MLP and statistical models that we are planning to build in future releases. The below table describes some of these models and the data input requirements.

Analytics Pack | Analytics Pack Description | Telemetry Data Source Required |
---|---|---|
Domain Pack | A collection of Machine Learning models that identify anomalies for incoming and outgoing connections made to a specific domain that appear to be malicious | Bro |
UEBA Pack | A collection of Machine Learning models that monitor assets and users known to be legitimate to identify anomalies from their normal behavior | Bro; User Enrichment; Asset Enrichment; User Auth Logs; Asset Inventory Logs |
Relevancy/Correlation Engine Pack | A collection of Machine Learning models that identify alerts that are related within the massive volumes of alerts being processed by the cyber solutions | Snort; Suricata; Third Party Alerts |
Protocol Anomaly Pack | A collection of Machine Learning models that identify whether there is anything unusual about network traffic monitored via deep packet inspection (PCAP) | PCAP; YAF/Netflow; Bro |

The system is configurable so that one can enable only the data sources of interest.
In future Metron tech previews, we will be adding support for these types of security data sources:
- FireEye
- Palo Alto Network
- Active Directory
- BlueCoat
- SourceFire
- Bit9 CarbonBlack
- Lancope
- Cisco ISE

Real-time Data Processing Theme Key Features

Enrichment Services

The below diagram illustrates the Enrichment framework that was built in Metron TP1. The key components of the framework are:
- Enrichment Loader Framework - A framework that bulk loads or polls data from an enrichment source. The framework supports plugging in any enrichment source.
- Enrichment Store - The store where all enrichment data is kept. HBase will be the primary store. The store will also provide services to de-dup and age data.
- Enrichment Bolt - A Storm Bolt that enriches Metron telemetry events.
- Enrichment Cache - Cache used by the bolt so that look ups to the enrichment store are cached.

The specific enrichments supported in Metron TP1 are below.
Enrichment | Description | Enrichment Source, Store, Loader Type, Refresh Rate | Metron Message Field Name that will be Enriched |
---|---|---|---|
GeoIP | Tags GeoIP information (lat-lon coordinates + City/State/Country) on to any external IP address. This can be applied both to alerts and to metadata telemetries to be able to map them to a geo location. | Enrich Source: Maxmind Geolite; Metron Store: MySQL (will use HBase in next TP); Loader Type: Bulk load from HDFS; Refresh Rate: Every 3 months | Src_ip, dest_ip |
Host | Enriches IP with Host details | Enrich Source: Enterprise Inventory/Asset Store; Metron Store: HDFS; Loader Type: Bulk load from HDFS | dest_ip |

More details can be found here: Metron Enrichment Services

Threat Intel Services

The Threat Intel framework is very similar to the Enrichment framework. See the architecture diagram below. The specific threat intel services supported in TP1 are below.
Threat Feed | Feed Description | Feed Format | Refresh Rate |
---|---|---|---|
Soltra | Threat Intel Aggregator | Stix/Taxii | Poll every 5 minutes |
Hail a Taxii | Repository of Open Source Cyber Threat Intelligence feeds in STIX format | Stix/Taxii | Poll every 5 minutes |

More details can be found here: Metron Threat Intel Services
04-06-2016
12:48 AM
9 Kudos
Metron TP1 Features

The following are key capabilities available in Metron TP1, broken up across its four key functional themes.

How do I get Started?

You can spin up the Metron TP1 in two ways:
Ansible based Vagrant Single Node VM Install
This is the best place to play with Metron first. Detailed instructions on how to do the install can be found in the following HCC Article: Apache Metron TP 1 Install Instructions- Single Node Vagrant Deployment
Fully Automated 10 Node Ansible Based Install on AWS using Ambari Blueprints and AWS APIs
If you want a more realistic setup of the Metron app, use this approach. Keep in mind that this install will spin up 10 m4.xlarge EC2 instances by default. Detailed instructions on how to do the install can be found in the following HCC Article: Apache Metron - First Steps in the Cloud

Where do I get Help?

Hortonworks has created a new track called CyberSecurity in the Hortonworks Community Connection (HCC). The link to this new track in HCC is the following: HCC CyberSecurity Track. Apache Metron committers are subscribed to this track and are constantly monitoring it for any questions the community has on TP1. When asking a question about Metron TP1, please select the “CyberSecurity” track and add the following tags: “Metron” and “tech-preview”.

Platform Theme Features of Metron TP1

The below is a summary of the key platform features added in TP1:

Feature | Related Apache Metron JIRAs |
---|---|
Support for HDP 2.3 | |
Refactor Metron Topologies for Performance, Easier Manageability & Supportability | METRON-56 METRON-33 |
Fully Automated Install of Metron on AWS on multi-node HDP cluster via Ansible scripts, Ambari blueprints and APIs | METRON-59 METRON-77 METRON-76 METRON-69 METRON-63 METRON-61 METRON-43 METRON-2 |
Single Node Vagrant Support for Metron for Development | METRON-21 |
Unit and Integration Testing Frameworks, Code Test Coverage | METRON-82 METRON-58 METRON-37 METRON-28 |

Telemetry Data Source Theme Features of Metron TP1

The focus of Metron TP1 is network telemetry data sources, as described below. They represent the most valuable granular data one can collect and perform next generation analytics on. The key data collection features for Metron TP1 are the following:

Feature | Related Apache Metron JIRAs |
---|---|
PCAP Ingest Data Services - Performant C++ probe that captures network packets and streams them into Kafka, from where they get bulk loaded into Metron | METRON-79 METRON-79 METRON-73 METRON-55 METRON-39 |
YAF/Netflow Ingest Data Services - Ingests netflow data into Metron | METRON-67 METRON-60 |
Bro Ingest Data Services - Custom Bro plugin that pushes out DPI (Deep Packet Inspection) metadata into Metron | METRON-25 METRON-73 METRON-64 |
Snort Ingest Data Services - Stream Snort generated alerts via Flume into Metron | METRON-57 |
Grok Framework - Ability to add new data sources to Metron without writing new parsing topologies. For each new data source, a grok expression file can be provided to normalize it into a Metron event. | METRON-66 |

Real-time Data Processing Theme Features of Metron TP1

For this theme, the key features in Metron TP1 are the following:

Feature | Related Apache Metron JIRAs |
---|---|
Enrichment Services - OOO support for GeoIP and Host enrichments, extensible framework to plug in new enrichments, & management utilities for enrichment data | METRON-32 METRON-43 |
Threat Intel Services - Integration with Soltra (Threat Intel Aggregator) and Hail a Taxii, management utilities for threat intel (streaming and bulk load, aging out of data) | METRON-35 METRON-50 |
Alerting Services - Alerts can be fired via a Snort event or intel threat feed hit | |
Indexing Services - Support for indexing via ElasticSearch | METRON-36 METRON-56 METRON-66 |
Storage Services - Persisting all enrichment telemetry data in HDFS and/or HBase | METRON-62 METRON-22 |

UI Theme Features of Metron TP1

There was less focus on the UI theme, but Metron TP1 does provide the following new UI features:

Feature | Related Apache Metron JIRAs |
---|---|
Metron Investigator IO Dashboard for the SOC Analyst and Investigator personas, built on top of Kibana | METRON-72 METRON-77 METRON-81 |
Histogram panels for each of the data sources (YAF, Bro, Snort, PCAP) | METRON-60 METRON-52 |
PCAP panel that allows you to search for and download PCAP files | METRON-72 METRON-77 METRON-81 |
Ability to customize the Metron UI with different data sources and different panel types | METRON-72 METRON-77 METRON-81 |
04-05-2016
11:04 PM
3 Kudos
Metron User Personas

There are six user personas for Metron:

SOC Analyst
Profile: Beginner, junior-level analyst
Tools Used: SIEM tools/dashboards, security endpoint UIs, email/ticketing/workflow systems
Responsibilities: Monitor security SIEM tools; search/investigate breaches and malware; review alerts and determine whether to escalate them as tickets or filter them out; follow security playbooks; investigate script kiddie attacks.

SOC Investigator
Profile: More advanced SME in cybersecurity; experienced security analyst who understands more advanced features of security tools; thorough understanding of networking and platform architecture (routers, switches, firewalls, security); ability to dig through and understand various logs (network, firewall, proxy, app, etc.)
Tools Used: SIEM/security tools, scripting languages, SQL, command line
Responsibilities: Investigate more complicated/escalated alerts; investigate breaches; take the necessary steps to remove/quarantine the malware, breach or infected system; hunt for malware attacks; investigate more complicated attacks like APTs (Advanced Persistent Threats).

SOC Manager
Profile: Experience managing teams; security practitioner who has moved into management.
Tools Used: Workflow systems (e.g. Remedy, JIRA), ticket/alerting systems
Responsibilities: Assigns Metron cases to analysts. Verifies “completed” Metron cases.

Forensic Investigator
Profile: E-discovery experience with a security background.
Tools Used: SIEM and e-discovery tools
Responsibilities: Collect evidence on a breach/attack incident; prepare the lawyer’s response to the breach.

Security Platform Operations Engineer
Profile: Computer science, developer, and/or DevOps background. Experience with Big Data technologies and supporting distributed applications/systems.
Tools Used: Security tools (SIEM, endpoint solutions, UEBA solutions); provisioning, management and monitoring tooling; various programming languages; Big Data and distributed computing platforms.
Responsibilities: Helps vet different security tools before bringing them into the enterprise. Establishes best practices and reference architecture with respect to provisioning, management and use of the security tools. Configures the system with respect to deployment/monitoring/etc. Maintains the probes to collect data, enrichment services, loading of enrichment data, managing threat feeds, etc. Provides care and feeding of one or more point security solutions. Does capacity planning, system maintenance and upgrades.

Security Data Scientist
Profile: Computer science / math background, security domain experience; digs through as much data as is available, looks for patterns and builds models.
Tools Used: Python (scikit learn, Python Notebook), R, Rstudio, SAS, Jupyter, Spark (SparkML)
Responsibilities: Works with security data performing data munging, visualization, plotting, exploration, feature engineering and generation; trains, evaluates and scores models.

Why Metron? SOC Analyst & Investigator Perspective

The above diagram illustrates the key steps in a typical analyst/investigator workflow. For certain steps in this workflow, Apache Metron provides key capabilities not found in traditional security tools:

Looking through Alerts
- Centralized Alerts Console - Having a centralized dashboard for alerts and the telemetry events associated with the alert across all security data sources in your enterprise is a powerful feature within Metron that prevents the analyst from jumping from one console to another.
- Meta Alerts - The long term vision of Metron is to provide a suite of analytical models and packs including an Alerts Relevancy Engine and Meta-Alerts. Meta Alerts are generated by groupings or analytics models and provide a mechanism to shield the end user from being inundated with 1000s of granular alerts.
- Alerts labeled with threat intel data - Viewing alerts labeled with threat intel from third party feeds allows the analyst to decipher more quickly which alerts are legitimate vs false positives.

Collecting Contextual data
- Fully enriched messages - Analysts spend a lot of time manually enriching the raw alerts or events. With Metron, analysts work with the fully enriched message.
- Single Pane of Glass UI - A single pane of glass that not only has all alerts across different security data sources but also provides the enriched data in the same view.
- Centralized real-time search - All alerts and telemetry events are indexed in real-time. Hence, the analyst has immediate access to search for all events.
- All logs in one place - All events with the enrichments and labels are stored in a single repository.

Investigate
- Granular access to PCAP - After identifying a legitimate threat, more advanced SOC investigators want the ability to download the raw packet data that caused the alert. Metron provides this capability.
- Replay old PCAP against new signatures - Metron can be configured to store raw pcap data in Hadoop for a configurable period of time. This corpus of pcap data can then be replayed to test new analytical models and new signatures.
- Tag behavior for modeling by data scientists
- Raw messages used as evidentiary store
- Asset inventory and User Identity as enrichment sources

Note that the above 3 steps in the analyst workflow make up approximately 70% of the time. Metron will drastically decrease the time spent on the analyst workflow because everything the SOC analyst needs to know is in a single place.

Why Metron? Data Scientist Perspective

The above diagram illustrates the key steps in a typical data science workflow. For certain steps in this workflow, Apache Metron provides key capabilities not found in traditional security tools:

Finding the data
- All my data is in the same place - One of the biggest challenges faced by security data scientists is finding the data required to train, evaluate and score models. Metron provides a single repository where the enterprise’s security telemetry data are stored.
- Data exposed through a variety of APIs - The Metron security vault/repository provides different engines to access and work with the data, including SQL, scripting languages, in-memory, Java, Scala, key-value columnar, REST APIs, user portals, etc.
- Standard Access Control Policies - All data stored in the Metron security vault is secured via Apache Ranger through access policies at a file system level (HDFS) and at processing engine level (Spark, Hive, HBase, Solr, etc.)

Cleaning the data

- Metron normalizes telemetry events - As discussed in the first blog where we traced an event being processed by the platform, Metron normalizes all telemetry data into at least a standard 7-tuple JSON structure, allowing data scientists to find and correlate data together more easily.
- Partial schema validation on ingest - The Metron framework will validate data on ingest and will filter out bad data automatically, which is something that data scientists traditionally spend a lot of time doing.

Munging Data

- Automatic data enrichment - Typically data scientists have to manually enrich data to create and test features, or have to work with the data/platform team to do so. With Metron, events are enriched in real-time as they come in and the enriched event is stored in the Metron security vault.
- Automatic application of class labels - Different types of metadata (threat intel information, etc.) are tagged on to the event, which allows the data scientists to create feature matrices for models more easily.
- Massively parallel computation framework - All the cleaning and munging of the data uses distributed technologies that allow the processing of these high velocity / large volumes of data to be performant and scalable.
Visualizing Data

- Real-time search + UI - Metron indexes all events and alerts and provides a UI dashboard to perform real-time search.
- Apache Zeppelin Dashboards - Out of the box Zeppelin dashboards will be available that can be used by SOC analysts. With Zeppelin you can share the dashboards, substitute variables, and quickly change graph types. An example of a dashboard would be to show all HTTP calls that resulted in 404 errors, visualized as a bar graph ordered by the number of failures.
- Integration with Jupyter - Jupyter notebooks will be provided to data scientists for common tasks such as exploration, visualization, plotting, evaluating features, etc.

Note that the above 4 steps in the data science workflow make up approximately 80% of the time. Metron will drastically reduce the time from hypothesis to model for the data scientist.

Apache Metron Core Functional Themes

Now that we have an understanding of Metron’s user personas, we will describe the four core functional themes that Metron will focus on. As the community around Metron continues to grow, new features and enhancements will be prioritized across these four themes. The 4 core functional themes are the following:

Apache Metron Release 0.1 and its Target Personas and Themes

Over the last 4 months, the community, led by Hortonworks, has been hard at work on Apache Metron’s first release (Metron 0.1). Now that we have described the user personas and core themes for Metron, the following depicts where the engineering focus has been for Metron 0.1. As the diagram above illustrates, the key focus areas for Metron 0.1 are the following:
- The Platform theme was the primary focus. Before we can focus on the UI and supporting more telemetry data sources, we need to ensure that the platform is rock solid. This means ensuring an easy way to provision this very complex app, and refactor/re-architecture work to make the code simpler and easier to maintain, to allow adding new data sources in a declarative manner, to improve performance and extensibility, and to improve the quality of the code.
- The persona of focus is the Security Platform Engineer.
- Metron 0.1 offers dashboard views for the SOC Analyst and SOC Investigator.
04-05-2016
10:11 PM
14 Kudos
Apache Metron Explained

Apache Metron is a cyber security application framework that provides organizations the ability to ingest, process and store diverse security data feeds at scale in order to detect cyber anomalies and enable organizations to rapidly respond to them. As the diagram above indicates, the Metron framework provides 4 key capabilities:

- Security Data Lake / Vault - The platform provides a cost effective way to store enriched telemetry data for long periods of time. This data lake provides the corpus of data required to do feature engineering that powers discovery analytics and provides a mechanism to search and query for operational analytics.
- Pluggable Framework - The platform provides not only a rich set of parsers for common security data sources (pcap, netflow, bro, snort, fireeye, sourcefire) but also a pluggable framework to add new custom parsers for new data sources, add new enrichment services to provide more contextual info to the raw streaming data, add pluggable extensions for threat intel feeds, and customize the security dashboards.
- Security Application - Metron provides standard SIEM-like capabilities (alerting, threat intel framework, agents to ingest data sources) but also has packet replay utilities, an evidence store and hunting services commonly used by SOC analysts.
- Threat Intelligence Platform - Metron will provide next generation defense techniques that consist of a class of anomaly detection and machine learning algorithms that can be applied in real-time as events are streaming in.

Tracing the Flow of a Security Telemetry Event through Metron

The below diagram depicts the logical components of the Metron Platform. The below subsection traces an event as it flows through these different logical components.

Step 1a - Telemetry Ingest

For most security telemetry data sources that use transports and protocols like file, syslog, REST, HTTP, custom API, etc., Metron will use Apache Nifi to ingest data at the source. An example would be capturing data from a FireEye appliance with Nifi’s SysLog Processor. The raw FireEye event captured would look something like the following:

Step 1b - Fast Telemetry Ingest

For high volume network telemetry data like packet capture (PCAP), Netflow/YAF, and Bro/DPI, custom Metron probes will be available to ingest data directly from a network tap. An example would be capturing Bro data using the custom C++ Metron probe. The raw Bro event captured by the Bro probe would look something like the following:
Step 2 - Telemetry Ingest Buffer

All raw events from each telemetry security data source captured by Apache Nifi or a custom Metron probe will be pushed into its own Kafka topic. The arrival of a telemetry event into the ingest buffer marks the start of Metron processing.

Step 3 - Process (Parse, Normalize, Validate and Tag)

Each raw event will be parsed and normalized into a standardized flat JSON structure. Every event will be standardized into at least a 7-tuple JSON structure. This is done so the topology correlation engine further downstream can correlate messages from different topologies by these fields. The standard field names are as follows:

- ip_src_addr: layer 3 source IP
- ip_dst_addr: layer 3 dest IP
- ip_src_port: layer 4 source port
- ip_dst_port: layer 4 dest port
- protocol: layer 4 protocol
- timestamp (epoch)
- original_string: A human friendly string representation of the message

At this step, one can also validate the raw event and tag it with additional metadata which will be used by downstream processing. After Step 3, the raw Bro event will look like the following:
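To make the normalization step concrete, here is a minimal sketch of mapping a raw JSON event onto the seven standard fields listed above. It is not Metron's parser: the input field names (`id.orig_h`, `id.resp_h`, `id.orig_p`, `id.resp_p`, `proto`, `ts`) are assumed to follow Bro's connection log naming, and `BRO_FIELD_MAP` and `normalize` are hypothetical names chosen for illustration.

```python
import json

# The seven standard field names described in Step 3.
STANDARD_FIELDS = (
    "ip_src_addr", "ip_dst_addr", "ip_src_port", "ip_dst_port",
    "protocol", "timestamp", "original_string",
)

# Assumed mapping from Bro-style field names to the standard names.
BRO_FIELD_MAP = {
    "id.orig_h": "ip_src_addr",
    "id.resp_h": "ip_dst_addr",
    "id.orig_p": "ip_src_port",
    "id.resp_p": "ip_dst_port",
    "proto": "protocol",
    "ts": "timestamp",
}


def normalize(raw: str) -> dict:
    """Parse a raw JSON event and rename fields to the standard names."""
    event = json.loads(raw)
    # Rename known fields; pass any other fields through unchanged.
    normalized = {BRO_FIELD_MAP.get(key, key): value for key, value in event.items()}
    # Keep the untouched raw message alongside the parsed fields.
    normalized["original_string"] = raw
    # Simple validation: reject events that lack any standard field.
    missing = [name for name in STANDARD_FIELDS if name not in normalized]
    if missing:
        raise ValueError(f"event is missing standard fields: {missing}")
    return normalized
```

Rejecting events that lack a standard field is one simple way to express the "validate" part of this step; a real deployment might instead route such events to an error topic.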
Step 4 - Enrich Once the raw security telemetry event has been parsed and normalized, the next step is to enrich different data elements of the normalized event. Examples of enrichment are GEO where an external IP address is enriched with GeoIP information (lat/long coordinates + City/State/Country) or HOST enrichment where an IP gets enriched with Host details (e.g: IP corresponds to Host X which is part of a web server farm for an e-commerce application). After Step 4, the enriched Bro event will look something like the following:
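As a rough illustration of this step (not Metron's actual enrichment code), the sketch below looks up GeoIP details for the IP fields of a normalized event and caches the look ups in memory. The in-memory `GEO_STORE`, its placeholder values, the `enrichments.geo.*` key naming, and the function names are all assumptions made for the example.

```python
from functools import lru_cache

# Hypothetical stand-in for the enrichment store; the address comes from a
# documentation-only range and the location values are placeholders.
GEO_STORE = {
    "203.0.113.7": {"city": "Exampleville", "country": "US"},
}


@lru_cache(maxsize=10_000)
def lookup_geo(ip: str):
    """Return GeoIP details for ip, or None; repeated look ups hit the cache."""
    return GEO_STORE.get(ip)


def enrich(event: dict) -> dict:
    """Return a copy of the event with flat GeoIP fields added where known."""
    enriched = dict(event)
    for field in ("ip_src_addr", "ip_dst_addr"):
        ip = event.get(field)
        geo = lookup_geo(ip) if ip else None
        if geo:
            for key, value in geo.items():
                # Keep the structure flat, e.g. enrichments.geo.ip_dst_addr.city
                enriched[f"enrichments.geo.{field}.{key}"] = value
    return enriched
```

The cache matters because the same addresses tend to recur across many events, so most look ups never need to reach the backing store.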
Step 5 - Label After enrichment, the telemetry event goes through the labeling process. Actions done within this phase include threat intel cross reference checks where elements within the telemetry event can be used to do look ups against threat intel feed data sources like Soltra produced Stix/Taxii feeds or other threat intel aggregator services. These threat intel services will then “label” the telemetry event with threat intel metadata when a hit occurs. Other types of services include executing/scoring analytical models using model as a service pattern with the telemetry events that are flowing in (more details on Analytical Models/Packs and Model as Service patterns will be coming in upcoming blogs of this series). After step 5 assuming the bro telemetry event had a threat intel hit, the message would look something like the following:
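The threat intel cross reference can be pictured as a look up of selected event fields against a set of known-bad indicators. The sketch below assumes a simple in-memory dictionary in place of a real feed store; the `THREAT_INTEL` contents, the `threatintel.*` key naming, and the `is_threat_hit` flag are hypothetical and only illustrate the idea.

```python
# Hypothetical threat intel store: indicator value -> feed that listed it.
THREAT_INTEL = {
    "198.51.100.23": "example_stix_feed",
}


def label(event: dict) -> dict:
    """Return a copy of the event labeled with any threat intel hits."""
    labeled = dict(event)
    for field in ("ip_src_addr", "ip_dst_addr"):
        feed = THREAT_INTEL.get(event.get(field))
        if feed is not None:
            # Record which field matched and which feed it matched against.
            labeled[f"threatintel.{field}"] = feed
    labeled["is_threat_hit"] = any(
        key.startswith("threatintel.") for key in labeled
    )
    return labeled
```

Events without a hit pass through unchanged apart from the `is_threat_hit` flag, so downstream steps can treat labeled and unlabeled events uniformly.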
Step 6 - Alert and Persist During this phase, certain telemetry events can initiate alerts. These types of telemetry events are then indexed in an alert index store. A telemetry event can spawn an alert triggered by a number of factors including: The event type - The raw telemetry event itself is an alert. For example, any event generated by Snort is an alert so it will automatically be indexed as an alert. Threat intel hit - If raw telemetry event has a threat intel hit, it will be marked as an alert. Also during this step, all enriched and labeled telemetry events are indexed and persisted in Hadoop for long term storage. The storage of these events in Hadoop produces a security data vault within the enterprise that enables next generation analytics to be performed. After step 6, the telemetry event is stored in HDFS and indexed in Elastic/Solr based on configuration. The persisted event in HDFS looks something like the following:
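The two alert triggers described in this step (the event type itself, or a threat intel hit) can be sketched as a small routing function. This is an illustration under assumptions: the `source_type` field, the `threatintel.` key prefix, and the plain lists standing in for the alert index and the long-term store are all hypothetical.

```python
# Assumed: source types whose events are always alerts, per the Snort example.
ALERT_SOURCE_TYPES = {"snort"}


def is_alert(event: dict) -> bool:
    """An event is an alert if its type is one, or it has a threat intel hit."""
    if event.get("source_type") in ALERT_SOURCE_TYPES:
        return True
    return any(key.startswith("threatintel.") for key in event)


def persist(event: dict, event_store: list, alert_index: list) -> None:
    """Store every event; additionally index the ones that are alerts."""
    event_store.append(event)
    if is_alert(event):
        alert_index.append(event)
```

Every event reaches the long-term store, and only the alerting subset is also written to the alert index, which mirrors the split described in the text.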
Step 7 - UI Portal and Data & Integration Services
Steps 1 through 6 provide the mechanism to ingest, parse, normalize, enrich, label, index and store all security telemetry data from a diverse set of data sources in your enterprise in a single security data vault. This allows the Metron platform to provide a set of services that help different types of security users perform their jobs more effectively. Some of these services include:
Real-time Search and Interactive Dashboards / Portals - A single pane of glass for security operations analysts to view alerts and correlate them with the granular telemetry events that caused them.
Data Modeling / Feature Engineering Services - Because the Metron framework normalizes and enriches the data and stores it in the security data lake (HDFS, HBase) in standardized locations, the platform can provide various analytical models. These models will have specifications for the feature matrix they require, so feature engineering, which is the most complex aspect of analytics, becomes considerably simpler. Data modeling services required for the feature matrix will be provided by tools such as Jupyter, IPython and Zeppelin.
Integration and Extensibility Layers - One of the most powerful features of the Metron platform is the ability to customize it for your own needs and requirements, which includes:
Ingesting new data sources
Adding new parsers
Adding new enrichment services
Adding new Threat Intel feeds
Building, deploying and executing new analytical models
Integration with enterprise workflow engines
Customizing the Security Dashboards and portals

Recap
You should now have a better understanding of the history of Apache Metron and the high level capabilities of the platform. The next blog in this series will walk you through the different types of users we envision for Apache Metron, the core functional themes, and what the Metron community has been focusing on for the last few months.
04-05-2016
10:11 PM
6 Kudos
Hello from the Metron PM and Eng Team
Today, the Hortonworks Metron product management and engineering team is kicking off a multi-part blog series on Apache Metron, the next-gen security analytics application that Hortonworks is building with the Apache community. Over the course of the next few weeks, we will release a series of blogs that cover the following topics:
Part 1 - Apache Metron Explained - An overview of Apache Metron that traces a security telemetry event as it flows through the platform.
Part 2 - Apache Metron User Personas and Why Metron? - Who will be the different users of Apache Metron? What are the core functional themes? What has been the focus for the first release? We will address all 3 of these questions in this blog.
Part 3 - Apache Metron Tech Preview 1 - Come and Get It - We will walk you through what the Metron community has been working on for the last 3 months. By the end of this blog, you will have a good understanding of what is in Metron Tech Preview 1 and how to install it, deploy it, and build on top of it.
Part 4 - Apache Metron UI and Finding a Needle in the Haystack Use Case - We will walk through the Metron UI components and how a SOC analyst would use them for common Metron use cases.
Part 5 - Deep Dive on Apache Metron Tech Preview 1 - We will double-click on the major functional areas of Metron TP1.
Part 6 - Apache Metron Vision - With a solid understanding of what TP1 consists of, this blog will provide a glimpse into the roadmap and vision for Apache Metron and what the project will look like by the end of 2016, focusing on the analytics work planned.

Roots of Apache Metron
To understand Apache Metron, we first have to start with the origins of the project, which emerged from the Cisco project called OpenSOC. The diagram below highlights some of the key events in the history of Apache Metron, starting with Cisco OpenSOC.

2005 to 2008
The Problem - Cyber crime spiked significantly and a severe shortage of security talent arose. The first companies alerted to this issue were high-profile banks and large organizations holding proprietary information of interest to state-sponsored agents. All of the best investigators and analysts were gobbled up by multinational banking and financial services firms, large hospitals, telcos, and defense contractors.
The Rise of a New Industry, the Managed SOC - Those who could not acquire security talent were still in need of a team. Cisco was sitting on a gold mine of security talent that they had accumulated over the years. Utilizing this talent, they produced a managed service offering around managed security operations centers.

Post 2008
The Age of Big Data Changed Everything - The Age of Big Data arrived, bringing more streaming data, virtualized infrastructure, data centers emitting machine exhaust from VMs, and Bring Your Own Device programs. The amount of data exploded, and so did the cost of required tools such as traditional SIEMs. These tools became cost-prohibitive as they changed to data-driven licensing structures. Cisco’s ability to operate the managed SOC with these tools was in jeopardy, and security appliance vendors took control of the market.

2013
OpenSOC is Born and Hadoop Matures - Cisco decided to build a toolset of their own. They didn’t just want to replace these tools; they wanted to improve and modernize them, taking advantage of open source. Cisco released its managed SOC service to the community as Hadoop matured and Storm became available. It was a perfect combination of a use case need and technology. OpenSOC was the first project to take advantage of Storm, Hadoop, and Kafka, and to move legacy approaches toward a more forward-looking paradigm.

September 2013 thru April 2015
The Origins of Apache Metron - For about 24 months, a Cisco team, led by their chief data scientist James Sirota, with the help of a Hortonworks team, led by platform architect Sheetal Dolas, worked to create a next-generation managed SOC service built on top of open source big data technologies. The Cisco OpenSOC managed SOC offering went into production for a number of customers in April of 2015. A short time after, Cisco made a couple of acquisitions that brought in third-party technologies, transforming OpenSOC into a closed-source, hardware-based version.

October 2015
OpenSOC Chief Data Scientist Joins Hortonworks - James Sirota, the chief data scientist and lead of the Cisco OpenSOC initiative, leaves Cisco to join Hortonworks. Over the course of the next 4 months, James starts to build a rock star engineering team at Hortonworks with the focus of building an open source cybersecurity application.

December 2015
Metron Accepted into Apache Incubation - Hortonworks, with the help and support of key Apache community partners, including ManTech, B23 and others, submits Metron (renamed from OpenSOC) as an Apache incubator project. In December of 2015, the project is accepted into Apache incubation. Hortonworks and the community innovate at impressive speeds to add new features to Apache Metron and harden the platform. The Metron team builds an extensible, open architecture to account for the variety of tools used in customer environments (thousands of firewalls, thousands of domains and a multitude of intrusion detection systems). Metron’s open approach makes it much easier to tailor to the community’s use cases.

April 2016
First Official Release of Apache Metron 0.1 - After 4 months of hard work and rapid innovation by the Metron community, Apache Metron’s first release, Metron 0.1, is cut.
Given Hortonworks’ proven commitment to the Apache Software Foundation process and our track record of creating and leading robust communities, we feel uniquely qualified to bring this important technology and its capabilities to the broader open source community. Without Hortonworks, the Apache Metron project would not exist today!