Member since 08-31-2015
Posts: 81
Kudos Received: 115
Solutions: 17
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1195 | 03-22-2017 03:51 PM
 | 624 | 05-04-2016 09:34 AM
 | 785 | 03-24-2016 03:07 PM
 | 749 | 03-24-2016 02:54 PM
 | 745 | 03-24-2016 02:47 PM
03-22-2017
04:19 PM
@Satish Duggana can you please help answer this question?
03-22-2017
04:17 PM
@mqureshi, Yes, this is a known issue. When you create a SAM app, you cannot have spaces. This is covered in the docs but we will add validation to the UI so the user is not allowed to do this. @Sriharsha Chintalapani can you please create a bug for this and send me the Jira link...
03-22-2017
03:51 PM
One other tip: if you want to see which jars/classes are used for each of the processors in SAM, select Settings --> Component Definition, then select Edit under Actions for the processor you are interested in. You will then see the details of the processor...
03-22-2017
03:28 PM
1 Kudo
Hi @Eric Brosch, SAM (formerly known as StreamLine) uses the Storm PMML bolt integration (https://github.com/apache/storm/tree/d5acec9e3b9473a0e8cf39c7e12393626a3ca426/external/storm-pmml), which uses the JPMML evaluator (https://github.com/jpmml/jpmml). @Sriharsha Chintalapani
03-20-2017
07:47 PM
Let me see if I understand:
You are running the simulator and data is being generated.
You have NiFi up and running, and it is ingesting the data from the simulator and pushing that data into a Kafka topic.
You have verified in Ambari/Grafana and with kafka-consumer that data is actually being sent to the Kafka topic successfully? Please verify by sending a screenshot.
You have a SAM app that is deployed and is supposed to read from the Kafka topic, but you are not seeing any events being read from the Kafka spout? Please send a screenshot of the running SAM app and the Ambari Storm view of the topology that is deployed.
Questions:
How are you verifying that SAM is not picking up any tuples?
Did you verify that the topology was deployed successfully without errors in Ambari using the new Storm Ambari View?
01-16-2017
03:23 PM
@Satish Duggana --> Thoughts?
05-05-2016
11:48 PM
2 Kudos
In the previous article of the series, Enriching Telemetry Events, we walked through how to enrich a domain element of a given telemetry event with WhoIs data such as home country and the company associated with the domain. In this article, we will enrich with a special type of data called threat intel feeds. When a given telemetry event matches data in a threat intel feed, an alert is generated. Again, the customer's requirements are the following:
1. The proxy events from Squid logs need to be ingested in real-time.
2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
3. In real-time, the Squid proxy event needs to be enriched so that the domain names are enriched with the IP information.
4. In real-time, the IP within the proxy event must be checked against threat intel feeds.
5. If there is a threat intel hit, an alert needs to be raised.
6. The end user must be able to see the new telemetry events and the alerts from the new data source.
7. All of these requirements will need to be implemented easily without writing any new Java code.
In this article, we will walk you through how to do 4 and 5.
Threat Intel Framework Explained
Metron currently provides an extensible framework to plug in threat intel sources. Each threat intel source has two components: an enrichment data source and an enrichment bolt. The threat intelligence feeds are bulk loaded and streamed into a threat intelligence store similarly to how the enrichment feeds are loaded. The data is loaded in a key-value format: the key is the indicator and the value is the JSON-formatted description of what the indicator is. It is recommended to use a threat feed aggregator such as Soltra to dedup and normalize the feeds via Stix/Taxii. Metron provides an adapter that is able to read Soltra-produced Stix/Taxii feeds and stream them into HBase, which is the data store of choice to back high-speed threat intel lookups in Metron. Metron additionally provides a flat file and Stix bulk loader that can normalize, dedup, and bulk load or stream threat intel data into HBase even without the use of a threat feed aggregator. The below diagram illustrates the architecture:
Step 1: Threat Intel Feed Source
Metron is designed to work with Stix/Taxii threat feeds, but can also be bulk loaded with threat data from a CSV file. In this example we will explore the CSV example.
The same loader framework that is used for enrichment is used for threat intelligence. As with enrichments, we need to set up a data.csv file, the extractor config JSON, and the enrichment config JSON. For this example we will be using a Zeus malware tracker list located here: https://zeustracker.abuse.ch/blocklist.php?download=domainblocklist.
Copy the data from the above link into a file called domainblocklist.txt on your VM. Run the following command to parse the above file into a CSV file called domainblocklist.csv:
cat domainblocklist.txt | grep -v "^#" | grep -v "^$" | grep -v "^https" | awk '{print $1",abuse.ch"}' > domainblocklist.csv
Now that we have the "Threat Intel Feed Source", we need to configure an extractor config file that describes the source. Create a file called extractor_config_temp.json and put the following contents in it.
{
"config" : {
"columns" : {
"domain" : 0
,"source" : 1
}
,"indicator_column" : "domain"
,"type" : "zeusList"
,"separator" : ","
}
,"extractor" : "CSV"
}
To remove the non-ASCII characters, run the following:
iconv -c -f utf-8 -t ascii extractor_config_temp.json -o extractor_config.json
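The Step 1 preparation can also be sketched in Python. This is only an illustration of what the grep/awk pipeline and iconv -c do; the function names and the sample domains are made up for the example and are not part of Metron. Unlike grep -v "^$", this sketch also skips lines that contain only whitespace.

```python
def blocklist_to_csv_rows(lines, source="abuse.ch"):
    """Mirror the grep/awk pipeline: drop comments, blank lines and https lines."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("https"):
            continue
        domain = line.split()[0]  # same idea as awk's $1
        rows.append(f"{domain},{source}")
    return rows


def strip_non_ascii(text):
    """Rough equivalent of `iconv -c -f utf-8 -t ascii`: drop non-ASCII characters."""
    return text.encode("ascii", errors="ignore").decode("ascii")


# Hypothetical sample input; the real file is the downloaded domainblocklist.txt.
sample = [
    "# sample blocklist header",
    "",
    "bad-domain.example",
    "another-bad.example",
]
for row in blocklist_to_csv_rows(sample):
    print(row)
```

Each printed row has the two columns that the extractor config expects: the domain in column 0 and the source in column 1.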
Step 2: Configure Element to Threat Intel Feed Mapping
We now have to configure which element of a tuple to cross-reference with which threat intel feed. This configuration will be stored in ZooKeeper. The config looks like the following:
{
"zkQuorum" : "node1:2181"
,"sensorToFieldList" : {
"bro" : {
"type" : "THREAT_INTEL"
,"fieldToEnrichmentTypes" : {
"url" : [ "zeusList" ]
}
}
}
}
Cut and paste this file into a file called "enrichment_config_temp.json" on the virtual machine. Because copying and pasting from this blog will include some non-ASCII invisible characters, strip them out by running:
iconv -c -f utf-8 -t ascii enrichment_config_temp.json -o enrichment_config.json
Step 3: Run the Threat Intel Loader
Now that we have the threat intel source and threat intel config defined, we can run the loader to move the data from the threat intel source to the Metron threat intel store and store the enrichment config in ZooKeeper.
/usr/metron/0.1BETA/bin/flatfile_loader.sh -n enrichment_config.json -i domainblocklist.csv -t threatintel -c t -e extractor_config.json
After this, the threat intel data will be loaded in HBase and a ZooKeeper mapping will be established. The data will be populated into an HBase table called threatintel. To verify that the data was properly ingested into HBase, run the following commands:
hbase shell
scan 'threatintel'
You should see the table bulk loaded with data from the CSV file. Now check if the ZooKeeper enrichment tag was properly populated:
/usr/metron/0.1BETA/bin/zk_load_configs.sh -z localhost:2181
Generate some data by using the squid client to execute HTTP requests (do this about 20 times):
squidclient http://www.alamman.com
squidclient http://www.atmape.ru
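Conceptually, each event generated by these requests is checked against the key-value threat intel store described earlier: the key is the indicator and the value is a JSON description of it. The following is a minimal sketch of that lookup, assuming an in-memory dictionary in place of HBase. The function name, the sample domain, and the threat_intel field name are hypothetical; only the url field and the is_alert flag come from this article.

```python
import json

# Hypothetical in-memory stand-in for the HBase 'threatintel' table:
# key = indicator, value = JSON description of the indicator.
threat_intel_store = {
    "bad-domain.example": json.dumps({"source": "abuse.ch", "type": "zeusList"}),
}


def check_threat_intel(event, field="url", store=threat_intel_store):
    """Return a copy of the event, flagged with is_alert when the field hits the store."""
    enriched = dict(event)
    hit = store.get(event.get(field))
    if hit is not None:
        enriched["is_alert"] = "true"
        enriched["threat_intel"] = json.loads(hit)  # hypothetical field name
    return enriched


events = [{"url": "bad-domain.example"}, {"url": "cnn.com"}]
alerts = [e for e in map(check_threat_intel, events) if e.get("is_alert") == "true"]
print(alerts)
```

Only the first event ends up in alerts, which is the same filter that a query on is_alert=true applies.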
View the Threat Alerts in Metron UI
When the logs are ingested, we get messages that have a hit against threat intel:
Notice a couple of characteristics about this message. It has is_alert=true, which designates it as an alert message.
Now that we have alerts coming through, we need to visualize them in Kibana. First, we need to set up a pinned query to look for messages where is_alert=true:
And then once we point the alerts table to this pinned query, it looks like this:
05-04-2016
09:34 AM
1 Kudo
@Ryan Cicak good question. @drussell is correct. For Metron TP1, to avoid using another m4.xlarge EC2 instance and to give more resources to the other services, we chose to use only 1 ZooKeeper. But in production, we would have a minimum of 3 for the quorum. Given that we are going to be using ZooKeeper for other services, including managing Metron's own configs (enrichment config, threat intel config, etc.), and that future support for Solr will require ZooKeeper, possibly more than 3 will be required.
05-02-2016
05:22 PM
1 Kudo
In the previous article of the series, Adding a New Telemetry Data Source to Apache Metron, we walked through how to add a new data source, Squid, to Apache Metron. The inevitable next question is how to enrich the telemetry events in real-time as they flow through the platform. Enrichment is critical when identifying threats or, as we like to call it, "finding the needle in the haystack". The customer's requirements are the following:
1. The proxy events from Squid logs need to be ingested in real-time.
2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
3. In real-time, the Squid proxy event needs to be enriched so that the domain names are enriched with the IP information.
4. In real-time, the IP within the proxy event must be checked against threat intel feeds.
5. If there is a threat intel hit, an alert needs to be raised.
6. The end user must be able to see the new telemetry events and the alerts from the new data source.
7. All of these requirements will need to be implemented easily without writing any new Java code.
In this article, we will walk you through how to do 3.
Metron Enrichment Framework Explained
Step 1: Enrichment Source
Whois data is expensive so we will not be providing it. Instead we wrote a basic whois scraper (out of scope for this exercise) that produces whois data in CSV format as follows:
google.com, "Google Inc.", "US", "Dns Admin",874306800000
work.net, "", "US", "PERFECT PRIVACY, LLC",788706000000
capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000
cisco.com, "Cisco Technology Inc.", "US", "Info Sec",547988400000
cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000
news.com, "CBS Interactive Inc.", "US", "Domain Admin",833353200000
nba.com, "NBA Media Ventures, LLC", "US", "C/O Domain Administrator",786027600000
espn.com, "ESPN, Inc.", "US", "ESPN, Inc.",781268400000
pravda.com, "Internet Invest, Ltd. dba Imena.ua", "UA", "Whois privacy protection service",806583600000
hortonworks.com, "Hortonworks, Inc.", "US", "Domain Administrator",1303427404000
microsoft.com, "Microsoft Corporation", "US", "Domain Administrator",673156800000
yahoo.com, "Yahoo! Inc.", "US", "Domain Administrator",790416000000
rackspace.com, "Rackspace US, Inc.", "US", "Domain Admin",903092400000
Cut and paste this data into a file called "whois_ref.csv" on your virtual machine. This CSV file represents our enrichment source. The schema of this enrichment source is domain|owner|registeredCountry|registrar|registeredTimestamp. Make sure you don't have an empty line as the last line of the CSV file, as that will result in a null pointer exception.
We now need to configure an extractor config file that describes the enrichment source.
{
"config" : {
"columns" : {
"domain" : 0
,"owner" : 1
,"home_country" : 2
,"registrar": 3
,"domain_created_timestamp": 4
}
,"indicator_column" : "domain"
,"type" : "whois"
,"separator" : ","
}
,"extractor" : "CSV"
}
Please cut and paste this file into a file called "extractor_config_temp.json" on the virtual machine. Because copying and pasting from this blog will include some non-ASCII invisible characters, strip them out by running:
iconv -c -f utf-8 -t ascii extractor_config_temp.json -o extractor_config.json
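To make the role of the extractor config more concrete, here is a small sketch of how the column mapping and indicator column could be applied to the whois CSV. It uses Python's csv module rather than Metron's loader, so it only illustrates the idea; load_enrichment is a hypothetical helper, and skipinitialspace is needed here because the sample data puts a space before each opening quote.

```python
import csv
import io

# Column positions copied from the extractor config above.
COLUMNS = {"domain": 0, "owner": 1, "home_country": 2,
           "registrar": 3, "domain_created_timestamp": 4}
INDICATOR_COLUMN = "domain"


def load_enrichment(csv_text):
    """Map each indicator (domain) to a dict of its remaining columns."""
    table = {}
    # skipinitialspace lets the csv module honour quotes that follow ", ".
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        if not row:  # this sketch tolerates blank lines
            continue
        record = {name: row[idx] for name, idx in COLUMNS.items()}
        table[record.pop(INDICATOR_COLUMN)] = record
    return table


sample = (
    'capitalone.com, "Capital One Services, Inc.", "US", "Domain Manager",795081600000\n'
    'cnn.com, "Turner Broadcasting System, Inc.", "US", "Domain Name Manager",748695600000\n'
)
whois = load_enrichment(sample)
print(whois["cnn.com"]["owner"])
```

The result is keyed by the indicator column, which matches the key-value layout used for the enrichment store.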
Step 2: Configure Element to Enrichment Mapping
We now have to configure which element of a tuple should be enriched with which enrichment type. This configuration will be stored in ZooKeeper. The config looks like the following:
{
"zkQuorum" : "node1:2181"
,"sensorToFieldList" : {
"squid" : {
"type" : "ENRICHMENT"
,"fieldToEnrichmentTypes" : {
"url" : [ "whois" ]
}
}
}
}
Cut and paste this file into a file called "enrichment_config_temp.json" on the virtual machine. Because copying and pasting from this blog will include some non-ASCII invisible characters, strip them out by running:
iconv -c -f utf-8 -t ascii enrichment_config_temp.json -o enrichment_config.json
Step 3: Run the Enrichment Loader
Now that we have the enrichment source and enrichment config defined, we can run the loader to move the data from the enrichment source to the Metron enrichment store and store the enrichment config in ZooKeeper.
/usr/metron/0.1BETA/bin/flatfile_loader.sh -n enrichment_config.json -i whois_ref.csv -t enrichment -c t -e extractor_config.json
After this, your enrichment data will be loaded in HBase and a ZooKeeper mapping will be established. The data will be populated into an HBase table called enrichment. To verify that the data was properly ingested into HBase, run the following commands:
hbase shell
scan 'enrichment'
You should see the table bulk loaded with data from the CSV file. Now check if the ZooKeeper enrichment tag was properly populated:
/usr/metron/0.1BETA/bin/zk_load_configs.sh -z localhost:2181
Generate some data by using the squid client to execute HTTP requests (do this about 20 times):
squidclient http://www.cnn.com
View the Enrichment Telemetry Events in Metron UI
In order to demonstrate the enrichment capabilities of Metron, you need to drop all existing indexes for Squid where the data was ingested prior to enrichments being enabled. To do so, go back to the head plugin and delete the indexes like so:
Make sure you delete all Squid indexes. Re-ingest the data (see previous blog post) and the messages should be automatically enriched. In the Metron UI, refresh the dashboard and view the data in the Squid panel in the dashboard:
Notice the enrichments here (whois.owner, whois.domain_created_timestamp, whois.registrar, whois.home_country).
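The whois.* field names follow from the mapping configured in Step 2: the enrichment type becomes the prefix and each column from the extractor config becomes the suffix. A minimal sketch of that join, assuming a plain dictionary in place of the HBase enrichment table (the enrich function and the sample event are hypothetical and not Metron code):

```python
def enrich(event, enrichment_type, table, field="url"):
    """Copy matching enrichment values onto the event as '<type>.<column>' keys."""
    enriched = dict(event)
    for column, value in table.get(event.get(field), {}).items():
        enriched[f"{enrichment_type}.{column}"] = value
    return enriched


# Hypothetical lookup table standing in for the HBase 'enrichment' table.
whois_table = {
    "cnn.com": {
        "owner": "Turner Broadcasting System, Inc.",
        "home_country": "US",
        "registrar": "Domain Name Manager",
        "domain_created_timestamp": "748695600000",
    }
}

event = {"url": "cnn.com", "method": "GET"}
result = enrich(event, "whois", whois_table)
print(sorted(key for key in result if key.startswith("whois.")))
```

An event whose url has no entry in the table is returned unchanged.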
05-02-2016
05:22 PM
3 Kudos
When adding a net new data source to Metron, the first step is to decide how to push the events from the new telemetry data source into Metron. You can use a number of data collection tools and that decision is decoupled from Metron. However, we recommend evaluating Apache Nifi as it is an excellent tool to do just that (this article uses Nifi to push data into Metron). The second step is to configure Metron to parse the telemetry data source so that downstream processing can be done on it. In this article we will walk you through how to perform both of these steps.
In the previous article of this blog series, we described the following set of requirements for Customer Foo, who wanted to add the Squid telemetry data source into Metron.
1. The proxy events from Squid logs need to be ingested in real-time.
2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
3. In real-time, the squid proxy event must be enriched so that the domain names are enriched with the IP information.
4. In real-time, the IP within the proxy event must be checked for threat intel feeds.
5. If there is a threat intel hit, an alert needs to be raised.
6. The end user must be able to see the new telemetry events and the alerts from the new data source.
7. All of these requirements will need to be implemented easily without writing any new Java code.
In this article, we will walk you through how to perform steps 1, 2, and 6.
How to Parse the Squid Telemetry Data Source to Metron
The following steps guide you through how to add this new telemetry.
Step 1: Spin Up Single Node Vagrant VM
Download the code from https://github.com/apache/incubator-metron/archive/codelab-v1.0.tar.gz.
Untar the file (tar -zxvf incubator-metron-codelab-v1.0.tar.gz).
Navigate to the metron-platform directory (incubator-metron-codelab-v1.0/metron-platform) and build the package (mvn clean package -DskipTests=true).
Navigate to the codelab-platform directory: incubator-metron-codelab-v1.0/metron-deployment/vagrant/codelab-platform/
Follow the instructions here: https://github.com/apache/incubator-metron/tree/codelab-v1.0/metron-deployment/vagrant/codelab-platform. Note: The Metron Development Image is named launch_image.sh not launch_dev_image.sh.
Step 2: Create a Kafka Topic for the New Data Source
Every data source whose events you are streaming into Metron must have its own Kafka topic. The ingestion tool of choice (for example, Apache Nifi) will push events into this Kafka topic.
ssh to your VM
vagrant ssh
Create a Kafka topic called "squid" in the directory /usr/hdp/current/kafka-broker/bin/:
cd /usr/hdp/current/kafka-broker/bin/
./kafka-topics.sh --zookeeper localhost:2181 --create --topic squid --partitions 1 --replication-factor 1
List all of the Kafka topics to ensure that the new topic exists:
./kafka-topics.sh --zookeeper localhost:2181 --list
You should see the following list of Kafka topics:
bro
enrichment
pcap
snort
squid
yaf
Step 3: Install Squid
Install and start Squid:
sudo yum install squid
sudo service squid start
With Squid started, look at the different log files that get created:
sudo su -
cd /var/log/squid
ls
You see that there are three types of logs available: access.log, cache.log, and squid.out. We are interested in access.log because that is the log that records the proxy usage.
Initially the access.log is empty. Let's generate a few entries for the log, then list the new contents of the access.log. The "-h 127.0.0.1" indicates that the squidclient will only use the IPV4 interface.
squidclient -h 127.0.0.1 http://www.hostsite.com
squidclient -h 127.0.0.1 http://www.hostsite.com
cat /var/log/squid/access.log
In production environments you would configure your users' web browsers to point to the proxy server, but for the sake of simplicity in this tutorial we will use the client that is packaged with the Squid installation. After we use the client to simulate proxy requests, the Squid log entries should look as follows:
1461576382.642 161 127.0.0.1 TCP_MISS/200 103701 GET http://www.hostsite.com/ - DIRECT/199.27.79.73 text/html
1461576442.228 159 127.0.0.1 TCP_MISS/200 137183 GET http://www.hostsite.com/ - DIRECT/66.210.41.9 text/html
Using the Squid log entries, we can determine the format of the log entries, which is:
timestamp | time elapsed | remotehost | code/status | bytes | method | URL | rfc931 | peerstatus/peerhost | type
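As a quick sanity check of this format, the ten columns can be labeled by splitting a log line on whitespace. This is only a sketch to make the layout concrete; the field names are informal labels chosen for the example, not the names Metron uses.

```python
FIELDS = ["timestamp", "elapsed", "remotehost", "code_status", "bytes",
          "method", "url", "rfc931", "peerstatus_peerhost", "type"]


def split_squid_line(line):
    """Split one access.log line on whitespace and label each column."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


line = ("1461576382.642 161 127.0.0.1 TCP_MISS/200 103701 GET "
        "http://www.hostsite.com/ - DIRECT/199.27.79.73 text/html")
record = split_squid_line(line)
print(record["code_status"], record["peerstatus_peerhost"])
```

The rfc931 column is the "-" in the sample entries, which is why it is easy to overlook when reading the log by eye.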
Step 4: Create a Grok Statement to Parse the Squid Telemetry Event
Now we are ready to tackle the Metron parsing topology setup.
The first thing we need to do is decide if we will be using the Java-based parser or the Grok-based parser for the new telemetry. In this example we will be using the Grok parser. Grok parser is perfect for structured or semi-structured logs that are well understood (check) and telemetries with lower volumes of traffic (check).
Next we need to define the Grok expression for our log. Refer to Grok documentation for additional details. In our case the pattern is:
WDOM [^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)
SQUID_DELIMITED %{NUMBER:timestamp} %{SPACE:UNWANTED} %{INT:elapsed} %{IPV4:ip_src_addr} %{WORD:action}/%{NUMBER:code} %{NUMBER:bytes} %{WORD:method} http:\/\/\www.%{WDOM:url}\/ - %{WORD:UNWANTED}\/%{IPV4:ip_dst_addr} %{WORD:UNWANTED}\/%{WORD:UNWANTED}
Notice that we define a WDOM pattern (more tailored to Squid than the generic Grok URL pattern) before defining the Squid log pattern. This is optional and is done for ease of use. Also, notice that we apply the UNWANTED tag to any part of the message that we don't want included in our resulting JSON structure. Finally, notice that we applied the naming convention to the IPV4 field by referencing the following list of field conventions.
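For readers less familiar with Grok, the same idea can be approximated with a plain regular expression that uses named groups. The sketch below is not the Grok pattern itself and is not how Metron evaluates it: the regex is an approximation written for this example (it also accepts https and an optional www.), and parts marked UNWANTED in the Grok pattern are simply left unnamed.

```python
import json
import re

# Approximate, hand-written analogue of SQUID_DELIMITED (an assumption, not Grok).
SQUID_LINE = re.compile(
    r"(?P<timestamp>\d+(?:\.\d+)?)\s+"
    r"(?P<elapsed>\d+)\s+"
    r"(?P<ip_src_addr>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"(?P<action>\w+)/(?P<code>\d+)\s+"
    r"(?P<bytes>\d+)\s+"
    r"(?P<method>\w+)\s+"
    r"https?://(?:www\.)?(?P<url>[^/\s]+)/\S*\s+-\s+"
    r"\w+/(?P<ip_dst_addr>\d{1,3}(?:\.\d{1,3}){3})\s+"
    r"\w+/\w+"
)


def parse_squid(line):
    """Return the named fields as a dict, or None when the line does not match."""
    match = SQUID_LINE.match(line)
    return match.groupdict() if match else None


line = ("1461576382.642 161 127.0.0.1 TCP_MISS/200 103701 GET "
        "http://www.hostsite.com/ - DIRECT/199.27.79.73 text/html")
parsed = parse_squid(line)
print(json.dumps(parsed, indent=2))
```

The named groups play the role of the Grok field names, so the output has the same keys (timestamp, elapsed, ip_src_addr, action, code, bytes, method, url, ip_dst_addr) that the pattern above assigns.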
The last thing we need to do is to validate the Grok pattern to make sure it's valid. For our test we will be using a free Grok validator called Grok Constructor. A validated Grok expression should look like this:
Now that the Grok pattern has been defined, we need to save it and move it to HDFS. Create a file called "squid" in the tmp directory and copy the Grok pattern into the file.
touch /tmp/squid
vi /tmp/squid
//copy the grok pattern above to the squid file
Now put the squid file into the directory where Metron stores its Grok parsers. Existing Grok parsers that ship with Metron are staged under /apps/metron/patterns/.
su - hdfs
hdfs dfs -put /tmp/squid /apps/metron/patterns/
exit
Step 5: Create a Flux configuration for the new Squid Storm Parser Topology
Now that the Grok pattern is staged in HDFS, we need to define the Storm Flux configuration for the Metron parsing topology. The configs are staged under /usr/metron/0.1BETA/config/topologies/ and each parsing topology has its own set of configs. Each directory for a topology has a remote.yaml, which is designed to be run on AWS, and a local/test.yaml, which is designed to run locally on a single-node VM. Since we are going to be running locally on a VM we need to define a test.yaml for Squid. The easiest way to do this is to copy one of the existing Grok-based configs (YAF) and tailor it for Squid.
mkdir /usr/metron/0.1BETA/flux/squid
cp /usr/metron/0.1BETA/flux/yaf/remote.yaml /usr/metron/0.1BETA/flux/squid/remote.yaml
vi /usr/metron/0.1BETA/flux/squid/remote.yaml
Then edit your config to look like this (replace yaf with squid and replace the constructorArgs section):
name: "squid"
config:
  topology.workers: 1

components:
  - id: "parser"
    className: "org.apache.metron.parsers.GrokParser"
    constructorArgs:
      - "/apps/metron/patterns/squid"
      - "SQUID_DELIMITED"
    configMethods:
      - name: "withTimestampField"
        args:
          - "timestamp"
  - id: "writer"
    className: "org.apache.metron.parsers.writer.KafkaWriter"
    constructorArgs:
      - "${kafka.broker}"
  - id: "zkHosts"
    className: "storm.kafka.ZkHosts"
    constructorArgs:
      - "${kafka.zk}"
  - id: "kafkaConfig"
    className: "storm.kafka.SpoutConfig"
    constructorArgs:
      # zookeeper hosts
      - ref: "zkHosts"
      # topic name
      - "squid"
      # zk root
      - ""
      # id
      - "squid"
    properties:
      - name: "ignoreZkOffsets"
        value: true
      - name: "startOffsetTime"
        value: -1
      - name: "socketTimeoutMs"
        value: 1000000

spouts:
  - id: "kafkaSpout"
    className: "storm.kafka.KafkaSpout"
    constructorArgs:
      - ref: "kafkaConfig"

bolts:
  - id: "parserBolt"
    className: "org.apache.metron.parsers.bolt.ParserBolt"
    constructorArgs:
      - "${kafka.zk}"
      - "squid"
      - ref: "parser"
      - ref: "writer"

streams:
  - name: "spout -> bolt"
    from: "kafkaSpout"
    to: "parserBolt"
    grouping:
      type: SHUFFLE
Step 6: Deploy the new Parser Topology
Now that we have the Squid parser topology defined, let's deploy it to our cluster.
Deploy the new squid parser topology:
sudo storm jar /usr/metron/0.1BETA/lib/metron-parsers-0.1BETA.jar org.apache.storm.flux.Flux --filter /usr/metron/0.1BETA/config/elasticsearch.properties --remote /usr/metron/0.1BETA/flux/squid/remote.yaml
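The --filter option points Flux at a properties file whose values replace the ${...} placeholders in the YAML, such as ${kafka.zk} and ${kafka.broker}. The following sketch imitates that substitution so you can preview the result; it is not Flux's implementation, the helper names are made up, and the property values shown are assumptions based on the host and ports used elsewhere in this article.

```python
import re


def load_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def substitute(yaml_text, props):
    """Replace each ${key} that has a value in props; leave unknown keys untouched."""
    return re.sub(r"\$\{([^}]+)\}",
                  lambda m: props.get(m.group(1), m.group(0)),
                  yaml_text)


# Assumed values; the real ones live in elasticsearch.properties on the VM.
props = load_properties("kafka.zk=node1:2181\nkafka.broker=node1:6667\n")
print(substitute('- "${kafka.zk}"\n- "${kafka.broker}"', props))
```

If a placeholder is left unresolved in the preview, the corresponding key is missing from the properties file.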
If you currently have four topologies in Storm, you need to kill one to make a worker available for Squid. To do this, from the Storm UI, click the name of the topology you want to kill in the Topology Summary section, then click Kill under Topology Actions. Storm will kill the topology and make a worker available for Squid.
Go to the Storm UI. You should now see the new "squid" topology; ensure that the topology has no errors.
This squid processor topology will ingest from the squid Kafka topic we created earlier and then parse the event with Metron's Grok framework using the grok pattern we defined earlier. The result of the parsing is a standard JSON Metron structure that then gets put on the "enrichment" Kafka topic for further processing.
But how do the squid events in the access.log get put into the "squid" Kafka topic so that the parser topology can parse them? We will do that using Apache Nifi.
Using Apache Nifi to Stream data into Metron
Put simply, NiFi was built to automate the flow of data between systems. Hence it is a fantastic tool to collect, ingest and push data to Metron. The instructions below show how to install, configure and create the Nifi flow to push squid events into Metron.
Install, Configure and Start Apache Nifi
The following shows how to install Nifi on the VM. Do the following as root:
Download Nifi:
cd /usr/lib
wget http://public-repo-1.hortonworks.com/HDF/centos6/1.x/updates/1.2.0.0/HDF-1.2.0.0-91.tar.gz
tar -zxvf HDF-1.2.0.0-91.tar.gz
Edit Nifi Configuration to update the port of the nifi web app: nifi.web.http.port=8089
cd HDF-1.2.0.0/nifi
vi conf/nifi.properties
//update nifi.web.http.port to 8089
Install Nifi as service
bin/nifi.sh install nifi
Start the Nifi Service
service nifi start
Go to the Nifi Web: http://node1:8089/nifi/
Create a Nifi Flow to stream events to Metron
Now we will create a flow to capture events from squid and push them into metron
Drag a processor to the canvas (do this by dragging the processor icon, which is the first icon)
Search for TailFile processor and select Add. Right click on the processor and configure. In settings tab change the name to "Ingest Squid Events"
In properties, configure the following:
Drag another processor to the canvas
Search for PutKafka and select Add
Right click on the processor and configure. In Settings, change the name to "Stream to Metron" and click the checkboxes for the failure and success relationships.
Under properties, set 3 properties
Known Brokers: node1:6667
Topic Name: squid
Client Name: nifi-squid
Create a connection by dragging the arrow from Ingest Squid Events to Stream to Metron
Select the entire flow and click the play button. You should see all processors turn green like the below:
Generate some data using squidclient (do this for about 20+ sites)
squidclient http://www.hostsite.com
You should see metrics on the processor of data being pushed into Metron.
Look at the Storm UI for the parser topology and you should see tuples coming in
After about 5 minutes, you should see a new Elastic Search index called squid_index* in the Elastic Admin UI
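Conceptually, the flow built above does two things: it picks up lines appended to access.log since the last read, and it publishes each new line to the "squid" topic. The sketch below shows that idea with a byte offset and a callback. It is not how NiFi implements TailFile or PutKafka; the publish callback is a hypothetical stand-in for a Kafka producer, and a temporary file is used in place of /var/log/squid/access.log.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (lines appended since the byte offset, new byte offset)."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read()
    return data.decode("utf-8").splitlines(), offset + len(data)


def forward(path, offset, publish):
    """Publish every new line to the 'squid' topic and return the updated offset."""
    lines, offset = read_new_lines(path, offset)
    for line in lines:
        publish("squid", line)  # hypothetical publisher(topic, message)
    return offset


# Demo with a temporary file instead of the real access.log.
sent = []
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as log:
    log.write("first entry\n")
log_path = log.name
offset = forward(log_path, 0, lambda topic, msg: sent.append((topic, msg)))
with open(log_path, "a") as log:
    log.write("second entry\n")
offset = forward(log_path, offset, lambda topic, msg: sent.append((topic, msg)))
os.remove(log_path)
print(sent)
```

Keeping the offset between calls is what prevents entries from being sent twice. This sketch assumes each read ends on a complete line.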
Verify Events are Indexed
By convention the index where the new messages will be indexed is called squid_index_[timestamp] and the document type is squid_doc.
In order to verify that the messages were indexed correctly, we can use the Elasticsearch Head plugin.
1. Install the head plugin:
/usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head/1.x
You should see the message: Installed mobz/elasticsearch-head/1.x into /usr/share/elasticsearch/plugins/head
2. Navigate to elastic head UI: http://node1:9200/_plugin/head/
3. Click on Browser tab and select squid doc on the left panel and then select one of the sample docs. You should see something like the following:
Configure Metron UI to view the Squid Telemetry Events
Now that we have Metron configured to parse, index and persist telemetry events and Nifi pushing data to Metron, let's visualize this streaming telemetry data in the Metron UI.
Go to the Metron UI.
Add a New Pinned query
Click the + to add new pinned query
Create a query: _type: squid_doc
Click the colored circle icon, name the saved query and click Pin. See below
Add a new histogram panel for the Squid events
Click the add panel + icon
Select histogram panel type
Set title as "Squid Events"
Change Time Field to: timestamp
Configure span to 12
In the queries dropdown select "Selected" and only select the "Squid Events" pinned query
Click Save. You should see data in the histogram
You should now see the new Squid events
What Next?
The next article in the series covers Enriching Telemetry Data.
05-02-2016
05:22 PM
3 Kudos
One of the key design principles of Apache Metron is that it should be easily extensible. We envision many users using Metron as a platform and building custom capabilities on top of it, one of which will be adding new telemetry data sources. In this multi-part article series, we will walk you through how to add a new telemetry data source: Squid proxy logs. This multi-part article series consists of the following:
This Article: Sets up the use case for this multi-part article series.
Use Case 1: Collecting and Parsing Telemetry Events - This tutorial walks you through how to collect/ingest events into Metron and then parse them.
Use Case 2: Enriching Telemetry Data - Describes how to enrich elements of telemetry events with Apache Metron.
Use Case 3: Adding/Enriching/Validating with Threat Intel Feeds - Describes how to add new threat intel feeds to the system and how those feeds can be used to cross-reference every telemetry event that comes in. When a hit occurs, an alert will be generated and displayed on the Metron UI.
Setting up the Use Case Scenario
Customer Foo has installed Metron TP1 and they are using the out-of-the-box data sources (PCAP, YAF/Netflow, Snort, and Bro). They love Metron! But now they want to add a new data source to the platform: Squid proxy logs.
Customer Foo's Requirements
The following are the customer's requirements for Metron with respect to this new data source:
1. The proxy events from Squid logs need to be ingested in real-time.
2. The proxy logs must be parsed into a standardized JSON structure that Metron can understand.
3. In real-time, the Squid proxy event needs to be enriched so that the domain names are enriched with the IP information.
4. In real-time, the IP within the proxy event must be checked for threat intel feeds.
5. If there is a threat intel hit, an alert needs to be raised.
6. The end user must be able to see the new telemetry events and the alerts from the new data source.
7. All of these requirements will need to be implemented easily without writing any new Java code.
What is Squid?
Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently-requested web pages. For more information on Squid see Squid-cache.org.
How Metron Enriches a Squid Telemetry Event
When you make an outbound HTTP connection to https://www.cnn.com from a given host, the following entry is added to a Squid file called access.log.
The following represents the magic that Metron will do to this telemetry event as it is streamed through the platform in real-time:
Key Points
Some key points to highlight as you go through this multi-part article series:
We will be adding a net new data source without writing any code.
Metron strives for easy extensibility and this is a good example of it.
This is a repeatable pattern for a majority of telemetry data sources.
Read the next article, on how to collect and push data into Metron and then parse data in the Metron platform: Collecting and Parsing Telemetry Data.
04-28-2016
06:29 PM
On my new MacBook Pro, I saw the same issue this morning. I followed Casey's directions and they fixed it for me. Until we have a better fix, I have updated the documentation to instruct that Ansible 2.0.0.2 be installed: https://community.hortonworks.com/articles/24818/metron-tech-preview-1-install-instructions-on-sing.html
04-27-2016
09:22 AM
Good question @Matt McKnight. We will have support for Solr indexing services in Metron TP2, which is slated for end of May. However, in TP2 we will still only support the Metron UI that is based on Kibana (which is based on Elastic). This will change in subsequent releases. So net net, by middle/end of May we will support Solr indexing, but you would have to write the UI that calls the Solr APIs for search queries. Further down the line, we will provide a custom UI (away from Kibana) that uses Solr to do search. Make sense?
04-13-2016
02:07 PM
Good feedback @Hakan Akansel. I updated the article to be more clear on where the event gets persisted.
04-13-2016
02:01 PM
2 Kudos
@nbalaji-elangovan This error would indicate that you might not have built all the projects via Maven. Can you make sure you ran mvn package -DskipTests from the incubator-metron-Metron_0.1BETA_rc7/metron-streaming directory?
04-13-2016
01:57 PM
@rmckissick Please send the full ansible.log file located in incubator-metron-Metron_0.1BETA_rc7/deployment/amazon-ec2. Before you send the ansible.log, please sanitize any EC2 instance names; you don't want to publish those to the entire community.
04-12-2016
01:16 PM
I ran into the following error when following these instructions: 2016-04-12 05:42:59,328 p=2472 u=gvetticaden | fatal: [obfuscated_ip]: UNREACHABLE! => {"changed": false, "msg": "SSH encountered an unknown error during the connection. We recommend you re-run the command using -vvvv, which will enable SSH debugging output to help diagnose the issue", "unreachable": true} To fix this issue, see the following thread: https://community.hortonworks.com/questions/24344/aws-unreachable-error-when-executing-metron-instal.html
04-06-2016
01:33 AM
7 Kudos
Platform Theme Key Features
Fully Automated Scripted Install of Metron on AWS
One of the largest hurdles we have heard about from the community and customers working with the original OpenSoc code base was that it was nearly impossible to get the application up and running. Hence, our engineering team collaborated with the community to provide a scripted, automated install of Metron on AWS. The install only requires the user’s AWS credentials, a set of Ansible scripts/playbooks, Ambari Blueprints/APIs and AWS APIs to deploy the full end-to-end Metron application. The table below summarizes the steps that occur during the automated install.

Step | Description | Components Deployed
---|---|---
Step 1 | Spin up EC2 instances where HDP and Metron will be installed and deployed | 10 m4.xlarge instances
Step 2 | Spin up an AWS VPC | 1 AWS VPC
Step 3 | Install Ambari Server and Agents via Ansible scripts | Ambari Server 2.1.2.1 on master node; Ambari Agents on slave nodes
Step 4 | Using Ambari Blueprints and APIs, install a 7 node HDP 2.3 cluster with the following services: HDFS, YARN, Zookeeper, Storm, HBase, and Kafka. The blueprint used to deploy the HDP cluster can be found here: Metron Small Cluster Ambari BluePrint | 7 Node HDP Cluster; HDP Services: HDFS, YARN, Zookeeper, Storm, HBase & Kafka
Step 5 | Install 2 node Elastic Search cluster | 2 Node ES 1.7 Cluster
Step 6 | Install and start the following data source probes: Bro, Snort, PCAP probe, YAF (netflow). This entails the following: install and start the C++ PCAP probe that captures PCAP data and pushes it into a Kafka topic; install and start the YAF probe to capture netflow data; install Bro and the Kafka Bro plugin and start these services; install and start Snort with community Snort rules configured | C++ PCAP Probe; YAF/Netflow Probe; Bro Server and Bro Kafka Plugin; Snort Server
Step 7 | Deployment of 5 Metron Storm topologies: 4 parser topologies, one for each data source supported (PCAP, Bro, YAF, Snort); 1 common enrichment topology | Install and Deployment of 5 Storm Topologies
Step 8 | Configuration of Kafka topics and HBase tables |
Step 9 | Install MySQL to store GeoIP enrichment data. The MySQL DB will be populated with GeoIP information from Maxmind Geolite | Install of MySQL with GeoIP information
Step 10 | Installation of a Metron UI for the SOC Analyst and Investigator persona | Metron UI (Kibana Dashboard)

Deployment Architecture After Install
The installer will take about 60-90 minutes to execute fully. However, it could vary drastically based on how AWS is feeling during the execution. After the installer finishes, the deployment architecture of the app will look like the following.

Metron Storm Topology Refactor / Re-Architecture
Another area of focus for Metron TP1 was to address the following challenges with the old OpenSoc topology architecture:
- Code was extremely brittle
- Storm topologies were designed without taking advantage of full parallelism
- Numerous “redundant” topologies
- Management of the app was difficult due to a number of complex topologies
- Very complex to add new data sources to the platform
- Very little unit and integration testing

Some key re-architecture and refactor work done in TP1 to address these challenges was the following:
- Made the Metron code base simpler and easier to maintain by converting all Storm topologies to use Flux configuration (a declarative way to wire topologies together).
- Ability to add new data source parsers without writing code using the Grok Framework parser.
- Enrichment, model and threat intel cross reference are now done in parallel as opposed to sequentially in the Storm configuration.
- Minimized the incremental costs of adding new topologies by having one common enrichment topology for all data sources.
- All app configuration is stored in Zookeeper, allowing one to manage app config at runtime without stopping the topology.
- Improved code with new unit and integration test harness utilities.

Old OpenSoc Architecture
In the old OpenSoc architecture, some key limitations were the following:
- For every new data source, a new complex Storm topology had to be added
- Each enrichment, threat intel reference and model execution was done sequentially
- No in-memory caching for enrichments or threat intel checks
- No loader frameworks to load enrichment or threat intel stores

The below diagram illustrates the old architecture.

New Metron Architecture
With the new Metron architecture, the key changes are:
- Adding a new data source means simply adding a new normalizing/parser topology
- 1 common enrichment topology can be used for all data sources
- Using the Splitter/Joiner pattern, enrichments/models/threat intel execution is done in parallel
- Loader frameworks have been added to load the Enrichment and Threat Intel Stores
- Fast Cache has been added for enrichment and threat intel lookups

The below diagram illustrates the new architecture.

Telemetry Data Source Theme Key Features
PCAP - Packet Capture
PCAP represents the most granular data collected in Metron, consisting of individual packets and frames. Metron uses DPDK, which provides a set of libraries and drivers for fast packet collection and processing. See the following for more details: Metron Packet Capture Probe Design

YAF/Netflow
Netflow data represents PCAP data rolled up to the flow/session level: a summary of the sequence of packets between two machines up to the layer 4 protocol. If one doesn’t want to ingest PCAP due to space constraints and the load exerted on infrastructure, then netflow is recommended. Metron uses YAF (Yet Another Flowmeter) to generate IPFIX (Netflow) data from Metron’s PCAP probe. Hence the output of the YAF probe is IPFIX instead of the raw packets. See the following for more details: Metron YAF Capture Design

Bro
Bro is an IDS (Intrusion Detection System), but Metron uses Bro primarily as a Deep Packet Inspection (DPI) metadata generator. The metadata consists of network activity details up to layer 7, which is the application-level protocol (DNS, HTTP, FTP, SSH, SSL). Extracting DPI metadata (layer 7 visibility) is expensive and thus is performed only on selected protocols. Hence, the recommendation is to turn on DPI for the HTTP and DNS protocols. So while the PCAP probe records every single packet it sees on the wire, the DPI metadata is extracted only for a subset of these packets. This metadata is one of the most valuable sources of network data for analytics.
See the following for more details: Metron Bro Capture Design

Snort
Snort is a popular Network Intrusion Prevention System (NIPS). Snort monitors network traffic and produces alerts that are generated based on signatures from community rules. Metron plays the output of the packet capture probe to Snort, and whenever Snort alerts are triggered, Metron uses Apache Flume to pipe these alerts to a Kafka topic. See the following for more details: Metron Snort Capture Design

Why are these Network Telemetry Sources Important?
A common question is why we focused first on this initial set of network telemetry data sources. Keep in mind that the end vision of Apache Metron is to be an analytics platform. These 4 network telemetry data sources are some of the key data sources required for some of the next generation ML, MLP and statistical models that we are planning to build in future releases. The table below describes some of these models and their data input requirements.

Analytics Pack | Analytics Pack Description | Telemetry Data Source Required
---|---|---
Domain Pack | A collection of Machine Learning models that identify anomalies for incoming and outgoing connections made to a specific domain that appear to be malicious | Bro
UEBA Pack | A collection of Machine Learning models that monitor assets and users known to be legitimate to identify anomalies from their normal behavior. | Bro; User Enrichment; Asset Enrichment; User Auth Logs; Asset Inventory Logs
Relevancy/Correlation Engine Pack | A collection of Machine Learning models that identify alerts that are related within the massive volumes of alerts being processed by the cyber solutions. | Snort; Suricata; Third Party Alerts
Protocol Anomaly Pack | A collection of Machine Learning models that identify whether there is anything unusual about network traffic monitored via deep packet inspection (PCAP) | PCAP; YAF/Netflow; Bro

The system is configurable so that one can enable only the data sources of interest.
In future Metron tech previews, we will be adding support for these types of security data sources:
FireEye Palo Alto Network Active Directory BlueCoat SourceFire Bit9 CarbonBlack Lancope Cisco ISE

Real-time Data Processing Theme Key Features
Enrichment Services
The below diagram illustrates the Enrichment framework that was built in Metron TP1. The key components of the framework are:
- Enrichment Loader Framework - A framework that bulk loads or polls data from an enrichment source. The framework supports plugging in any enrichment source.
- Enrichment Store - The store where all enrichment data is stored. HBase will be the primary store. The store will also provide services to de-dup and age data.
- Enrichment Bolt - A Storm bolt that enriches Metron telemetry events.
- Enrichment Cache - Cache used by the bolt so that lookups to the enrichment store are cached.

The specific enrichments supported in Metron TP1 are below.
Enrichment | Description | Enrichment Source, Store, Loader Type, Refresh Rate | Metron Message Field Name that will be Enriched
---|---|---|---
GeoIP | Tags GeoIP information (lat-lon coordinates + City/State/Country) on to any external IP address. This can be applied both to alerts as well as metadata telemetries to be able to map them to a geo location. | Enrich Source: Maxmind Geolite; Metron Store: MySQL (will use HBase in next TP); Loader Type: Bulk load from HDFS; Refresh Rate: Every 3 months | src_ip, dest_ip
Host | Enriches IP with Host details | Enrich Source: Enterprise Inventory/Asset Store; Metron Store: HDFS; Loader Type: Bulk load from HDFS | dest_ip

More details can be found here: Metron Enrichment Services

Threat Intel Services
The Threat Intel framework is very similar to the Enrichment framework. See the below architecture diagram. The specific threat intel services supported in TP1 are below.
Threat Feed | Feed Description | Feed Format | Refresh Rate
---|---|---|---
Soltra | Threat Intel Aggregator | Stix/Taxii | Poll every 5 minutes
Hail a Taxii | Repository of Open Source Cyber Threat Intelligence feeds in STIX format. | Stix/Taxii | Poll every 5 minutes

More details can be found here: Metron Threat Intel Services
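The Enrichment Cache described in the Enrichment Services section (a cache used by the bolt so that repeated lookups do not go back to the enrichment store) can be sketched roughly as follows. This is an illustration only, not Metron's implementation: the store is a plain dictionary standing in for HBase, and `make_cached_lookup`, `slow_store_lookup` and the host values are hypothetical names and data.

```python
from functools import lru_cache


def make_cached_lookup(store_lookup, max_entries=10000):
    """Wrap a store lookup so repeated keys are answered from memory."""
    @lru_cache(maxsize=max_entries)
    def lookup(key):
        return store_lookup(key)
    return lookup


# Hypothetical stand-in for the enrichment store (HBase in the article).
HOST_STORE = {"10.0.0.5": "web-farm-01"}
store_calls = []


def slow_store_lookup(ip):
    store_calls.append(ip)  # record every trip to the backing store
    return HOST_STORE.get(ip)


cached_lookup = make_cached_lookup(slow_store_lookup)

# Four events arrive, three of them for the same IP.
for ip in ["10.0.0.5", "10.0.0.5", "10.0.0.9", "10.0.0.5"]:
    cached_lookup(ip)

# Only the first occurrence of each IP reached the store.
print(store_calls)
```

Note that `lru_cache` evicts entries by recency only. It does not age out data over time, which a real enrichment cache would likely need when the underlying store is refreshed.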
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Metron
- tech-preview
04-06-2016
12:48 AM
9 Kudos
Metron TP1 Features
The following are key capabilities available in Metron TP1, broken up across its four key functional themes.

How do I get Started?
You can spin up Metron TP1 in two ways:
- Ansible based Vagrant Single Node VM Install: This is the best place to play with Metron first. Detailed instructions on how to do the install can be found in the following HCC Article: Apache Metron TP 1 Install Instructions- Single Node Vagrant Deployment
- Fully Automated 10 Node Ansible Based Install on AWS using Ambari Blueprints and AWS APIs: If you want a more realistic setup of the Metron app, use this approach. Keep in mind that this install will spin up 10 m4.xlarge EC2 instances by default. Detailed instructions on how to do the install can be found in the following HCC Article: Apache Metron - First Steps in the Cloud

Where do I get Help?
Hortonworks has created a new track called CyberSecurity in the Hortonworks Community Connection (HCC). The link to this new track in HCC is the following: HCC CyberSecurity Track. Apache Metron committers are subscribed to this track and are constantly monitoring it for any questions the community has on TP1. When asking a question about Metron TP1, please select the “CyberSecurity” Track and add the following tags: “Metron” and “tech-preview”.

Platform Theme Features of Metron TP1
The below is a summary of the key platform features added in TP1:

Feature | Related Apache Metron JIRAS
---|---
Support for HDP 2.3 |
Refactor Metron Topologies for Performance, Easier Manageability & Supportability | METRON-56 METRON-33
Fully Automated Install of Metron on AWS on multi-node HDP cluster via Ansible scripts, Ambari blueprints and APIs. | METRON-59 METRON-77 METRON-76 METRON-69 METRON-63 METRON-61 METRON-43 METRON-2
Single Node Vagrant Support for Metron for Development | METRON-21
Unit and Integration Testing Frameworks, Code Test Coverage | METRON-82 METRON-58 METRON-37 METRON-28

Telemetry Data Source Theme Features of Metron TP1
The focus of Metron TP1 is network telemetry data sources as described below. They represent the most valuable granular data one can collect and perform next generation analytics on.
The key data collection features for Metron TP1 are the following:

Feature | Related Apache Metron JIRAS
---|---
PCAP Ingest Data Services - Performant C++ probe that captures network packets and streams them into Kafka, from which they are bulk loaded into Metron | METRON-79 METRON-73 METRON-55 METRON-39
YAF/Netflow Ingest Data Services - Ingests netflow data into Metron | METRON-67 METRON-60
Bro Ingest Data Services - Custom Bro plugin that pushes out DPI (Deep Packet Inspection) metadata into Metron | METRON-25 METRON-73 METRON-64
Snort Ingest Data Services - Stream Snort generated alerts via Flume into Metron | METRON-57
Grok Framework - Ability to add new data sources to Metron without writing new parsing topologies. For each new data source, a grok expression file can be provided to normalize it into a Metron event. | METRON-66

Real-time Data Processing Theme Features of Metron TP1
For this theme, the key features in Metron TP1 are the following:

Feature | Related Apache Metron JIRAS
---|---
Enrichment Services - OOO support for GeoIP and Host enrichments, extensible framework to plug in new enrichments, & management utilities for enrichment data | METRON-32 METRON-43
Threat Intel Services - Integration with Soltra (Threat Intel Aggregator) and Hail a Taxii, management utilities for threat intel (streaming and bulk load, aging out of data) | METRON-35 METRON-50
Alerting Services - Alerts can be fired via a Snort event or intel threat feed hit |
Indexing Services - Support for indexing via ElasticSearch | METRON-36 METRON-56 METRON-66
Storage Services - Persisting all enrichment telemetry data in HDFS and/or HBase | METRON-62 METRON-22

UI Theme Features of Metron TP1
There was less focus on the UI Theme, but Metron TP1 does provide the following new UI features:

Feature | Related Apache Metron JIRAS
---|---
Metron Investigator IO Dashboard for the SOC Analyst and Investigator Personas built on top of Kibana | METRON-72 METRON-77 METRON-81
Histogram Panels for each of the data sources (YAF, Bro, Snort, PCAP) | METRON-60 METRON-52
PCAP panel allows you to search for and download PCAP files | METRON-72 METRON-77 METRON-81
Ability to customize the Metron UI with different data sources and different panel types. | METRON-72 METRON-77 METRON-81
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Metron
- tech-preview
04-05-2016
11:04 PM
3 Kudos
Metron User Personas
There are six user personas for Metron:

SOC Analyst
Profile: Beginner, junior-level analyst
Tools Used: SIEM tools/dashboards, security endpoint UIs, email/ticketing/workflow systems
Responsibilities: Monitor security SIEM tools; search/investigate breaches and malware; review alerts and determine whether to escalate them as tickets or filter them out; follow security playbooks; investigate script kiddie attacks.

SOC Investigator
Profile: More advanced SME in cybersecurity; experienced security analyst who understands more advanced features of security tools; thorough understanding of networking and platform architecture (routers, switches, firewalls, security); ability to dig through and understand various logs (network, firewall, proxy, app, etc.)
Tools Used: SIEM/security tools, scripting languages, SQL, command line
Responsibilities: Investigate more complicated/escalated alerts; investigate breaches; take the necessary steps to remove/quarantine the malware, breach or infected system; hunt for malware attacks; investigate more complicated attacks like APTs (Advanced Persistent Threats)

SOC Manager
Profile: Experience managing teams; security practitioner who has moved into management.
Tools Used: Workflow systems (e.g. Remedy, JIRA), ticket/alerting systems
Responsibilities: Assigns Metron cases to analysts. Verifies “completed” Metron cases.

Forensic Investigator
Profile: E-discovery experience with security background.
Tools Used: SIEM and e-discovery tools
Responsibilities: Collect evidence on breach/attack incidents; prepare lawyer’s response to breach.

Security Platform Operations Engineer
Profile: Computer Science, developer, and/or DevOps background. Experience with Big Data technologies and supporting distributed applications/systems
Tools Used: Security tools (SIEM, endpoint solutions, UEBA solutions); provisioning, management and monitoring tooling; various programming languages; Big Data and distributed computing platforms.
Responsibilities: Helps vet different security tools before bringing them into the enterprise. Establishes best practices and reference architecture with respect to provisioning, management and use of the security tools. Configures the system with respect to deployment, monitoring, etc. Maintains the probes to collect data, enrichment services, loading of enrichment data, management of threat feeds, etc. Provides care and feeding of one or more point security solutions. Does capacity planning, system maintenance and upgrades.

Security Data Scientist
Profile: Computer Science / Math background, security domain experience; digs through as much data as is available, looks for patterns and builds models
Tools Used: Python (scikit-learn, Python Notebook), R, RStudio, SAS, Jupyter, Spark (SparkML)
Responsibilities: Works with security data performing data munging, visualization, plotting, exploration, feature engineering and generation; trains, evaluates and scores models

Why Metron? SOC Analyst & Investigator Perspective
The above diagram illustrates the key steps in a typical analyst/investigator workflow. For certain steps in this workflow, Apache Metron provides key capabilities not found in traditional security tools:

Looking through Alerts
- Centralized Alerts Console - Having a centralized dashboard for alerts and the telemetry events associated with the alert across all security data sources in your enterprise is a powerful feature within Metron that prevents the analyst from jumping from one console to another.
- Meta Alerts - The long term vision of Metron is to provide a suite of analytical models and packs including Alerts Relevancy Engine and Meta-Alerts. Meta Alerts are generated by groupings or analytics models and provide a mechanism to shield the end user from being inundated with 1000s of granular alerts.
- Alerts labeled with threat intel data - Viewing alerts labeled with threat intel from third party feeds allows the analyst to decipher more quickly which alerts are legitimate vs false positives.

Collecting Contextual data
- Fully enriched messages - Analysts spend a lot of time manually enriching the raw alerts or events. With Metron, analysts work with the fully enriched message.
- Single Pane of Glass UI - A single pane of glass that not only has all alerts across different security data sources but also provides the enriched data in the same view.
- Centralized real-time search - All alerts and telemetry events are indexed in real-time. Hence, the analyst has immediate access to search for all events.
- All logs in one place - All events with the enrichments and labels are stored in a single repository.

Investigate
- Granular access to PCAP - After identifying a legitimate threat, more advanced SOC investigators want the ability to download the raw packet data that caused the alert. Metron provides this capability.
- Replay old PCAP against new signatures - Metron can be configured to store raw pcap data in Hadoop for a configurable period of time. This corpus of pcap data can then be replayed to test new analytical models and new signatures.
- Tag Behavior for modeling by data scientists
- Raw messages used as evidentiary store
- Asset inventory and User Identity as enrichment sources.

Note that the above 3 steps in the analyst workflow make up approximately 70% of the time. Metron will drastically decrease the time spent on the analyst workflow because everything the SOC analyst needs to know is in a single place.

Why Metron? Data Scientist Perspective
The above diagram illustrates the key steps in a typical data science workflow. For certain steps in this workflow, Apache Metron provides key capabilities not found in traditional security tools:

Finding the data
- All my data is in the same place - One of the biggest challenges faced by security data scientists is finding the data required to train, evaluate and score models. Metron provides a single repository where the enterprise’s security telemetry data are stored.
- Data exposed through a variety of APIs - The Metron security vault/repository provides different engines to access and work with the data including SQL, scripting languages, in-memory, java, scala, key-value columnar, REST APIs, User Portals, etc.
- Standard Access Control Policies - All data stored in the Metron security vault is secured via Apache Ranger through access policies at a file system level (HDFS) and at processing engine level (Spark, Hive, HBase, Solr, etc.)

Cleaning the data
- Metron normalizes telemetry events - As discussed in the first blog where we traced an event being processed by the platform, Metron normalizes all telemetry data into at least a standard 7 tuple JSON structure, allowing data scientists to find and correlate data together more easily.
- Partial schema validation on ingest - The Metron framework will validate data on ingest and will filter out bad data automatically, which is something that data scientists traditionally spend a lot of time doing.

Munging Data
- Automatic data enrichment - Typically data scientists have to manually enrich data to create and test features or have to work with the data/platform team to do so. With Metron, events are enriched in real-time as they come in and the enriched event is stored in the Metron security vault.
- Automatic application of class labels - Different types of metadata (threat intel information, etc.) are tagged on to the event, which allows the data scientists to create feature matrices for models more easily.
- Massively parallel computation framework - All the cleaning and munging of the data uses distributed technologies that allow the processing of these high velocity, large volumes of data to be performant and scalable.

Visualizing Data
- Real-time search + UI - Metron indexes all events and alerts and provides a UI dashboard to perform real-time search.
- Apache Zeppelin Dashboards - Out of the box Zeppelin dashboards will be available that can be used by SOC analysts. With Zeppelin you can share the dashboards, substitute variables, and quickly change graph types. An example of a dashboard would be to show all HTTP calls that resulted in 404 errors, visualized as a bar graph ordered by the number of failures.
- Integration with Jupyter - Jupyter notebooks will be provided to data scientists for common tasks such as exploration, visualization, plotting, evaluating features, etc.

Note that the above 4 steps in the data science workflow make up approximately 80% of the time. Metron will drastically reduce the time from hypothesis to model for the data scientist.

Apache Metron Core Functional Themes
Now that we have an understanding of Metron’s user personas, we will describe the four core functional themes that Metron will focus on. As the community around Metron continues to grow, new features and enhancements will be prioritized across these four themes. The 4 core functional themes are the following:

Apache Metron Release 0.1 and its Target Personas and Themes
Over the last 4 months, the community, led by Hortonworks, has been hard at work on Apache Metron’s first release (Metron 0.1). Now that we have described the user personas and core themes for Metron, the following depicts where the engineering focus has been for Metron 0.1. As the diagram above illustrates, the key focus areas for Metron 0.1 are the following:
- The Platform theme was the primary focus. Before we can focus on the UI and supporting more telemetry data sources, we need to ensure that the platform is rock solid. This means ensuring an easy way to provision this very complex app, refactor/re-architecture work to make the code simpler and easier to maintain, adding new data sources in a declarative manner, performance and extensibility improvements, and improving the quality of the code.
- The persona of focus is the Security Platform Engineer.
- Metron 0.1 offers dashboard views for the SOC Analyst and SOC Investigator.
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Metron
- tech-preview
- user-personas
04-05-2016
10:11 PM
14 Kudos
Apache Metron Explained
Apache Metron is a cyber security application framework that provides organizations the ability to ingest, process and store diverse security data feeds at scale in order to detect cyber anomalies and enable organizations to rapidly respond to them. As the diagram above indicates, the Metron framework provides 4 key capabilities:
- Security Data Lake / Vault - The platform provides a cost effective way to store enriched telemetry data for long periods of time. This data lake provides the corpus of data required to do feature engineering that powers discovery analytics and provides a mechanism to search and query for operational analytics.
- Pluggable Framework - The platform provides not only a rich set of parsers for common security data sources (pcap, netflow, bro, snort, FireEye, Sourcefire) but also a pluggable framework to add new custom parsers for new data sources, add new enrichment services to provide more contextual info to the raw streaming data, add pluggable extensions for threat intel feeds, and customize the security dashboards.
- Security Application - Metron provides standard SIEM-like capabilities (alerting, threat intel framework, agents to ingest data sources) but also has packet replay utilities, an evidence store and hunting services commonly used by SOC analysts.
- Threat Intelligence Platform - Metron will provide next generation defense techniques that consist of using a class of anomaly detection and machine learning algorithms that can be applied in real-time as events are streaming in.

Tracing the Flow of a Security Telemetry Event through Metron
The below diagram depicts the logical components of the Metron Platform. The below subsection traces an event as it flows through these different logical components.

Step 1a - Telemetry Ingest
For most security telemetry data sources that use transports and protocols like file, syslog, REST, HTTP, custom API, etc., Metron will use Apache NiFi to ingest data at the source. An example would be capturing data from a FireEye appliance with NiFi’s SysLog Processor. The raw FireEye event captured would look something like the following:

Step 1b - Fast Telemetry Ingest
For high volume network telemetry data like packet capture (PCAP), Netflow/YAF, and Bro/DPI, custom Metron probes will be available to ingest data directly from a network tap. An example would be capturing Bro data using the custom C++ Metron probe. The raw Bro event captured by the Bro probe would look something like the following:
Step 2 - Telemetry Ingest Buffer
All raw events from each telemetry security data source captured by Apache Nifi or custom Metron probe will be pushed into its own Kafka topic. The arrival of a telemetry event into the ingest buffer marks the start of where the Metron processing begins.

Step 3 - Process (Parse, Normalize, Validate and Tag)
Each raw event will be parsed and normalized into a standardized flat JSON structure. Every event will be standardized into at least a 7-tuple JSON structure. This is done so the topology correlation engine further downstream can correlate messages from different topologies by these fields. The standard field names are as follows:
- ip_src_addr: layer 3 source IP
- ip_dst_addr: layer 3 dest IP
- ip_src_port: layer 4 source port
- ip_dst_port: layer 4 dest port
- protocol: layer 4 protocol
- timestamp (epoch)
- original_string: A human friendly string representation of the message

At this step, one can also validate the raw event and tag it with additional metadata which will be used by downstream processing. After Step 3, the raw Bro event will look like the following:
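As a rough illustration of this normalization step (not actual Metron output), the sketch below renames the connection fields of a Bro-style record to the seven standard field names and rejects events that are missing any of them. The sample record, the `BRO_FIELD_MAP` mapping and the `normalize` function are assumptions made for this example.

```python
import json

# The seven standard field names listed in Step 3.
STANDARD_FIELDS = (
    "ip_src_addr", "ip_dst_addr", "ip_src_port", "ip_dst_port",
    "protocol", "timestamp", "original_string",
)

# Hypothetical mapping from Bro-style connection fields to the standard names.
BRO_FIELD_MAP = {
    "id.orig_h": "ip_src_addr",
    "id.resp_h": "ip_dst_addr",
    "id.orig_p": "ip_src_port",
    "id.resp_p": "ip_dst_port",
    "proto": "protocol",
    "ts": "timestamp",
}

# Made-up raw record for illustration only.
RAW_BRO_EVENT = {
    "ts": 1449511228.474, "id.orig_h": "192.168.1.10", "id.orig_p": 49185,
    "id.resp_h": "203.0.113.7", "id.resp_p": 80, "proto": "tcp", "method": "GET",
}


def normalize(raw, field_map):
    """Rename known fields, keep the rest, and validate the 7-tuple."""
    event = {field_map.get(key, key): value for key, value in raw.items()}
    event["original_string"] = json.dumps(raw, sort_keys=True)
    missing = [name for name in STANDARD_FIELDS if name not in event]
    if missing:
        raise ValueError("event is missing standard fields: %s" % missing)
    return event


if __name__ == "__main__":
    print(json.dumps(normalize(RAW_BRO_EVENT, BRO_FIELD_MAP), indent=2))
```

Fields that are not part of the 7-tuple (here `method`) pass through unchanged, so source-specific detail is preserved alongside the common fields.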
Step 4 - Enrich Once the raw security telemetry event has been parsed and normalized, the next step is to enrich different data elements of the normalized event. Examples of enrichment are GEO where an external IP address is enriched with GeoIP information (lat/long coordinates + City/State/Country) or HOST enrichment where an IP gets enriched with Host details (e.g: IP corresponds to Host X which is part of a web server farm for an e-commerce application). After Step 4, the enriched Bro event will look something like the following:
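A minimal sketch of the enrichment idea, under stated assumptions: the GEO and HOST stores are plain dictionaries with placeholder values, and the `enrichments.<type>.<field>.<key>` naming of the added fields is hypothetical rather than Metron's actual schema.

```python
# Hypothetical in-memory stand-ins for the GeoIP and host enrichment stores.
GEO_STORE = {
    "203.0.113.7": {"city": "Example City", "country": "Exampleland"},
}
HOST_STORE = {
    "192.168.1.10": {"known_host": "web-farm-01", "role": "e-commerce web server"},
}


def enrich(event, geo_store, host_store):
    """Return a copy of the event with GEO and HOST details added per IP field."""
    enriched = dict(event)
    for field in ("ip_src_addr", "ip_dst_addr"):
        ip = event.get(field)
        for name, store in (("geo", geo_store), ("host", host_store)):
            # An IP that is absent from a store simply adds no fields.
            for key, value in store.get(ip, {}).items():
                enriched["enrichments.%s.%s.%s" % (name, field, key)] = value
    return enriched


normalized_event = {"ip_src_addr": "192.168.1.10", "ip_dst_addr": "203.0.113.7"}
enriched_event = enrich(normalized_event, GEO_STORE, HOST_STORE)
for key in sorted(enriched_event):
    print(key, "=", enriched_event[key])
```

Keeping the enriched fields flat, with the source field in the key, preserves the flat JSON structure from Step 3 while making it clear which IP each enrichment belongs to.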
Step 5 - Label After enrichment, the telemetry event goes through the labeling process. Actions done within this phase include threat intel cross reference checks where elements within the telemetry event can be used to do look ups against threat intel feed data sources like Soltra produced Stix/Taxii feeds or other threat intel aggregator services. These threat intel services will then “label” the telemetry event with threat intel metadata when a hit occurs. Other types of services include executing/scoring analytical models using model as a service pattern with the telemetry events that are flowing in (more details on Analytical Models/Packs and Model as Service patterns will be coming in upcoming blogs of this series). After step 5 assuming the bro telemetry event had a threat intel hit, the message would look something like the following:
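The threat intel cross reference can be sketched as a lookup of selected event fields against a feed. This is an illustration under assumptions: the feed is a dictionary rather than a Stix/Taxii source, and the `threat_intel_hits` and `is_threat` field names are hypothetical.

```python
# Hypothetical threat feed: indicator value -> description of the feed entry.
THREAT_FEED = {"203.0.113.7": "example-feed:malicious-ip"}


def label_with_threat_intel(event, threat_feed,
                            fields=("ip_src_addr", "ip_dst_addr")):
    """Return a copy of the event labeled with any threat intel hits."""
    labeled = dict(event)
    hits = {
        field: threat_feed[event[field]]
        for field in fields
        if event.get(field) in threat_feed
    }
    labeled["threat_intel_hits"] = hits   # which fields matched, and why
    labeled["is_threat"] = bool(hits)     # convenience flag for later steps
    return labeled


event = {"ip_src_addr": "192.168.1.10", "ip_dst_addr": "203.0.113.7"}
print(label_with_threat_intel(event, THREAT_FEED))
```

Recording which field matched, not just that something matched, lets a later step explain to the analyst why the event was flagged.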
Step 6 - Alert and Persist During this phase, certain telemetry events can initiate alerts. These types of telemetry events are then indexed in an alert index store. A telemetry event can spawn an alert triggered by a number of factors including: The event type - The raw telemetry event itself is an alert. For example, any event generated by Snort is an alert so it will automatically be indexed as an alert. Threat intel hit - If raw telemetry event has a threat intel hit, it will be marked as an alert. Also during this step, all enriched and labeled telemetry events are indexed and persisted in Hadoop for long term storage. The storage of these events in Hadoop produces a security data vault within the enterprise that enables next generation analytics to be performed. After step 6, the telemetry event is stored in HDFS and indexed in Elastic/Solr based on configuration. The persisted event in HDFS looks something like the following:
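The two alert triggers named in this step (the event type itself, and a threat intel hit) can be sketched as a simple routing decision. The field names `source_type` and `threat_intel_hits` and the destination names are assumptions for illustration, not Metron configuration.

```python
# Source types whose every event is treated as an alert (Snort, per the text).
ALERT_SOURCE_TYPES = {"snort"}


def is_alert(event):
    """An event is an alert if its source always alerts or it has a threat intel hit."""
    if event.get("source_type") in ALERT_SOURCE_TYPES:
        return True
    return bool(event.get("threat_intel_hits"))


def destinations(event):
    """Every event is persisted and indexed; alerts also go to the alert index."""
    targets = ["hdfs", "search_index"]
    if is_alert(event):
        targets.append("alert_index")
    return targets


print(destinations({"source_type": "snort"}))
print(destinations({"source_type": "bro", "threat_intel_hits": {"ip_dst_addr": "feed"}}))
print(destinations({"source_type": "bro"}))
```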
Step 7 - UI Portal and Data & Integration Services
Steps 1 through 6 provide the mechanism to ingest, parse, normalize, enrich, label, index and store all security telemetry data across a diverse set of data sources in your enterprise into a single security data vault. This allows the Metron platform to provide a set of services that help different types of security users perform their jobs more effectively. Some of these services include:
- Real-time Search and Interactive Dashboards / Portals - A single pane of glass for security operations analysts to view alerts and correlate alerts to the granular telemetry events that caused them.
- Data Modeling / Feature Engineering Services - Since the Metron framework normalizes and enriches the data and stores it in the security data lake (HDFS, HBase) in standardized locations, various analytical models can be provided by the platform. These models will have specifications for the feature matrix required, and hence the process of feature engineering, which is the most complex aspect of analytics, becomes considerably simpler. Data modeling services required for the feature matrix will be provided by tools such as Jupyter, IPython and Zeppelin.
- Integration and Extensibility Layers - One of the most powerful features of the Metron platform is the ability to customize it for your own needs and requirements, which includes:
  - Ingesting new data sources
  - Adding new parsers
  - Adding new enrichment services
  - Adding new threat intel feeds
  - Building, deploying and executing new analytical models
  - Integrating with enterprise workflow engines
  - Customizing the security dashboards and portals

Recap
You should now have a better understanding of the history of Apache Metron and the high-level capabilities of the platform. The next blog in this series will walk you through the different types of users we envision for Apache Metron, the core functional themes, and what the Metron community has been focusing on for the last few months.
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Metron
- tech-preview
04-05-2016
10:11 PM
6 Kudos
Hello from the Metron PM and Eng Team
Today, the Hortonworks Metron product management and engineering team is kicking off a multi-part blog series on Apache Metron, the next-gen security analytics application that Hortonworks is building with the Apache community. Over the next few weeks, we will release a series of blogs that cover the following topics:
- Part 1 - Apache Metron Explained - An overview of Apache Metron that traces a security telemetry event as it flows through the platform.
- Part 2 - Apache Metron User Personas and Why Metron? - Who will be the different users of Apache Metron? What are the core functional themes? What has been the focus for the first release? We will address all three of these questions in this blog.
- Part 3 - Apache Metron Tech Preview 1 - Come and Get It - We will walk you through what the Metron community has been working on for the last 3 months. By the end of this blog, you will have a good understanding of what is in Metron Tech Preview 1 and how to install it, deploy it, and build on top of it.
- Part 4 - Apache Metron UI and Finding a Needle in the Haystack Use Case - We will walk through the Metron UI components and how a SOC analyst would use them for common Metron use cases.
- Part 5 - Deep Dive on Apache Metron Tech Preview 1 - We will double-click on the major functional areas of Metron TP1.
- Part 6 - Apache Metron Vision - With a solid understanding of what TP1 consists of, this blog will provide a glimpse into the roadmap and vision for Apache Metron and what the project will look like by the end of 2016, focusing on the analytics work planned.

Roots of Apache Metron
To understand Apache Metron, we first have to start with the origins of the project, which emerged from the Cisco project called OpenSOC. The diagram below highlights some of the key events in the history of Apache Metron, starting with Cisco OpenSOC.
[Diagram: key events in the history of Apache Metron]

2005 to 2008
The Problem - Cyber crime spiked significantly and a severe shortage of security talent arose. The first companies alerted to this issue were high-profile banks and large organizations holding proprietary information of interest to state-sponsored agents. All of the best investigators and analysts were gobbled up by multinational banking and financial services firms, large hospitals, telcos, and defense contractors.
The Rise of a New Industry, the Managed SOC - Those who could not acquire security talent were still in need of a team. Cisco was sitting on a gold mine of security talent that it had accumulated over the years. Using this talent, it produced a managed service offering around managed security operations centers.

Post 2008
The Age of Big Data Changed Everything - The Age of Big Data arrived, bringing more streaming data, virtualized infrastructure, data centers emitting machine exhaust from VMs, and Bring Your Own Device programs. The amount of data exploded, and so did the cost of the required tools, such as traditional SIEMs. These tools became cost prohibitive as they changed to data-driven licensing structures. Cisco’s ability to operate the managed SOC with these tools was in jeopardy, and security appliance vendors took control of the market.

2013
OpenSOC is Born and Hadoop Matures - Cisco decided to build a toolset of its own. It didn’t just want to replace these tools; it wanted to improve and modernize them, taking advantage of open source. Cisco released its managed SOC service to the community as Hadoop matured and Storm became available. It was a perfect combination of a use case need and technology. OpenSOC was the first project to take advantage of Storm, Hadoop, and Kafka, and to move the legacy ways of working to a forward-looking paradigm.

September 2013 through April 2015
The Origins of Apache Metron - For about 24 months, a Cisco team, led by its chief data scientist James Sirota, with the help of a Hortonworks team, led by platform architect Sheetal Dolas, worked to create a next generation managed SOC service built on top of open source big data technologies. The Cisco OpenSOC managed SOC offering went into production for a number of customers in April of 2015. A short time after, Cisco made a couple of acquisitions that brought in third-party technologies, transforming OpenSOC into a closed source, hardware-based version.

October 2015
OpenSOC Chief Data Scientist Joins Hortonworks - James Sirota, the chief data scientist and lead of the Cisco OpenSOC initiative, leaves Cisco to join Hortonworks. Over the course of the next 4 months, James starts to build a rock star engineering team at Hortonworks focused on building an open-source CyberSecurity application.

December 2015
Metron Accepted into Apache Incubation - Hortonworks, with the help and support of key Apache community partners, including ManTech, B23 and others, submits Metron (renamed from OpenSOC) as an Apache incubator project. In December of 2015, the project is accepted into Apache incubation. Hortonworks and the community innovate at impressive speed to add new features to Apache Metron and harden the platform. The Metron team builds an extensible, open architecture to account for the variety of tools used in customer environments (thousands of firewalls, thousands of domains and a multitude of Intrusion Detection Systems). Metron’s open approach makes it much easier to tailor to the community’s use cases.

April 2016
First Official Release of Apache Metron 0.1 - After 4 months of hard work and rapid innovation by the Metron community, Apache Metron’s first release, Metron 0.1, is cut.

Given Hortonworks’ proven commitment to the Apache Software Foundation process and our track record of creating and leading robust communities, we feel uniquely qualified to bring this important technology and its capabilities to the broader open source community. Without Hortonworks, the Apache Metron project would not exist today!
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Metron
- tech-preview
03-28-2016
07:40 PM
8 Kudos
Introduction
If you are new to Metron or the Metron Tech Preview 1, the following links provide good background to review before walking through the installation:
- Intro to Apache Metron
- What is in Apache Metron Tech Preview 1

Build Instructions
The following steps describe how to install a full working Metron application on a single-node VM with Vagrant. This deployment option is ideal for experimenting with the Metron application. While these instructions should work in most development environments, they were tested on Mac OS X El Capitan.

Prerequisites
On your Macintosh:
1. Install the latest version of VirtualBox.
2. Install the latest version of Vagrant.
3. Install Maven if you don't have it, and define the associated environment variables. For example, add the following to your ~/.bash_profile file:
    export MAVEN_HOME=/Users/rmckissick/Documents/Files/apache-maven-3.3.9
    export PATH=$MAVEN_HOME/bin:$PATH
4. Install Java 1.8 if you don't have it, and define the associated environment variables. For example, add the following to your ~/.bash_profile file:
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home
    export PATH=$JAVA_HOME/bin:$PATH
5. If you installed Maven and Java and edited your profile file in steps 3 and 4, reload .bash_profile:
    source ~/.bash_profile
6. Check your Maven installation:
    mvn --version
   You should see information about Maven, Java, and OS X.
7. Install Ansible, version 2.0 or greater. For example:
    sudo su -
    easy_install pip
    export CFLAGS=-Qunused-arguments
    export CPPFLAGS=-Qunused-arguments
    pip install ansible
    exit
   (exit logs off from root and returns you to your user account.)

Build Apache Metron
Download the 0.1 Metron binaries from here (download the .tar.gz file).
Untar the binaries to a location that will be easy to find later:
    tar -zxvf apache-metron-0.1BETA-RC7-incubating.tar.gz
Build the Metron application:
    cd incubator-metron-Metron_0.1BETA_rc7
    mvn apache-rat:check && cd metron-streaming && mvn clean integration-test && cd ..
The mvn command downloads and builds the Metron components. It should take about 15 minutes, depending on your hardware configuration. When it finishes, you should see a message similar to the following:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Metron-Streaming ................................... SUCCESS [ 31.437 s]
[INFO] Metron-Common ...................................... SUCCESS [04:58 min]
[INFO] Metron-EnrichmentAdapters .......................... SUCCESS [ 14.185 s]
[INFO] Metron-MessageParsers .............................. SUCCESS [ 2.704 s]
[INFO] Metron-Indexing .................................... SUCCESS [ 26.989 s]
[INFO] Metron-Alerts ...................................... SUCCESS [ 4.651 s]
[INFO] Metron-Testing ..................................... SUCCESS [ 9.167 s]
[INFO] Metron-DataLoads ................................... SUCCESS [04:26 min]
[INFO] Metron-Topologies .................................. SUCCESS [03:05 min]
[INFO] Metron-Pcap_Service ................................ SUCCESS [ 43.666 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14:43 min
[INFO] Finished at: 2016-04-26T13:11:09-07:00
[INFO] Final Memory: 122M/1649M
Deploy Metron as a single VM via Vagrant and Ansible:
    cd deployment/vagrant/singlenode-vagrant
    vagrant plugin install vagrant-hostmanager
    vagrant up
The vagrant up process runs through a series of Ansible scripts, installing Ambari, HDP, and Metron on the single-node VM. The process should take about 45 to 60 minutes, depending on your hardware configuration.

Verify That Apache Metron is Deployed Successfully
1. Open Ambari and check that all the services are up. Sign in with the default username and password, both "admin". The Ambari dashboard should look like the following:
   [Screenshot: Ambari dashboard]
2. Verify that four Storm topologies have been deployed: bro, enrichment, snort, and yaf. From Ambari, navigate to Storm -> Quick Links -> Storm UI. You should see the four Storm topologies deployed. The Metron Storm UI should look something like the following:
   [Screenshot: Storm UI with the four Metron topologies]
3. Check that the enrichment topology has emitted some data (this could take a few minutes to show up in the Storm UI). The Storm enrichment topology UI should look something like the following:
   [Screenshot: Storm enrichment topology UI]
4. Go to the Metron UI (at http://node1:5000). Check the indexes to make sure indexing is done correctly and data is visualized. The Metron UI should look something like the following:
   [Screenshot: Metron UI]
5. Check that some data has been written into HDFS for at least one of the data sources:
    vagrant ssh node1
    sudo su hdfs
    hadoop fs -ls /apps/metron/enrichment/indexed

Questions/Issues
If you have any questions or install issues, post your question to the CyberSecurity HCC Track.
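The Storm check in the verification steps above can also be scripted. The following is a minimal Python sketch, assuming the Storm UI REST API is reachable on node1 at port 8744 (an assumption about the HDP default; adjust to your environment) and that its topology summary endpoint returns a JSON object with a "topologies" list whose entries carry "name" and "status" fields. The topology names come from the steps above; the helper names missing_topologies and fetch_summary are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List
from urllib.request import urlopen

# Topology names listed in the verification steps.
EXPECTED_TOPOLOGIES = ("bro", "enrichment", "snort", "yaf")

# Assumption: host and port of the Storm UI on the single-node VM.
STORM_SUMMARY_URL = "http://node1:8744/api/v1/topology/summary"


def missing_topologies(summary: Dict[str, Any],
                       expected: Iterable[str] = EXPECTED_TOPOLOGIES) -> List[str]:
    """Return the expected topology names that are absent or not ACTIVE."""
    active = {
        topology.get("name")
        for topology in summary.get("topologies", [])
        if topology.get("status") == "ACTIVE"
    }
    return [name for name in expected if name not in active]


def fetch_summary(url: str = STORM_SUMMARY_URL, timeout: float = 10.0) -> Dict[str, Any]:
    """Fetch and parse the topology summary from the Storm UI REST API."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling missing_topologies(fetch_summary()) from the machine running Vagrant should return an empty list once all four topologies are up. Keeping the check separate from the HTTP call means it can be exercised against a saved or hand-written summary without a running cluster.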
- Find more articles tagged with:
- CyberSecurity
- How-ToTutorial
- Installation
- Metron
- tech-preview
03-24-2016
03:07 PM
1 Kudo
See the following for how to add test alerts via Snort: https://cwiki.apache.org/confluence/display/METRON/Adding+Dummy+Snort+Data+for+Load+Testing Once you follow those instructions, you should see test Snort alerts in the Alerts Panel. See screenshot.
03-24-2016
03:04 PM
1 Kudo
I ran the Metron Installer for AWS. The Metron UI dashboard shows no alerts. How do I generate some test alerts?
Labels:
- Apache Metron
03-24-2016
02:54 PM
I logged into one of the EC2 nodes where an HDP client was installed and, after switching to the hdfs user, deleted the following folders in HDFS and re-ran the installer. This fixed the issue for me.
hadoop fs -rmr /apps/metron/patterns
hadoop fs -rmr /apps/metron/enrichments
03-24-2016
02:52 PM
1 Kudo
I ran into an issue when I ran the Metron Installer on AWS based on these instructions: https://github.com/apache/incubator-metron/tree/Metron_0.1BETA_rc5/deployment/amazon-ec2
I fixed that issue and re-ran the installer with the command:
ansible-playbook -i ec2.py playbook.yml --skip-tags="wait"
However, I then ran into the following error:
03-24 06:22:36,900 p=68310 u=gvetticaden | fatal: [ec2-54-186-178-244.us-west-2.compute.amazonaws.com]: FAILED! => {"changed": true, "cmd": ["hdfs", "dfs", "-put", "/usr/metron/0.1BETA/config/patterns", "/apps/metron"], "delta": "0:00:02.300088", "end": "2016-03-24 11:22:36.562397", "failed": true, "rc": 1, "start": "2016-03-24 11:22:34.262309", "stderr": "put: `/apps/metron/patterns/asa': File exists\nput: `/apps/metron/patterns/common': File exists\nput: `/apps/metron/patterns/fireeye': File exists\nput: `/apps/metron/patterns/sourcefire': File exists\nput: `/apps/metron/patterns/yaf': File exists", "stdout": "", "stdout_lines": [], "warnings": []}
2016-03-24 06:22:36,904 p=68310 u=gvetticaden | to retry, use: --limit @playbook.retry
Labels:
- Apache Metron
03-24-2016
02:47 PM
1 Kudo
I solved the problem by upgrading VirtualBox from 4.2.4 to 5.0.16. Make sure that you have the latest version of VirtualBox.
03-24-2016
02:46 PM
I'm trying to install Metron with the single node Vagrant installer: https://github.com/apache/incubator-metron/tree/Metron_0.1BETA_rc5/deployment/vagrant/singlenode-vagrant
The steps I ran were the following:
1. Downloaded the RC_5 tech preview candidate from here: http://home.apache.org/~jsirota/metron/Metron_0.1BETA_RC/RC_5/
2. Changed into the source directory:
    cd incubator-metron
3. Built the source:
    mvn apache-rat:check && cd metron-streaming && mvn clean integration-test && cd ..
4. Changed into the Vagrant directory:
    cd deployment/vagrant/singlenode-vagrant
5. Ran the Vagrant scripts:
    vagrant plugin install vagrant-hostmanager
    vagrant up
The error I get when running vagrant up is the following:
Georges-MacBook-Pro-3:singlenode-vagrant gvetticaden$ vagrant up
Bringing machine 'node1' up with 'virtualbox' provider...
==> node1: Box 'bento/centos-6.7' could not be found. Attempting to find and install...
node1: Box Provider: virtualbox
node1: Box Version: >= 0
==> node1: Loading metadata for box 'bento/centos-6.7'
node1: URL: https://atlas.hashicorp.com/bento/centos-6.7
==> node1: Adding box 'bento/centos-6.7' (v2.2.3) for provider: virtualbox
node1: Downloading: https://atlas.hashicorp.com/bento/boxes/centos-6.7/versions/2.2.3/providers/virtualbox.box
==> node1: Successfully added box 'bento/centos-6.7' (v2.2.3) for 'virtualbox'!
==> node1: Importing base box 'bento/centos-6.7'...
==> node1: Matching MAC address for NAT networking...
==> node1: Checking if box 'bento/centos-6.7' is up to date...
==> node1: Setting the name of the VM: singlenode-vagrant_node1_1458751896067_60751
==> node1: Clearing any previously set network interfaces...
There was an error while executing `VBoxManage`, a CLI used by Vagrant
for controlling VirtualBox. The command and stderr is shown below.
Command: ["hostonlyif", "create"]
Stderr: 0%...
Progress state: NS_ERROR_FAILURE
VBoxManage: error: Failed to create the host-only adapter
VBoxManage: error: VBoxNetAdpCtl: Error while adding new interface: failed to open /dev/vboxnetctl: No such file or directory
VBoxManage: error: Details: code NS_ERROR_FAILURE (0x80004005), component HostNetworkInterface, interface IHostNetworkInterface
VBoxManage: error: Context: "int handleCreate(HandlerArg*, int, int*)" at line 68 of file VBoxManageHostonly.cpp
Labels:
- Apache Metron
03-24-2016
02:18 PM
1 Kudo
To re-run the installer faster, add the --skip-tags option to the ansible-playbook command, like the following:
ansible-playbook -i ec2.py playbook.yml --skip-tags="wait"