Member since 09-15-2015
116 Posts
141 Kudos Received
40 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1748 | 02-05-2018 04:53 PM |
| | 2261 | 10-16-2017 09:46 AM |
| | 1980 | 07-04-2017 05:52 PM |
| | 2957 | 04-17-2017 06:44 PM |
| | 2169 | 12-30-2016 11:32 AM |
02-05-2018
04:53 PM
The intention behind this is very much to move towards PCAP query within Zeppelin. This script is effectively a backend that provides access to PCAP query via a Zeppelin interpreter. If you install the sample Zeppelin notebooks, you will find one demonstrating the PCAP capabilities. The notebook is used like this:
... View more
10-16-2017
01:13 PM
Note, however, that Moloch will not give you any compatibility with Metron or Hadoop, so you'll need a separate Moloch cluster.
10-16-2017
09:46 AM
The pycapa probe is used to ingest PCAP, which is then pushed to Kafka. Pycapa (http://metron.apache.org/current-book/metron-sensors/pycapa/index.html) is really intended for test use and is probably good up to around 1 Gbps. In real production you want to use fastcapa (http://metron.apache.org/current-book/metron-sensors/fastcapa/index.html), which does the same job in an accelerated way. The PCAP Metron topology then takes this data and stores it on HDFS in sequence files (run $METRON_HOME/bin/start_pcap_topology.sh to start it).

pcap-replay is a testing tool used to feed sample PCAP data to an interface, which you can then listen to with pycapa or fastcapa.

pcap-service was the backend for an older interface panel, which is no longer really supported. We'll take a look at getting the functionality pushed into the new metron-rest service somewhere on the roadmap; in the meantime, your best bet is to use the query and inspector tools. There are various ways of querying the PCAP data through the CLI tools documented here: http://metron.apache.org/current-book/metron-platform/metron-pcap-backend/index.html, which also has a lot of other useful information about the way PCAP is collected in Metron.

It sounds like you are using monit from your description of services. This is deprecated; please use Ambari to manage services in Metron. I would also recommend using the HCP deployment if you can, rather than a direct Apache build. 3 nodes is also a tiny Metron cluster, so you're unlikely to get the levels of performance needed for anything like full-scale PCAP, but it should be OK for a PoC or test-grade environment.
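If you want to script the topology start rather than run it by hand, a minimal sketch might look like the following. The helper names are hypothetical (they are not part of Metron), and it assumes METRON_HOME is either passed in or set in the environment; the only thing taken from Metron itself is the bin/start_pcap_topology.sh script mentioned above.

```python
import os
import subprocess


def pcap_topology_command(metron_home=None):
    """Return the argv list for the script that starts the PCAP topology.

    Hypothetical helper: it only builds the path to
    bin/start_pcap_topology.sh under METRON_HOME.
    """
    home = metron_home or os.environ.get("METRON_HOME")
    if not home:
        raise RuntimeError("METRON_HOME is not set")
    return [os.path.join(home, "bin", "start_pcap_topology.sh")]


def start_pcap_topology(metron_home=None):
    """Run the start script; raises CalledProcessError on a non-zero exit."""
    subprocess.run(pcap_topology_command(metron_home), check=True)
```

Calling start_pcap_topology() on a node where Metron is installed should be equivalent to running the script directly; the install path you pass in is whatever your own deployment uses.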
07-04-2017
05:52 PM
1 Kudo
Metron primarily uses Storm, Kafka, HDFS, Spark, Zeppelin, ZooKeeper and HBase, which are available in other distributions. However, the primary deployment method is through Ambari, so it tends to work a lot better on HDP. All Hortonworks testing of the platform is done on HDP, so it will certainly be a lot easier to use. For AWS-type deployments, it may also be worth considering HDC, which is essentially HDP but packaged up in the same way as EMR running on Amazon. S3 could also make sense as a long-term storage platform for Metron, replacing the HDFS default.
04-26-2017
12:26 PM
Updated to include HDP 2.5 (required for Metron > 0.3.0)
04-18-2017
04:35 AM
In theory, yes. However, you may want to back up and change the repo locations in Ambari as per the docs, to prevent Ambari overwriting the repo file. Note there is little real difference here in terms of performance of the install. Any packages already present will just be skipped as already installed, even if you start an install completely from scratch again against the local repos.
04-17-2017
06:44 PM
Sounds like you may have some connectivity issues to the public repo. One way to solve this is to download the packages separately and use a local repo. Check out the docs at http://docs.hortonworks.com/HDPDocuments/Ambari-2.5.0.3/bk_ambari-installation/content/using_a_local_repository.html, which show you how to download all the relevant tarballs to set up a local repo server.
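Once the local repo server is up, each node needs a repo definition pointing at it. As a rough sketch only, the snippet below renders a yum-style .repo stanza. The function name, the repo id, the URL and the gpgcheck=0 setting are all placeholder assumptions for illustration, so take the real values from the Hortonworks docs linked above.

```python
import configparser
import io


def render_repo_file(repo_id, name, base_url):
    """Render a yum .repo stanza pointing at a local mirror (illustrative)."""
    # interpolation=None so a '%' in a URL is written through unchanged.
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": name,
        "baseurl": base_url,
        "enabled": "1",
        "gpgcheck": "0",  # assumption: unsigned local mirror
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()


# Hypothetical repo id and mirror URL, shown only to illustrate the shape.
print(render_repo_file("HDP-local", "HDP local mirror",
                       "http://repo.example.internal/hdp/centos7/"))
```

If Ambari manages the repo files, set the base URL there as well, otherwise it may overwrite a hand-written file.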
02-20-2017
10:49 AM
@Sebastian Carroll These options will work in both yarn-client and yarn-cluster mode. What you will need to do is ensure you have an appropriate file appender in the log4j configuration. That said, if you have a job which runs for multiple days, you are far, far better off using yarn-cluster mode, so that the driver is safely located on the cluster rather than relying on a single node with a yarn-client hooked to it.
12-30-2016
11:33 AM
Could you please confirm your Ansible version? This is likely a version conflict with Ansible. We recommend 2.0.0.2.
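A quick way to check is to compare the first line of `ansible --version` against the recommended release. The sketch below is only illustrative: the function name is hypothetical, and it assumes the output starts with something like "ansible 2.0.0.2" (or "ansible [core X.Y.Z]" on much newer releases).

```python
import re

RECOMMENDED = (2, 0, 0, 2)  # the version recommended in this post


def parse_ansible_version(version_output):
    """Extract the version tuple from the output of `ansible --version`."""
    match = re.search(r"ansible\s+(?:\[core\s+)?(\d+(?:\.\d+)*)", version_output)
    if match is None:
        raise ValueError("could not find a version number")
    return tuple(int(part) for part in match.group(1).split("."))


# Sample output text; on a real host you would capture it with
# subprocess.run(["ansible", "--version"], capture_output=True, text=True).
sample = "ansible 2.2.1.0\n  config file = /etc/ansible/ansible.cfg\n"
found = parse_ansible_version(sample)
if found != RECOMMENDED:
    print("Ansible version differs from the recommended 2.0.0.2")
```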
12-30-2016
11:32 AM
Metron uses a number of Hadoop ecosystem components, and so tends to require separate master nodes for these for performance. The separation can also be used for resilience, though this diagram does not show full master HA. To expand the abbreviations:
NN = Name Node (the Hadoop HDFS name node, which stores file system metadata)
SN = Secondary Name Node (not very well named, but provides compaction and optimisation services for the NN)
RM = Resource Manager (the container coordinator which manages YARN resources and allocates them to running jobs)
ZS = ZooKeeper Server (ZooKeeper is used extensively in Metron for storage and coordination of configuration. It is also used for similar purposes by many other Hadoop components)
DN = Data Node (an HDFS Data Node, responsible for storing the actual blocks in HDFS)
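If you are annotating your own copy of the diagram, the abbreviations above can be kept in a small lookup. This is just a sketch: the function name and the example label are made up for illustration and do not describe the layout in the diagram.

```python
# Abbreviations as expanded in this post.
ROLES = {
    "NN": "Name Node",
    "SN": "Secondary Name Node",
    "RM": "Resource Manager",
    "ZS": "ZooKeeper Server",
    "DN": "Data Node",
}


def expand_roles(label):
    """Expand a comma-separated label such as 'NN, ZS' into full role names.

    Unknown abbreviations are passed through unchanged.
    """
    parts = [part.strip() for part in label.split(",")]
    return [ROLES.get(part, part) for part in parts if part]


# Hypothetical node label, not taken from the diagram.
print(expand_roles("NN, ZS"))
```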