09-27-2016 01:30 AM
Windows event logs are stored and exported in the evtx format, which makes them difficult to work with programmatically, especially outside the Windows ecosystem. To facilitate the processing of Windows event data in Apache NiFi, a processor capable of parsing evtx files has been created. It uses a parser ported from Python into pure Java and modified to read the file one chunk (64 KB) at a time, allowing large files to be processed without pulling the entire input FlowFile into memory.

The processor expects incoming FlowFiles with evtx files as their contents. It only requires the user to specify a desired granularity of output, which determines how the evtx file is broken up into individual FlowFiles with the event records translated to XML.

ParseEvtx has 4 output relationships. The unmodified FlowFile goes to the original relationship, and any malformed chunks encountered are sent to the bad chunk relationship. Regardless of the configuration, FlowFiles on the output and failure relationships consist of a wrapping XML tag with individual events inside. How events are grouped within those two relationships depends on the user-configured granularity property:

- At File granularity, the entire evtx file is put into a single output FlowFile. This also means that if the file is corrupt or malformed in any way, the best-effort XML from parsing it is sent to the failure relationship.
- At Chunk granularity, each chunk (64 KB) of the evtx file gets its own FlowFile. A chunk that is detectably malformed via a bad checksum produces no XML failure output, but if a chunk isn't determined to be unparseable until the parsing attempt is underway, the XML up to the point where the error is discovered is transferred via the failure relationship.
- At Record granularity, every event generates an output FlowFile.
The failure relationship isn't really relevant at this granularity because everything from the chunk up until the error has already been put into other FlowFiles. To demonstrate this functionality, there is a specially crafted evtx file in the NiFi test resources designed to exercise the different possible paths. First, download the Apache NiFi archive. Then, download the ParseEvtxSample.xml template. Use the following commands to prepare a demo workspace:
mkdir parse-evtx-demo/
cd parse-evtx-demo/
unzip ~/Downloads/nifi-1.0.0-bin.zip
nifi-1.0.0/bin/nifi.sh start
wget -P sample-data https://github.com/apache/nifi/raw/rel/nifi-1.0.0/nifi-nar-bundles/nifi-evtx-bundle/nifi-evtx-processors/src/test/resources/application-logs.evtx

Open the NiFi UI, upload the ParseEvtxSample.xml template, and instantiate it on the canvas. Start the flow; every 10 seconds you should see output to each relationship. In the output folder in parse-evtx-demo, you can see the contents of each queue's FlowFile(s). With a normal, well-formed file, you should only see output to the success and original relationships; the other two exist to make it possible to handle malformed or incorrectly parsed files. The sample evtx file was doctored to exercise all of the processor's relationships. You can change the granularity on the ParseEvtx processor and see how that impacts the different cases.
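To get a feel for what Record granularity does, here is a small sketch that splits a wrapping-XML events file into one file per event, roughly the way the processor does. The sample.xml contents and the event-*.xml names are illustrative stand-ins, not actual ParseEvtx output or its exact schema:

```shell
#!/bin/sh
# Stand-in for ParseEvtx "success" output: a wrapping tag with one
# event per line (structure is illustrative, not the exact schema).
cat > sample.xml <<'EOF'
<Events>
<Event><EventID>4624</EventID></Event>
<Event><EventID>4634</EventID></Event>
</Events>
EOF

# Mimic Record granularity: write each <Event> element to its own file.
awk '/<Event>/{n++} /<Event>/,/<\/Event>/{print > ("event-" n ".xml")}' sample.xml

ls event-*.xml   # one file per event record
```

The wrapping `<Events>` line is skipped because the awk patterns require a literal `<Event>` tag, so only the individual records are written out.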
09-22-2016 01:48 PM
Hortonworks DataFlow 2.0 comes with the ability to configure TLS for Apache NiFi through Apache Ambari; this is implemented using the tls-toolkit in client/server mode. To demonstrate this functionality, let's set up a secured 3-node NiFi cluster through Ambari, locally in Docker containers. First, install Docker on your machine. Please note that the default docker-machine VM size is too small for this guide: you should have at least 8 GB of RAM, a few CPUs, and 100 GB of hard drive space allocated to the docker-machine you're using. Build the Ambari stack:
mkdir toolkit-demo-3/
cd toolkit-demo-3/
git clone https://github.com/brosander/dev-dockerfiles.git
dev-dockerfiles/ambari/server/centos6/buildStack.sh http://public-repo-1.hortonworks.com/ambari/centos6/2.x/updates/2.4.0.1/ambari.repo

Generate an SSH key for talking to the gateway and Ambari:
mkdir ambari-ssh-keys
ssh-keygen -t rsa -b 4096 -f ambari-ssh-keys/id_rsa

Run the Ambari stack and install the HDF mpack:
wget -P mpack http://public-repo-1.hortonworks.com/HDF/centos6/2.x/updates/2.0.0.0/tars/hdf_ambari_mp/hdf-ambari-mpack-2.0.0.0-579.tar.gz
dev-dockerfiles/ambari/server/centos6/runStack.sh -m "`pwd`/mpack/" -p "`pwd`/ambari-ssh-keys/id_rsa.pub" -n 3 -a -g

SSH into the gateway at port 2001 and forward local port 1025 for use as a SOCKS proxy (the IP below is where Docker exposes the port, which may vary based on your environment):
ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ambari-ssh-keys/id_rsa -p 2001 -D 1025 root@192.168.99.100

Configure your browser to use the SSH connection as a SOCKS proxy (for Firefox this is Settings -> Advanced -> Network -> Connection Settings). I prefer to use Chrome as my main browser, so I use Firefox as my "Docker browser". Then walk through the cluster install:

1. Visit http://ambari:8080 in the browser using the SOCKS proxy.
2. Log in to Ambari (default admin/admin).
3. Launch the Installation Wizard.
4. Name your cluster -> Next.
5. The default versions should be fine -> Next.
6. For target hosts, enter: centos6[1-3].ambari.
7. Select the manual registration radio button -> Next.
8. Verify the hosts are green -> Next.
9. Deselect Storm and Kafka for the purposes of this tutorial -> Next.
10. Click the + by NiFi twice to put it on all 3 hosts -> Next.
11. Put NiFi Certificate Authority on centos61.ambari (uncheck it wherever it currently is) -> Next.
12. On the Ambari Metrics tab, enter a Grafana password, then switch to the NiFi tab.
13. Expand "Advanced nifi-ambari-config" and enter a "Sensitive property values encryption password".
14. Expand "Advanced nifi-ambari-ssl-config" and enter "CN=admin, OU=NIFI" (without quotes) into "Initial Admin Identity".
15. Select the "Enable SSL?" and "Clients need to authenticate?" checkboxes.
16. Enter the NiFi CA Token value.
17. Enter the XML below into "Node Identities" -> Next.
18. Deploy.

Node Identities XML:
<property name="Node Identity 1">CN=centos61.ambari, OU=NIFI</property>
<property name="Node Identity 2">CN=centos62.ambari, OU=NIFI</property>
<property name="Node Identity 3">CN=centos63.ambari, OU=NIFI</property>

Wait for the install to finish; at this point you should have a running cluster. Generate your admin client certificate, substituting the NiFi CA Token you entered earlier for YOUR_CA_TOKEN (if you get permission errors during the docker run, consider running build.sh, passing in your uid and gid):
dev-dockerfiles/nifi-toolkit/ubuntu/build.sh
docker run -ti --net ambari -v "`pwd`:/opt/toolkit-output" --rm nifi-toolkit tls-toolkit.sh client -c centos61.ambari -D 'CN=admin, OU=NIFI' -p 10443 -T pkcs12 -t YOUR_CA_TOKEN

Import nifi-cert.pem into your browser as a trusted CA. Import the keystore.pkcs12 client certificate file into your browser as a client certificate, using the keyStorePassword in the generated config.json file. Now you should be able to use the NiFi web UI links in Ambari to access your NiFi instances. Congratulations, you've used the NiFi CA to secure a 3-node HDF cluster in Docker using Ambari!
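The keystore password can be pulled out of the generated config.json with a one-liner. The sample file below is a stand-in for the toolkit's real output (the keyStorePassword field name matches what the toolkit writes, but the value here is invented and real files contain more fields):

```shell
#!/bin/sh
# Stand-in for the config.json written by tls-toolkit.sh client
# (the password value is invented for illustration).
cat > config.json <<'EOF'
{
  "keyStore" : "keystore.pkcs12",
  "keyStorePassword" : "examplePassword123"
}
EOF

# Extract the password to use when importing keystore.pkcs12
# into the browser as a client certificate.
sed -n 's/.*"keyStorePassword" *: *"\([^"]*\)".*/\1/p' config.json
```

This avoids opening the JSON by hand each time you regenerate a client certificate.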