Posts: 1973
Kudos Received: 1225
Solutions: 124
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2458 | 04-03-2024 06:39 AM |
| | 3806 | 01-12-2024 08:19 AM |
| | 2053 | 12-07-2023 01:49 PM |
| | 3037 | 08-02-2023 07:30 AM |
| | 4156 | 03-29-2023 01:22 PM |
10-12-2016
03:33 PM
3 Kudos
Often lines of business, individual users, or shared teams will use online Google Sheets to share spreadsheet and tabular data among teams or with outside vendors. It's quick and easy to add sheets and store your data in Google Drive as spreadsheets. Often you will want to consolidate, federate, analyze, enrich, and use this data for reporting and dashboards throughout your organization. An easy way to do that is to read in the data using Google's Sheets API, a standard SSL HTTP REST API that returns clean JSON data. I created a simple Google Sheet to test ingesting a Google Sheet with HDF. You will need to enable the Google Sheets API in the Google APIs Console, and you must be logged into Google with a Google account (use the one where you created your spreadsheets).
Google Documentation
Google provides a few quickstarts that you can use to ingest this data: https://developers.google.com/sheets/quickstart/js or https://developers.google.com/sheets/quickstart/python. I chose to ingest this data the easiest way, with a simple REST call from NiFi.
Testing Your Queries in Google's API Explorer
To test your queries and get your exact URL, go to Google's API Explorer: https://developers.google.com/apis-explorer/#p/sheets/v4/
GET https://sheets.googleapis.com/v4/spreadsheets/1sbMyDocID?includeGridData=true&key=MYKEYISFROMGOOGLE
where 1sb... is the document ID that comes from the name you see in your Google Sheet page, like so: https://docs.google.com/spreadsheets/d/1UMyDocumentId/edit#g.
Calling the API From HDF 2.0
The one thing you will need is to set up a StandardSSLContextService to read in HTTPS data. You will need to grab the cacerts truststore file from the JRE that NiFi is using to run. By default the truststore password is changeit; you really should change it. Once you have an SSL configuration set up, you can use a GetHTTP processor, adding the Google Sheets API URL that includes the sheet ID. I also set the User Agent, the Accept Content-Type, and Follow Redirects = true. Now that we have SSL enabled, we can make our call to Google (a command-line version of the same call is sketched at the end of this article). The flow itself is pretty simple.
Now that I have ingested the Google Sheet, I can store it as JSON in my data lake. You could process this in HDF many ways, including taking out fields, enriching with other data sources, converting to Avro or ORC, or storing it in a Hive table, Phoenix, or HBase. You have now ingested Google Sheet data; determining what you want to do with it and parsing out the JSON is a fun exercise. You can use an EvaluateJsonPath processor in Apache NiFi to pull out the fields you want: inside that processor you add a property and then a JsonPath value, like so: $.entities.media[0].media_url.
HDF 2.0 Diagram Overview
Reference:
https://community.hortonworks.com/articles/59349/hdf-20-flow-for-ingesting-real-time-tweets-from-st.html
http://jsonpath.com/
https://blogs.apache.org/nifi/entry/indexing_tweets_with_nifi_and
https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.EvaluateJsonPath/
https://community.hortonworks.com/questions/21011/how-i-extract-attribute-from-json-file-using-nifi.html
https://jsonpath.curiousconcept.com/
https://developers.google.com/sheets/guides/authorizing
https://codelabs.developers.google.com/codelabs/sheets-api/#0
https://developers.google.com/sheets/samples/
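To try the same request outside of NiFi, you can make it from the command line. This is a minimal sketch; YOUR_SHEET_ID and YOUR_API_KEY are placeholders for your own document ID and API key:
# Same GET that the GetHTTP processor issues; returns the sheet as JSON
curl "https://sheets.googleapis.com/v4/spreadsheets/YOUR_SHEET_ID?includeGridData=true&key=YOUR_API_KEY"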
10-07-2016
03:39 PM
3 Kudos
With NiFi 1.0.0 I ingested a lot of image data from drones, mostly to get metadata like geolocation. I also ingested a resized version of each image in case I wanted to use it, and I found a use for it: serving the images on a simple HTML page with Spring. So I wrote a quick Java program to pull out the fields I had stored in Phoenix (from the metadata) and display the image. I could have streamed it out of HDFS using the HDFS libraries to read the file and then stream it myself.
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// out is the page StringBuilder and connection is a Phoenix JDBC connection (setup omitted)
String sql = "select datekey, fileName, gPSAltitude, gPSLatitude, gPSLongitude, orientation, geolat, geolong, inception from dronedata1 order by datekey asc";
out.append(STATIC_HEADER);
PreparedStatement ps = connection.prepareStatement(sql);
ResultSet res = ps.executeQuery();
while (res.next()) {
    try {
        // Embed each image via WebHDFS and list its date and coordinates beside it
        out.append("<br><br>\n<table width=100%><tr><td valign=top><img src=\"");
        out.append("http://tspanndev10.field.hortonworks.com:50070/webhdfs/v1/drone/")
           .append(res.getString("fileName")).append("?op=OPEN\"></td>");
        out.append("<td valign=top>Date: ").append(res.getString("datekey"));
        out.append("\n<br>Lat: ").append(res.getString("geolat"));
        out.append("\n<br>Long: ").append(res.getString("geolong"));
        out.append("\n<br><br>\n</td></tr></table>\n");
    } catch (Exception e) {
        e.printStackTrace();
    }
}
It was a lot easier to use the built-in WebHDFS REST API to display an image: wrapping the WebHDFS call to the image file in an HTML IMG SRC tag loads our image. http://node1:50070/webhdfs/v1/drone/Bebop2_20160920083655-0400.jpg?op=OPEN It's pretty simple, and you can use this with a MEAN application, Python Flask, or your non-JVM front end of choice. And now you have a solid distributed host for your images. I recommend this only for internal sites and public images; exposing this data publicly on the cloud is dangerous!
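Before wiring the URL into a page, you can sanity-check it from the command line. A minimal sketch using the example URL above; the -L flag matters because a WebHDFS OPEN call redirects to a datanode:
# Follow the WebHDFS redirect and save the image locally
curl -L -o test.jpg "http://node1:50070/webhdfs/v1/drone/Bebop2_20160920083655-0400.jpg?op=OPEN"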
10-05-2016
01:39 PM
5 Kudos
1. Acquire an EDI File (GetFile, GetFTP, GetHTTP, GetSFTP, Fetch...)
2. Install the open source nifi-edireader bundle on NiFi 1.0.0 (a consolidated command sketch follows these sub-steps)
Download https://github.com/BerryWorksSoftware/edireader
Maven install the BerryWorks EDIReader
Download https://github.com/mrcsparker/nifi-edireader-bundle
Maven package nifi-edireader (requires Maven 3.3 or newer; you may have to download and install it separately, since the standard Linux package is older)
Copy nifi-edireader-nar/target/nifi-edireader-nar-0.0.1.nar to your NiFi lib directory
Restart NiFi Service
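Putting step 2 together, the build and install look roughly like this. A sketch under assumptions: /opt/nifi is a placeholder for your NiFi install path, and the NAR name comes from the sub-steps above:
# Build the EDIReader parser and the NiFi bundle that wraps it
git clone https://github.com/BerryWorksSoftware/edireader
(cd edireader && mvn clean install)
git clone https://github.com/mrcsparker/nifi-edireader-bundle
(cd nifi-edireader-bundle && mvn package)
# Drop the NAR into NiFi's lib directory, then restart NiFi
cp nifi-edireader-bundle/nifi-edireader-nar/target/nifi-edireader-nar-0.0.1.nar /opt/nifi/lib/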
3. Add EdiXML Processor and connect from EDI File input
4. Add extra processing, conversion or routing (TransformXML with XSLT or EvaluateXPATH) to convert to JSON
5. Land to HDFS (PutHDFS)
6. Use the web form linked below to generate a test EDI file:
ISA*00* *00* *ZZ*SENDER ID *ZZ*RECEIVER ID *010101*0101*U*00401*000000001*0*T*!
GS*IN*SENDER ID*APP RECEIVER*01010101*01010101*1*X*004010
ST*810*0001
BIG*20021208*00001**A999
N1*ST*Timothy Spann*9*122334455
N3*115 xxx ave
N4*xxxtown*nj*08520
N1*BT*Hortonworks*9*122334455
N3*5470 GREAT AMERICA PARKWAY
N4*santa clara*CA*95054
ITD*01*3*2**30**30*****60
FOB*PP
IT1**1*EA*200**UA*EAN
PID*F****Lamp
IT1**4*EA*50**UA*EAN
PID*F****Chair
TDS*2000
CAD*****Routing
ISS*30*CA
CTT*50
SE*19*0001
GE*1*1
IEA*1*000000001
7. Converted to XML
<?xml version="1.0" encoding="UTF-8"?>
<ediroot>
<interchange Standard="ANSI X.12"
AuthorizationQual="00"
Authorization=" "
SecurityQual="00"
Security=" "
Date="010101"
Time="0101"
StandardsId="U"
Version="00401"
Control="000000001"
AckRequest="0"
TestIndicator="T">
<sender>
<address Id="SENDER ID " Qual="ZZ"/>
</sender>
<receiver>
<address Id="RECEIVER ID " Qual="ZZ"/>
</receiver>
<group GroupType="IN"
ApplSender="SENDER ID"
ApplReceiver="APP RECEIVER"
Date="01010101"
Time="01010101"
Control="1"
StandardCode="X"
StandardVersion="004010">
<transaction DocType="810" Name="Invoice" Control="0001">
<segment Id="BIG">
<element Id="BIG01">20021208</element>
<element Id="BIG02">00001</element>
<element Id="BIG04">A999</element>
</segment>
<loop Id="N1">
<segment Id="N1">
<element Id="N101">ST</element>
<element Id="N102">Timothy Spann</element>
<element Id="N103">9</element>
<element Id="N104">122334455</element>
</segment>
<segment Id="N3">
<element Id="N301">115 xxx ave</element>
</segment>
<segment Id="N4">
<element Id="N401">xxxstown</element>
<element Id="N402">nj</element>
<element Id="N403">08520</element>
</segment>
</loop>
<loop Id="N1">
<segment Id="N1">
<element Id="N101">BT</element>
<element Id="N102">Hortonworks</element>
<element Id="N103">9</element>
<element Id="N104">122334455</element>
</segment>
<segment Id="N3">
<element Id="N301">5470 GREAT AMERICA PARKWAY</element>
</segment>
<segment Id="N4">
<element Id="N401">santa clara</element>
<element Id="N402">CA</element>
<element Id="N403">95054</element>
</segment>
</loop>
<segment Id="ITD">
<element Id="ITD01">01</element>
<element Id="ITD02">3</element>
<element Id="ITD03">2</element>
<element Id="ITD05">30</element>
<element Id="ITD07">30</element>
<element Id="ITD12">60</element>
</segment>
<segment Id="FOB">
<element Id="FOB01">PP</element>
</segment>
<loop Id="IT1">
<segment Id="IT1">
<element Id="IT102">1</element>
<element Id="IT103">EA</element>
<element Id="IT104">200</element>
<element Id="IT106">UA</element>
<element Id="IT107">EAN</element>
</segment>
<loop Id="PID">
<segment Id="PID">
<element Id="PID01">F</element>
<element Id="PID05">Lamp</element>
</segment>
</loop>
</loop>
<loop Id="IT1">
<segment Id="IT1">
<element Id="IT102">4</element>
<element Id="IT103">EA</element>
<element Id="IT104">50</element>
<element Id="IT106">UA</element>
<element Id="IT107">EAN</element>
</segment>
<loop Id="PID">
<segment Id="PID">
<element Id="PID01">F</element>
<element Id="PID05">Chair</element>
</segment>
</loop>
</loop>
<segment Id="TDS">
<element Id="TDS01">2000</element>
</segment>
<segment Id="CAD">
<element Id="CAD05">Routing</element>
</segment>
<loop Id="ISS">
<segment Id="ISS">
<element Id="ISS01">30</element>
<element Id="ISS02">CA</element>
</segment>
</loop>
<segment Id="CTT">
<element Id="CTT01">50</element>
</segment>
</transaction>
</group>
</interchange>
</ediroot>
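If you just want to spot-check fields in the generated XML before building out the full flow (step 4), an XPath query from the command line works. A minimal sketch, assuming the XML above is saved as invoice.xml and xmllint is installed:
# Pull the invoice date (BIG01) out of the EDIReader XML; prints 20021208
xmllint --xpath 'string(//segment[@Id="BIG"]/element[@Id="BIG01"])' invoice.xml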
Resources
https://github.com/mrcsparker/nifi-edireader-bundle
https://github.com/BerryWorksSoftware/edireader
https://en.wikipedia.org/wiki/Electronic_data_interchange
https://en.wikipedia.org/wiki/EDIFACT
https://en.wikipedia.org/wiki/FORTRAS
http://databene.org/edifatto.html
https://sourceforge.net/projects/edifatto/
https://secure.edidev.net/edidev-ca/samples/vbNetGen/WebFrmNetGen.aspx (Generate example EDI)
10-01-2016
11:13 PM
2 Kudos
I ran the same flow myself and examined the Avro file in HDFS using the Avro CLI. Even though I didn't specify Snappy compression, it was there in the file:
java -jar avro-tools-1.8.0.jar getmeta 23568764174290.avro
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
avro.codec	snappy
avro.schema	{"type":"record","name":"people","doc":"Schema generated by Kite","fields":[{"name":"id","type":"long","doc":"Type inferred from '2'"},{"name":"first_name","type":"string","doc":"Type inferred from 'Gregory'"},{"name":"last_name","type":"string","doc":"Type inferred from 'Vasquez'"},{"name":"email","type":"string","doc":"Type inferred from 'gvasquez1@pcworld.com'"},{"name":"gender","type":"string","doc":"Type inferred from 'Male'"},{"name":"ip_address","type":"string","doc":"Type inferred from '32.8.254.252'"},{"name":"company_name","type":"string","doc":"Type inferred from 'Janyx'"},{"name":"domain_name","type":"string","doc":"Type inferred from 'free.fr'"},{"name":"file_name","type":"string","doc":"Type inferred from 'NonMauris.xls'"},{"name":"mac_address","type":"string","doc":"Type inferred from '03-FB-66-0F-20-A3'"},{"name":"user_agent","type":"string","doc":"Type inferred from '\"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7;'"},{"name":"lat","type":"string","doc":"Type inferred from ' like Gecko) Version/5.0.4 Safari/533.20.27\"'"},{"name":"long","type":"double","doc":"Type inferred from '26.98829'"}]}
Snappy is hard-coded in NiFi: https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-kite-bundle/nifi-kite-processors/src/main/java/org/apache/nifi/processors/kite/ConvertCSVToAvro.java
The processor always adds Snappy compression to every Avro file, with no option to turn it off:
writer.setCodec(CodecFactory.snappyCodec());
Make sure you have a schema set in the Record Schema property:
Record Schema: ${inferred.avro.schema}
If you can make everything strings and convert to other types later, you will be happier. A couple of commands for inspecting the output follow the references.
References:
https://www.linkedin.com/pulse/converting-csv-avro-apache-nifi-jeremy-dyer
https://community.hortonworks.com/questions/44063/nifi-avro-to-csv-or-json-to-csvnifi-convert-avro-t.html
https://community.hortonworks.com/articles/28341/converting-csv-to-avro-with-apache-nifi.html
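The same avro-tools jar can also dump the schema and the records themselves. A minimal sketch using the file name from the run above:
# Print just the schema, then the first few records as JSON
java -jar avro-tools-1.8.0.jar getschema 23568764174290.avro
java -jar avro-tools-1.8.0.jar tojson 23568764174290.avro | head
avro-tools also ships a recodec command if you ever need to rewrite a file with a different codec; check its usage output for the exact arguments.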
09-30-2016
10:40 PM
Here's the simple Zeppelin notebook file: twitter-from-strata-hadoop-processing.txt
Rename it to .json. For security, the site doesn't allow uploading or downloading .js or .json files, hence the .txt extension.
10-11-2016
08:41 PM
TensorFlow 0.11 is out:
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.11.0rc0-cp27-none-linux_x86_64.whl
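For completeness, the step that normally follows that export in the TensorFlow install instructions of that era is a pip upgrade. A sketch, assuming a Linux box with Python 2.7 to match the wheel above:
# Install/upgrade TensorFlow from the wheel URL exported above
sudo pip install --upgrade $TF_BINARY_URL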
10-02-2018
05:42 PM
Not a Kerberized cluster. Maybe this will help: https://stackoverflow.com/questions/40595332/how-to-connect-to-a-kerberos-secured-apache-phoenix-data-source-with-wildfly
09-14-2016
02:59 AM
3 Kudos
Running Spark Jobs Through Apache Beam on an HDP 2.5 YARN Cluster
Using the Spark Runner with Apache Beam
Apache Beam is still in the Apache Incubator and not yet supported on HDP 2.5 or other platforms.
sudo yum -y install git
wget http://www.gtlib.gatech.edu/pub/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
After you download Maven, move it to /opt/demo/maven or put it in your path. The Maven download mirror will change, so grab a fresh URL from http://maven.apache.org/. Installing Maven with yum will give you an older, unsupported version that may interfere with something else, so I recommend getting a new Maven just for this build. Make sure you have Java 7 or greater, which you should already have on an HDP machine; I recommend Java 8 on your new HDP 2.5 nodes if possible.
cd /opt/demo/
git clone https://github.com/apache/incubator-beam
cd incubator-beam
/opt/demo/maven/bin/mvn clean install -DskipTests
If you want to run this on Spark 2.0 rather than Spark 1.6.2, see here for changing the environment: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/spark-choose-version.html
For HDP 2.5, these are the parameters:
spark-submit --class org.apache.beam.runners.spark.examples.WordCount --master yarn-client target/beam-runners-spark-0.3.0-incubating-SNAPSHOT-spark-app.jar --inputFile=kinglear.txt --output=out --runner=SparkRunner --sparkMaster=yarn-client
Note, I had to change the parameters to get this to work in my environment. You may also need to run /opt/demo/maven/bin/mvn package from the /opt/demo/incubator-beam/runners/spark directory. This runs a Java 7 example from the built-in examples: https://github.com/apache/incubator-beam/tree/master/examples/java
These are the results of running our small Spark job:
16/09/14 02:35:08 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 34.0 KB, free 518.7 KB)
16/09/14 02:35:08 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 172.26.195.58:39575 (size: 34.0 KB, free: 511.1 MB)
16/09/14 02:35:08 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1008
16/09/14 02:35:08 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 1 (MapPartitionsRDD[14] at mapToPair at TransformTranslator.java:568)
16/09/14 02:35:08 INFO YarnScheduler: Adding task set 1.0 with 2 tasks
16/09/14 02:35:08 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, tspanndev13.field.hortonworks.com, partition 0,NODE_LOCAL, 1994 bytes)
16/09/14 02:35:08 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, tspanndev13.field.hortonworks.com, partition 1,NODE_LOCAL, 1994 bytes)
16/09/14 02:35:08 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on tspanndev13.field.hortonworks.com:36438 (size: 34.0 KB, free: 511.1 MB)
16/09/14 02:35:08 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on tspanndev13.field.hortonworks.com:36301 (size: 34.0 KB, free: 511.1 MB)
16/09/14 02:35:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to tspanndev13.field.hortonworks.com:52646
16/09/14 02:35:08 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 177 bytes
16/09/14 02:35:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to tspanndev13.field.hortonworks.com:52640
16/09/14 02:35:09 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 681 ms on tspanndev13.field.hortonworks.com (1/2)
16/09/14 02:35:09 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 1112 ms on tspanndev13.field.hortonworks.com (2/2)
16/09/14 02:35:09 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/09/14 02:35:09 INFO DAGScheduler: ResultStage 1 (saveAsNewAPIHadoopFile at TransformTranslator.java:745) finished in 1.113 s
16/09/14 02:35:09 INFO DAGScheduler: Job 0 finished: saveAsNewAPIHadoopFile at TransformTranslator.java:745, took 5.422285 s
16/09/14 02:35:09 INFO SparkRunner: Pipeline execution complete.
16/09/14 02:35:09 INFO SparkContext: Invoking stop() from shutdown hook
[root@tspanndev13 spark]# hdfs dfs -ls
Found 5 items
drwxr-xr-x - root hdfs 0 2016-09-14 02:35 .sparkStaging
-rw-r--r-- 3 root hdfs 0 2016-09-14 02:35 _SUCCESS
-rw-r--r-- 3 root hdfs 185965 2016-09-14 01:44 kinglear.txt
-rw-r--r-- 3 root hdfs 27304 2016-09-14 02:35 out-00000-of-00002
-rw-r--r-- 3 root hdfs 26515 2016-09-14 02:35 out-00001-of-00002
[root@tspanndev13 spark]# hdfs dfs -cat out-00000-of-00002
oaths: 1
bed: 7
hearted: 5
warranties: 1
Refund: 1
unnaturalness: 1
sea: 7
sham'd: 1
Only: 2
sleep: 8
sister: 29
Another: 2
carbuncle: 1
As you can see, it produced the expected two-part output file in HDFS with the word counts. Not much configuration is required to run your Apache Beam Java jobs on your HDP 2.5 YARN Spark cluster, so if you have a development cluster, this would be a great place to try it out; or try it on your own HDP 2.5 sandbox. A quick spot-check of the output follows the resource links.
Resources:
http://beam.incubator.apache.org/learn/programming-guide/
https://github.com/apache/incubator-beam/tree/master/runners/spark
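As that spot-check, you can grep both output shards for a word whose count you expect. A minimal sketch using the file names from the listing above; each output line has the form "word: count":
# sister appears in the sample output above with a count of 29
hdfs dfs -cat out-00000-of-00002 out-00001-of-00002 | grep -w sister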
09-15-2016
02:15 AM
Flow File: sensor.xml
05-01-2017
04:26 PM
Thanks a lot for this article. What are you using to run TF on Spark in this configuration?