1973 Posts
1225 Kudos Received
124 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1843 | 04-03-2024 06:39 AM |
| | 2878 | 01-12-2024 08:19 AM |
| | 1585 | 12-07-2023 01:49 PM |
| | 2349 | 08-02-2023 07:30 AM |
| | 3241 | 03-29-2023 01:22 PM |
05-17-2018
01:19 PM
https://www.rosehosting.com/blog/how-to-install-python-3-6-4-on-centos-7/
12-07-2018
04:59 PM
Thanks for the update!
02-22-2018
03:05 PM
@balalaika OK, with no code changes, here is the Structured version. Structured Streaming went GA in Spark 2.2, which ships in HDP 2.6.4 and above. It works fine, just a little different from the old style. The old streaming API will probably remain until Spark 2.5 or maybe 3.0. Both styles are nice. Another option is to use Apache Beam or Streaming Analytics Manager. https://community.hortonworks.com/content/kbentry/174105/hdp-264-hdf-31-apache-spark-structured-streaming-i.html
02-10-2018
07:51 PM
2 Kudos
Apache Deep Learning 101 Series

This is for people preparing to attend my talk on Deep Learning at DataWorks Summit Berlin 2018 (https://dataworkssummit.com/berlin-2018/#agenda) on Thursday, April 19, 2018 at 11:50 AM Berlin time.

You can easily run Apache MXNet on an OSX machine or a Linux workstation using a Python script. I have forked the standard Apache MXNet Wine Detector Tutorial (http://mxnet.incubator.apache.org/tutorials/embedded/wine_detector.html) to read our local OSX webcam. You may need to change your OpenCV webcam port from 0 to 1 or 2, depending on how many webcams you have and which one you want to use. I am running this on an OSX laptop connected to a monitor that has a built-in webcam, so I use that one, which is 1. The webcam numbering starts at 0; if you only have one, use 0.

Let's get this installed!

git clone https://github.com/apache/incubator-mxnet.git

The installation instructions at Apache MXNet's website (http://mxnet.incubator.apache.org/install/index.html) are amazing. Pick your platform and your style. I am doing this the simplest way on a Mac, but you can use a virtual Python environment, which may be best for you.

git clone https://github.com/tspannhw/ApacheBigData101.git

You will want to copy my shell script osxlocalrun.sh, the Inception copy, and the analyze.py script to your machine. If you don't have a webcam, use the Centos version of the shell and Python scripts; that one works with a static image that you supply.

I am assuming you are running a recently updated Mac with 16 GB of RAM or more, with pip, Brew and Python 3 already installed. If not, do that first. If you have a pre-1.0 Apache MXNet, please upgrade. You will also need curl and tar installed, which they should be.

cd incubator-mxnet
mkdir images
curl --header 'Host: data.mxnet.io' --header 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Firefox/45.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-US,en;q=0.5' --header 'Referer: http://data.mxnet.io/models/imagenet/' --header 'Connection: keep-alive' 'http://data.mxnet.io/models/imagenet/inception-bn.tar.gz' -o 'inception-bn.tar.gz' -L
tar -xvzf inception-bn.tar.gz
cp Inception-BN-0126.params Inception-BN-0000.params
Then:
brew update
pip install --upgrade pip
pip install --upgrade setuptools
pip install mxnet==1.0.0
brew install graphviz
pip install graphviz

If you have two versions of Python on your machine, you may need to use pip3, and you may need to run via sudo. It depends on how your machine is set up and how locked down it is.

We are creating a directory called images that will fill with OpenCV capture images. You probably want to delete them or ingest them. It's very easy to ingest them with Apache NiFi or MiniFi, both of which run on OSX with ease. See: https://community.hortonworks.com/articles/107379/minifi-for-image-capture-and-ingestion-from-raspbe.html

We call a simple shell script (osxlocalrun.sh), which calls our custom Python 3 script. (You can easily convert it to Python 2 if you need to; in a future article I have this running on Python 2.7 on a Centos 7 HDP 2.6.4 cluster node.) I send warnings to /dev/null to get rid of them, since they relate to OSX configuration that you may or may not have and cannot easily change. Nothing to see here. You will probably need to chmod 755 your osxlocalrun.sh. If you are running on a Linux variant, follow the directions on the Apache MXNet site, or wait for my next article on installing and using Apache MXNet on Centos-based HDP 2.6.4 and HDF 3.1 clusters.

python3 -W ignore analyze.py 2>/dev/null

Apache NiFi Flow Templates

You can download my Apache NiFi flows from GitHub or this article.

Architecture

Local Apache NiFi 1.5 with NiFi Registry running with JDK 8 on OSX
Local Apache MXNet installation with Python 3
Remote HDF 3.1 cluster running on Centos 7 on OpenStack with Apache Ambari, Apache NiFi, NiFi Registry and Hortonworks Schema Registry
Remote HDP 2.6.4 cluster running on Centos 7 on OpenStack with Apache Hive and Apache Ambari

The flow is easy:

ExecuteProcess: execute the shell script
UpdateAttribute: add the schema name
InferAvroSchema: really only needed once if you don't want to hand-create your schema; push the results to an attribute
Remote Process Group: send via HTTP Site-to-Site to an HDF 3.1 cluster
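As mentioned above, the Python script reads the webcam through OpenCV, where the only real knob is the device index (0, 1 or 2). Here is a minimal, hedged sketch of that capture step — this is not the actual analyze.py; the helper names and the filename pattern are my own, modeled on the example output:

```python
import os
import time

def image_path(base_dir="images", prefix="mxnet_uuid_img"):
    """Build a timestamped capture path under images/.
    The exact naming pattern here is a guess, not the script's."""
    ts = time.strftime("%Y%m%d%H%M%S")
    return os.path.join(base_dir, "{0}_{1}.jpg".format(prefix, ts))

def capture_once(device_index=1):
    """Grab one frame from the chosen webcam and save it.
    Pass 0, 1 or 2 depending on which webcam you want."""
    import cv2  # pip install opencv-python
    cap = cv2.VideoCapture(device_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return None
    path = image_path()
    cv2.imwrite(path, frame)
    return path
```

From there, the saved image is what gets fed to the Inception model for classification.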
Local OSX Processing

Cluster-Based Record Processing

In the cloud we use ConvertRecord to convert the JSON generated by the Apache MXNet Python script into Avro. We merge a bunch of those together, then convert that larger Avro record to ORC. The ORC file is stored in HDFS. Apache NiFi will automatically generate Hive DDL that we can execute instantly via Apache NiFi or run manually; I do this manually in Apache Zeppelin. I could easily augment this data with weather, Twitter and other REST feeds; those have been covered in other articles I have written. I could also push the results to Kafka 1.0 for additional processing in Hortonworks Streaming Analytics Manager. I will do that at a future time.

Apache Hive SQL DDL

CREATE EXTERNAL TABLE IF NOT EXISTS inception3 (uuid STRING, top1pct STRING, top1 STRING, top2pct STRING, top2 STRING, top3pct STRING, top3 STRING, top4pct STRING, top4 STRING, top5pct STRING, top5 STRING, imagefilename STRING, runtime STRING) STORED AS ORC
LOCATION '/mxnet/local'

Example Output

{"uuid": "mxnet_uuid_img_20180208204131", "top1pct": "30.0999999046", "top1": "n02871525 bookshop, bookstore, bookstall", "top2pct": "23.7000003457", "top2": "n04200800 shoe shop, shoe-shop, shoe store", "top3pct": "4.80000004172", "top3": "n03141823 crutch", "top4pct": "2.89999991655", "top4": "n04370456 sweatshirt", "top5pct": "2.80000008643", "top5": "n02834397 bib", "imagefilename": "images/tx1_image_img_20180208204131.jpg", "runtime": "2"}

Query Results

Example OpenCV Captured Image

{"top1pct": "67.6", "top5": "n03485794 handkerchief, hankie, hanky, hankey", "top4": "n04590129 window shade", "top3": "n03938244 pillow", "top2": "n04589890 window screen", "top1": "n02883205 bow tie, bow-tie, bowtie", "top2pct": "11.5", "imagefilename": "nanotie7.png", "top3pct": "4.5", "uuid": "mxnet_uuid_img_20180211161220", "top4pct": "2.8", "top5pct": "2.8", "runtime": "3.0"}

My cat assists me in some Deep Learning work, so I use Apache NiFi to track him, to make sure he's working and hasn't taken his tie off during office hours. I run a strict office here in the Princeton lab.

Source Code

https://github.com/tspannhw/ApacheBigData101/tree/master

apache-mxnet-local.xml
apachemxnet-local-processing.xml

References:

https://community.hortonworks.com/articles/155435/using-the-new-mxnet-model-server.html
https://community.hortonworks.com/articles/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt.html
https://community.hortonworks.com/articles/146704/edge-analytics-with-nvidia-jetson-tx1-running-apac.html
http://mxnet.incubator.apache.org/

In the Series:

Interfacing with MXNet Model Server
Using Apache MXNet with HDF 3.1 Clusters
Using Apache MXNet with HDP 2.6.4 Clusters
Using Apache MXNet with Hadoop 3.0 YARN 3.0 HDP 3.0 Dockerized GPU Aware Clusters
04-03-2018
05:36 PM
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_command-line-installation/content/configure_livy.html
02-07-2018
04:15 PM
HDP 2.6.4 is not supported on Windows. Linux, especially Centos 7, is perfect. You could do some Hadoop experiments in a VM or in Docker on Windows (https://hortonworks.com/tutorial/sandbox-deployment-and-install-guide/). Check out Docker Hub:
https://hub.docker.com/r/hortonworks/ambari-server/
https://hub.docker.com/u/hortonworks/
01-29-2018
01:00 AM
2 Kudos
The Onion Omega 2+ is a small IoT device that runs a simple BusyBox Linux and can run MicroPython. This lets you run simple applications and interact with sensors and an OLED.
Onion Omega 2+ Stats
580 MHz CPU
128 MB memory
32 MB storage
added 32 GB USB storage
USB 2.0
Micro USB
802.11 b/g/n WiFi
15 GPIO
2 PWM
2 UART
1 I2C
1 SPI
1 I2S
Setting Up the Omega
opkg install python-pip
pip install --upgrade setuptools
pip install paho-mqtt
opkg install pyOledExp
Upgrading pyOledExp on root from 0.4-1 to 0.5-1...
Downloading http://repo.onion.io/omega2/packages/onion/pyOledExp_0.5-1_mipsel_24kc.ipk
Configuring pyOledExp.
mkdir /mnt/sda1
mount /dev/sda1 /mnt/sda1
./run.sh
> Initializing display
> Setting display to ON
> Enabling horizontal scrolling to the left
> Writing '[{"ipaddress": "192.168.1.176", "endtime": "2018-01-29 00:50:39", "end": "1517187039.44"}]' to display
0
crontab -e
crontab -l
*/1 * * * * /opt/demo/run.sh
1517187305: New connection from 192.168.1.176 on port 1883.
1517187305: New client connected from 192.168.1.176 as onion (c1, k60).
1517187305: Client onion disconnected.
BusyBox v1.26.2 () built-in shell (ash)
____ _ ____
/ __ \___ (_)__ ___ / __ \__ _ ___ ___ ____ _
/ /_/ / _ \/ / _ \/ _ \ / /_/ / ' \/ -_) _ `/ _ `/
\____/_//_/_/\___/_//_/ \____/_/_/_/\__/\_, /\_,_/
W H A T W I L L Y O U I N V E N T ? /___/
-----------------------------------------------------
Ω-ware: 0.1.10 b160
-----------------------------------------------------
poweroff
Attributes Related to MQTT Message Sent
Example Flow File containing JSON
Apache NiFi Flow File to Process
Running MQTT on a Mac
/usr/local/Cellar/mosquitto/1.4.14_2/sbin/mosquitto -c /usr/local/etc/mosquitto/mosquitto.conf
1517180449: mosquitto version 1.4.14 (build date 2017-10-22 16:34:22+0100) starting
1517180449: Config loaded from /usr/local/etc/mosquitto/mosquitto.conf.
1517180449: Opening ipv6 listen socket on port 1883.
1517180449: Opening ipv4 listen socket on port 1883.
1517180698: New connection from 127.0.0.1 on port 1883.
1517180698: New client connected from 127.0.0.1 as nififorthemqttguy (c1, k60).
In our simple example we just read the time and IP address of the device, format them as JSON, and send them as MQTT messages to an MQTT broker read by Apache NiFi. This is a good framework to start with on tiny devices. With the Onion platform you can add GPS, sensors, USB devices, a USB webcam and other inputs. These can easily be added to the Python script and sent to Apache NiFi as JSON.
Source Code
https://github.com/tspannhw/onionomega-mqtt-micropython

Python Script

from OmegaExpansion import oledExp
import paho.mqtt.client as client
import time
import os
import datetime
import math
import random, string
import json
import sys
import socket
import json
from time import sleep
from string import Template
from time import gmtime, strftime
# Time
start = time.time()
currenttime= strftime("%Y-%m-%d %H:%M:%S",gmtime())
host = os.uname()[1]
external_IP_and_port = ('198.41.0.4', 53) # a.root-servers.net
socket_family = socket.AF_INET
def IP_address():
try:
s = socket.socket(socket_family, socket.SOCK_DGRAM)
s.connect(external_IP_and_port)
answer = s.getsockname()
s.close()
return answer[0] if answer else None
except socket.error:
return None
ipaddress = IP_address()
status = oledExp.driverInit()
status = oledExp.setDisplayPower(1)
status = oledExp.scroll(0, 0, 0, 8 - 1)
endtime= strftime("%Y-%m-%d %H:%M:%S",gmtime())
end = time.time()
row = [ { 'end': str(end), 'endtime': str(endtime), 'ipaddress': str(ipaddress) } ]
json_string = json.dumps(row)
broker="192.168.1.193"
port=1883
client1= client.Client("onion") #create client object
client1.connect(broker,port) #establish connection
ret= client1.publish("omega",json_string)
client1.disconnect()
status = oledExp.write(json_string)
print(status)
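On the receiving side — whether that is Apache NiFi's ConsumeMQTT or a quick test client — the payload is just a JSON array with one object. Here is a minimal paho-mqtt subscriber sketch; the broker address and topic are assumed to match the publisher script above, and decode_payload is my own helper name:

```python
import json

def decode_payload(payload):
    """Decode the JSON array published by the Omega script and
    return the single reading dict inside it."""
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8")
    return json.loads(payload)[0]

def watch(broker="192.168.1.193", topic="omega"):
    """Print readings as they arrive (broker and topic taken from the
    publisher script above)."""
    import paho.mqtt.client as mqtt  # pip install paho-mqtt

    def on_message(client, userdata, msg):
        reading = decode_payload(msg.payload)
        print(reading["ipaddress"], reading["endtime"])

    client = mqtt.Client("onion-reader")
    client.on_message = on_message
    client.connect(broker, 1883)
    client.subscribe(topic)
    client.loop_forever()
```

Running watch() alongside mosquitto is an easy way to confirm the Omega is publishing before wiring up NiFi.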
References

https://community.hortonworks.com/articles/89455/ingesting-gps-data-from-onion-omega2-devices-with.html
https://github.com/tspannhw/onionomega-mqtt-micropython
https://github.com/mccollam/omega
https://github.com/micropython/micropython-lib/tree/master/umqtt.simple
https://docs.onion.io/omega2-docs/using-oled-expansion.html#using-the-libraries-2
https://iotbytes.wordpress.com/paho-mqtt-with-python/
https://www.kickstarter.com/projects/onion/omega2-5-iot-computer-with-wi-fi-powered-by-linux
01-28-2018
06:31 PM
2 Kudos
The Matrix Creator is an interesting multi-sensor hat that fits on a Raspberry Pi 3. The first step is to connect it, which is a simple snap; no soldering required. The specs are pretty impressive:

Xilinx Spartan 6 XC6SLX4 FPGA
Atmel Cortex-M3 ATSAM3S2 microcontroller
8 MEMS MP34DB02 digital audio microphones
ST LSM9DS1 3D accelerometer, 3D gyroscope, 3D magnetometer IMU
ST HTS221 capacitive digital sensor for relative humidity and temperature
NXP MPL3115A2 precision pressure sensor with altimetry
Silicon Labs EM358X - 2.4 GHz IEEE 802.15.4 ZigBee
Sigma Designs ZM5202 - 868/908/921 MHz Z-Wave
Vishay TSOP573 - 38.0 kHz carrier IR receiver
Vishay VEML6070 UV light sensor
NXP PN512 NFC reader
Everloop 35 RGBW LEDs

It runs on Raspbian Lite and installs via:

curl https://matrix-io.github.io/matrix-documentation/install.sh | sh

Our Apache NiFi Flow For Processing the Three Types of Data

Our Versioned Apache NiFi and MiniFi Flows

We tail the three files produced by the three example Python sensor readers. Both our MiniFi and Apache NiFi flows are very simple and documented above: tail data from the files as Python writes them, then send it from MiniFi to Apache NiFi, which separates the data into different flows for future processing. We could create schemas, convert to JSON, merge the feeds as JSON, and store them in three or more different data stores, depending on what you want to do. This can be done on the edge or in Apache NiFi on a cluster. You could have MiniFi or NiFi trigger off specific values or ranges as the need arises. Or, like me, you can just store it all for later use in your endless HDFS data lake.

Using Three Existing Examples Getting Temperature, UV and IMU Values

python /home/pi/matrix-creator-malos/src/python_test/test_humidity.py
nohup ./humidity.sh &
fh = open("/opt/demo/logs/humidity.log", "a")
fh.writelines('{0}'.format(humidity_info))
fh.close()
python /home/pi/matrix-creator-malos/src/python_test/test_uv.py
/opt/demo/logs/uv.log
python /home/pi/matrix-creator-malos/src/python_test/test_imu.py
/opt/demo/logs/imu.log
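Each reader appends simple `key: value` lines to its log file, which is exactly what MiniFi tails. Downstream, you could turn one of those blocks into a dict ready for json.dumps with a few lines of Python — a hedged sketch, with parse_reading as my own function name:

```python
def parse_reading(block):
    """Parse a block of 'key: value' sensor lines (as written to
    humidity.log / uv.log / imu.log) into a dict."""
    reading = {}
    for line in block.strip().splitlines():
        key, _, value = line.partition(":")
        value = value.strip().strip('"')
        try:
            # numeric fields like temperature or yaw become floats
            reading[key.strip()] = float(value)
        except ValueError:
            # non-numeric values like 'true' or 'Low' stay as strings
            reading[key.strip()] = value
    return reading

sample = "temperature: 21.9526348114\ntemperature_is_calibrated: true"
parse_reading(sample)
# {'temperature': 21.9526348114, 'temperature_is_calibrated': 'true'}
```

The same function handles all three feeds, since uv.log and imu.log use the same key/value layout.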
/Volumes/seagate/Apps/minifi-toolkit-0.3.0/bin/config.sh transform $1 config.yml
scp config.yml pi@192.168.1.197:/opt/demo/minifi-0.3.0/conf

Example Data

imu.2603753-2604002.log
yaw: 141.655654907
roll: 1.66126561165
accel_x: -0.0261840820312
accel_y: 0.0283813476562
accel_z: 0.978576660156
gyro_x: -0.0672912597656
gyro_y: 2.06359863281
gyro_z: 1.33087158203
mag_x: 0.23982000351
mag_y: 0.189700007439
mag_z: -0.480480015278
uv.172512-172528.log
oms_risk: "Low"
humidity.29015-29074.log
temperature: 21.9526348114
temperature_is_calibrated: true

References:
https://medium.com/kkbankol-events/raspberry-pi-15662c3ca881
https://creator.matrix.one/#!/examples
https://github.com/matrix-io/matrix-creator-malos/blob/master/docs/pressure.md
http://community.matrix.one/t/how-to-record-with-pyaudio/357
https://matrix-io.github.io/matrix-documentation/matrix-core/examples/pytests/
https://github.com/matrix-io/matrix-creator-alexa-voice-services
https://matrix-io.github.io/matrix-documentation/matrix-hal/getting-started/installation/
https://matrix-io.github.io/matrix-documentation/setup/
https://www.matrix.one/products/creator
01-28-2018
05:30 PM
2 Kudos
So I found another affordable low-end platform from China for running MiniFi goodness. For this 512 MB RAM machine, I decided to use MiniFi C++.
git clone https://github.com/apache/nifi-minifi-cpp.git
apt-get install cmake gcc g++ bison flex libcurl-dev librocksdb-dev librocksdb4.1 uuid-dev uuid libboost-all-dev libssl-dev libbz2-dev liblzma-dev doxygen -y
apt-get install -y libleveldb-dev
apt-get install -y libxml2
apt-get install libpython3-dev -y
apt-get install liblua5.1-0-dev -y
apt-get install libusb-1.0.0-0-dev libpng12-dev -y
apt-get install docker.io python-virtualenv -y
apt-get install libpython3-dev -y
apt-get install libgps-dev -y
apt-get install libpcap-dev -y
apt-get install cmake gcc g++ bison flex -y
./bootstrap.sh
This launches an interactive UI for selecting build options.
cd nifi-minifi-cpp-0.3.0-source/
mkdir build
cd build
cmake ..
make
make package
apt-get install libssl-dev
-- The following features have been enabled:
* EXPRESSION LANGUAGE EXTENSIONS , This enables NiFi expression language
* HTTP CURL , This enables RESTProtocol, InvokeHTTP, and the HTTPClient for Site to Site
* ROCKSDB REPOS , This Enables persistent provenance, flowfile, and content repositories using RocksDB
* ARCHIVE EXTENSIONS , This Enables libarchive functionality including MergeContent, CompressContent, (Un)FocusArchiveEntry and ManipulateArchive.
* SCRIPTING EXTENSIONS , This enables scripting
-- The following OPTIONAL packages have been found:
* LibRt
* Git
* BZip2
* LibLZMA
* EXPAT
* Boost
* Doxygen
-- The following REQUIRED packages have been found:
* BISON
* FLEX
* CURL
* Threads
* PythonLibs
* ZLIB
* UUID
* OpenSSL
-- The following features have been disabled:
* CIVETWEB , This enables ListenHTTP
-- The following OPTIONAL packages have not been found:
* WinSock
* RocksDB
* LibArchive
* Nettle
* LibXml2
This requires installing some development libraries for networking, security and devices.
From the interactive UI, I selected most of the goodies, except the USB camera since I don't have one on this tiny machine.
The C++ agent is different from (and much smaller than) the Java one. For a start, it has its own specific set of processors, some of which are pretty cool, like ones for USB camera image ingestion, TensorFlow processing and other device goodness. You can browse the list at the PROCESSORS link below.
Since I built mine from a git clone of master, I am running the 0.4.0 branch.
I could not install RocksDB on the OrangePi.
There's some cool stuff for reporting status.
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0# bin/minifi
minifi minificontroller minifi.sh
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0# bin/minifi.sh start
Starting MiNiFi with PID 15831 and pid file /opt/demo/nifi-minifi-cpp-0.4.0/bin/.minifi.pid
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0# [2018-01-26 16:24:17.591] [main] [info] Using MINIFI_HOME=/opt/demo/nifi-minifi-cpp-0.4.0 from environment.
[2018-01-26 16:24:17.592] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi-log.properties
[2018-01-26 16:24:17.893] [main] [info] Loading FlowController
[2018-01-26 16:24:17.893] [org::apache::nifi::minifi::FlowController] [info] Load Flow Controller from file /opt/demo/nifi-minifi-cpp-0.4.0/conf/config.yml
[2018-01-26 16:24:17.895] [org::apache::nifi::minifi::FlowController] [info] Loaded root processor Group
[2018-01-26 16:24:17.895] [org::apache::nifi::minifi::FlowController] [info] Initializing timers
[2018-01-26 16:24:17.896] [org::apache::nifi::minifi::FlowController] [info] Loaded controller service provider
[2018-01-26 16:24:17.896] [org::apache::nifi::minifi::FlowController] [info] Loaded flow repository
[2018-01-26 16:24:17.896] [org::apache::nifi::minifi::FlowController] [info] Starting Flow Controller
[2018-01-26 16:24:17.898] [org::apache::nifi::minifi::core::controller::StandardControllerServiceProvider] [info] Enabling % controller services
[2018-01-26 16:24:17.899] [org::apache::nifi::minifi::c2::C2Agent] [info] Class is RESTSender
[2018-01-26 16:24:17.902] [org::apache::nifi::minifi::io::Socket] [error] Could not bind to socket
[2018-01-26 16:24:17.903] [org::apache::nifi::minifi::FlowController] [info] Started Flow Controller
[2018-01-26 16:24:17.903] [main] [info] MiNiFi started
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0/bin# ./minificontroller --list components
[2018-01-26 16:25:16.461] [controller] [info] MINIFI_HOME is not set; determining based on environment.
[2018-01-26 16:25:16.462] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi.properties
[2018-01-26 16:25:16.463] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi-log.properties
Components:
FlowController
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0/bin# ./minificontroller --list connections
[2018-01-26 16:25:32.850] [controller] [info] MINIFI_HOME is not set; determining based on environment.
[2018-01-26 16:25:32.851] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi.properties
[2018-01-26 16:25:32.852] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi-log.properties
Connection Names:
./minificontroller --updateflow "config yml"
root@orangepizero:/opt/demo/nifi-minifi-cpp-0.4.0/bin# ./minificontroller --getfull
[2018-01-26 16:26:13.296] [controller] [info] MINIFI_HOME is not set; determining based on environment.
[2018-01-26 16:26:13.297] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi.properties
[2018-01-26 16:26:13.298] [org::apache::nifi::minifi::Properties] [info] Using configuration file located at /opt/demo/nifi-minifi-cpp-0.4.0/conf/minifi-log.properties
0 are full
References
https://github.com/apache/nifi-minifi-cpp
https://cwiki.apache.org/confluence/display/MINIFI/C2+Design+Proposal
https://github.com/apache/nifi-minifi-cpp/blob/master/examples/BidirectionalSiteToSite/README.md
https://nifi.apache.org/minifi/getting-started.html
https://github.com/apache/nifi-minifi-cpp/blob/master/PROCESSORS.md#
https://github.com/apache/nifi-minifi-cpp/blob/master/EXPRESSIONS.md
https://cwiki.apache.org/confluence/display/MINIFI/Release+Notes#ReleaseNotes-Versioncpp-0.3.0
https://github.com/apache/nifi-minifi-cpp/blob/master/Extensions.md
To Customize C++ Builds
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74685143
Build Extensions with MiniFi C++
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74685988
MiniFi C++ System properties
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=70256416
01-28-2018
03:57 PM
2 Kudos
Using SiteToSiteProvenanceReportingTask to Send Provenance to Apache NiFi for Processing.
Eating our own provenance food! It's almost comically easy to do this. You set up a reporting task on the server you are reporting on that sends the data to your receiver. On the other server, you build a simple flow to ingest and process it. I stored it in HBase as JSON, since HBase is a good place to put a lot of data fast.
Send The Data

You need to create a SiteToSiteProvenanceReportingTask in Controller Settings - Reporting Tasks. It's pretty simple: set the values above with your destination NiFi server and a port name that you have already created.
Receive the Data and Process
An Individual JSON Record
Split the JSON into Records
$.[*]
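The `$.[*]` JsonPath tells SplitJson to emit one flow file per element of the top-level array. In plain Python, the equivalent split is just a sketch like this (split_records is my own name, not a NiFi API):

```python
import json

def split_records(payload):
    """Mimic SplitJson with $.[*]: one JSON document per array element."""
    return [json.dumps(rec) for rec in json.loads(payload)]

batch = '[{"eventType": "ROUTE"}, {"eventType": "SEND"}]'
split_records(batch)
# ['{"eventType": "ROUTE"}', '{"eventType": "SEND"}']
```

Each resulting document then maps cleanly to one HBase row in the next step.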
Save to HBase (PutHBaseJSON)
First I have to create a table:

hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.1.2.2.6.2.0-205, r5210d2ed88d7e241646beab51e9ac147a973bdcc, Sat Aug 26 09:33:50 UTC 2017
hbase(main):001:0> create 'PROVENANCE', 'event'
0 row(s) in 2.9900 seconds
=> Hbase::Table - PROVENANCE
scan 'PROVENANCE'
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousAttributes, timestamp=1517159115042, value={"path":"./","filename":"humidity.583225-583284.log","s2s.address":"192.168.1.197:55032","s2s.host":"1
92.168.1.197","mime.type":"text/plain","uuid":"9006a1bb-d755-4272-b8d3-76e666c2a7c6","tailfile.original.path":"/opt/demo/logs/humidity.log"}
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousContentURI, timestamp=1517159115042, value=http://192.168.1.193:8080/nifi-api/provenance-events/61825/content/input
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:previousEntitySize, timestamp=1517159115042, value=59
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupId, timestamp=1517159115042, value=01611005-4e82-1491-ae5d-ca64f59491cb
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:processGroupName, timestamp=1517159115042, value=Process MiniFi Creator
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestamp, timestamp=1517159115042, value=2018-01-28T00:25:30.616Z
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:timestampMillis, timestamp=1517159115042, value=1517099130616
ff91e204-05b0-48aa-a666-7942e3f109ab column=event:updatedAttributes, timestamp=1517159115042, value={"RouteOnAttribute.Route":"humidity"}
ffde140c-3053-4b9d-89c6-14b68025384d column=event:actorHostname, timestamp=1517159114898, value=192.168.1.193
ffde140c-3053-4b9d-89c6-14b68025384d column=event:application, timestamp=1517159114898, value=NiFi Flow
ffde140c-3053-4b9d-89c6-14b68025384d column=event:childIds, timestamp=1517159114898, value=[]
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentId, timestamp=1517159114898, value=3a25cda9-0161-1000-813c-631724a10585
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentName, timestamp=1517159114898, value=RouteOnAttribute
ffde140c-3053-4b9d-89c6-14b68025384d column=event:componentType, timestamp=1517159114898, value=RouteOnAttribute
ffde140c-3053-4b9d-89c6-14b68025384d column=event:contentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/output
ffde140c-3053-4b9d-89c6-14b68025384d column=event:durationMillis, timestamp=1517159114898, value=-1
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityId, timestamp=1517159114898, value=9b017666-7ce9-45c5-9d0a-2f81e56d6fa8
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entitySize, timestamp=1517159114898, value=16
ffde140c-3053-4b9d-89c6-14b68025384d column=event:entityType, timestamp=1517159114898, value=org.apache.nifi.flowfile.FlowFile
ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventOrdinal, timestamp=1517159114898, value=61701
ffde140c-3053-4b9d-89c6-14b68025384d column=event:eventType, timestamp=1517159114898, value=ROUTE
ffde140c-3053-4b9d-89c6-14b68025384d column=event:lineageStart, timestamp=1517159114898, value=1517084974341
ffde140c-3053-4b9d-89c6-14b68025384d column=event:parentIds, timestamp=1517159114898, value=[]
ffde140c-3053-4b9d-89c6-14b68025384d column=event:platform, timestamp=1517159114898, value=nifi
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousAttributes, timestamp=1517159114898, value={"path":"./","filename":"uv.164064-164080.log","s2s.address":"192.168.1.197:55032","s2s.host":"192.168
.1.197","mime.type":"text/plain","uuid":"9b017666-7ce9-45c5-9d0a-2f81e56d6fa8","tailfile.original.path":"/opt/demo/logs/uv.log"}
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousContentURI, timestamp=1517159114898, value=http://192.168.1.193:8080/nifi-api/provenance-events/61701/content/input
ffde140c-3053-4b9d-89c6-14b68025384d column=event:previousEntitySize, timestamp=1517159114898, value=16
ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupId, timestamp=1517159114898, value=01611005-4e82-1491-ae5d-ca64f59491cb
ffde140c-3053-4b9d-89c6-14b68025384d column=event:processGroupName, timestamp=1517159114898, value=Process MiniFi Creator
ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestamp, timestamp=1517159114898, value=2018-01-28T00:25:30.607Z
ffde140c-3053-4b9d-89c6-14b68025384d column=event:timestampMillis, timestamp=1517159114898, value=1517099130607
ffde140c-3053-4b9d-89c6-14b68025384d column=event:updatedAttributes, timestamp=1517159114898, value={"RouteOnAttribute.Route":"uv"}
1830 row(s) in 11.7680 seconds
provenancereporting.xml

Learning to Use HBase

https://hortonworks.com/hadoop-tutorial/introduction-apache-hbase-concepts-apache-phoenix-new-backup-restore-utility-hbase/
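Outside NiFi, the same one-row-per-event layout could be written with the happybase Python client. This is a sketch under assumptions: event_to_columns is my own helper, the host name is a placeholder, and it requires the HBase Thrift server to be running:

```python
def event_to_columns(event, family="event"):
    """Flatten a provenance event dict into event:* column/value byte
    pairs, matching the layout in the scan output above."""
    return {("%s:%s" % (family, k)).encode("utf-8"): str(v).encode("utf-8")
            for k, v in event.items()}

def store_event(row_key, event, host="hbase-host"):
    """Write one provenance event to the PROVENANCE table.
    host is a placeholder for your HBase Thrift server."""
    import happybase  # pip install happybase
    conn = happybase.Connection(host)
    conn.table("PROVENANCE").put(row_key.encode("utf-8"),
                                 event_to_columns(event))
    conn.close()
```

PutHBaseJSON does this for you inside NiFi; the sketch just makes the row-key and column-family mapping explicit.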