06-15-2018
04:11 PM
Adding Parquet Output

Reference: https://cwiki.apache.org/confluence/display/Hive/Parquet

CREATE EXTERNAL TABLE gluon2_parquet (top1pct STRING, top2pct STRING, top3pct STRING,
  top4pct STRING, top5pct STRING, top1 STRING, top2 STRING, top3 STRING, top4 STRING,
  top5 STRING, imgname STRING, host STRING, `end` STRING, te STRING, battery INT,
  systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS PARQUET
LOCATION '/gluon2par';

SELECT * FROM gluon2_parquet;

Add the PutParquet processor.
06-15-2018
03:16 PM
3 Kudos
Ingesting Apache MXNet Gluon Deep Learning Results Via MQTT and Apache NiFi

Summary: Using a pre-trained model in Apache MXNet Gluon Python 3 code to classify a webcam image captured and processed with OpenCV. In our Python script, we save the image to disk and capture JSON metadata about the percentages, probabilities, and device information. This JSON data is then sent via MQTT to a broker, and Apache NiFi processes the JSON data.

Example Image

Source Code

Schema: https://github.com/tspannhw/OpenSourceComputerVision/blob/master/gluon2.avsc
Python Source: https://github.com/tspannhw/OpenSourceComputerVision/blob/master/nifigluon2.py
Shell Script: https://github.com/tspannhw/OpenSourceComputerVision/blob/master/rungluon2.sh

SQL Table DDL

CREATE EXTERNAL TABLE IF NOT EXISTS gluon2 (top1pct STRING, top2pct STRING, top3pct STRING,
  top4pct STRING, top5pct STRING, top1 STRING, top2 STRING, top3 STRING, top4 STRING,
  top5 STRING, imgname STRING, host STRING, `end` STRING, te STRING, battery INT,
  systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS ORC
LOCATION '/gluon2';

Technologies: Python 3, Apache MXNet, Gluon, MQTT, Apache NiFi, OpenCV.

Based on http://gluon-crash-course.mxnet.io/predict.html

Apache NiFi Overview

Steps:
ConsumeMQTT: Ingest MQTT data from the gluon2 topic sent from Python.
InferAvroSchema: Grab the schema once; then you can remove this processor.
RouteOnContent: Throw away errors.
MergeRecord: Convert many JSON records into one large Apache Avro file.
ConvertAvroToORC: Convert that Apache Avro file into an Apache ORC file.
PutHDFS: Store the Apache ORC file in HDFS. A side effect of the process is that it produces a SQL DDL to create a new table for this schema.

Table Example
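The Python side of the steps above can be sketched as follows: build a JSON record matching the gluon2.avsc field names and publish it over MQTT. This is a minimal sketch, not the article's nifigluon2.py; the broker host, port, and `publish` helper are assumptions for illustration.

```python
# Sketch of the MQTT publishing step. Field names follow the gluon2 schema
# listed above; the broker and topic values are illustrative assumptions.
import json
import time
import uuid

def build_record(predictions, imgname, host):
    """Build the JSON metadata record for one classified image.

    predictions: list of (label, probability-string) for the top classes.
    """
    record = {
        "imgname": imgname,
        "host": host,
        "systemtime": time.strftime("%m/%d/%Y %H:%M:%S"),
        "id": "{0}_{1}".format(time.strftime("%Y%m%d%H%M%S"), uuid.uuid4()),
    }
    for i, (label, pct) in enumerate(predictions, start=1):
        record["top%d" % i] = label        # e.g. top1, top2, ...
        record["top%dpct" % i] = str(pct)  # e.g. top1pct, top2pct, ...
    return record

def publish(record, broker="localhost", topic="gluon2"):
    # Requires the third-party paho-mqtt package; hypothetical broker/topic.
    import paho.mqtt.client as mqtt
    client = mqtt.Client()
    client.connect(broker, 1883)
    client.publish(topic, json.dumps(record))
    client.disconnect()

rec = build_record([("cat", "0.9"), ("dog", "0.05")], "photo1.jpg", "myhost")
```

NiFi's ConsumeMQTT then picks these records up from the same topic.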
06-14-2018
09:15 PM
We are analyzing this Unsplash picture: https://raw.githubusercontent.com/tspannhw/DWS-DeepLearning-CrashCourse/master/photo1.jpg
06-14-2018
06:45 PM
This is a continuation of my series on running TensorFlow and Apache MXNet applications in HDP, HDF, and on edge nodes.

https://community.hortonworks.com/articles/118132/minifi-capturing-converting-tensorflow-inception-t.html
https://dzone.com/articles/integrating-tensorflow-16-image-labelling-with-hdf
https://community.hortonworks.com/articles/80339/iot-capturing-photos-and-analyzing-the-image-with.html
https://community.hortonworks.com/articles/103863/using-an-asus-tinkerboard-with-tensorflow-and-pyth.html
https://community.hortonworks.com/articles/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt.html
06-14-2018
06:30 PM
3 Kudos
Executing TensorFlow Classifications from Apache NiFi Using Apache Spark 2.3 and Apache Livy

Technology: Apache Spark 2.3 + Apache Livy + Apache NiFi 1.5 + TensorFlow + Python

Python Code: https://github.com/tspannhw/DWS-DeepLearning-CrashCourse/blob/master/tensorflowsparknifi.py

TIP: In this version of Apache NiFi, you need to use double quotes (") instead of single quotes (') in your Python code.

Python Code for NiFi ExecuteSparkInteractive: see GitHub.

Simple Apache NiFi Flow To Execute TensorFlow Python Applications via Apache Livy

I am just using Apache Livy as the transport from Apache NiFi to Apache Spark. My Apache Spark 2.3 cluster is not doing any Spark-specific processing; PySpark is just running a vanilla TensorFlow Python application in this version. We could also call TensorFlow-on-Spark code this way. My goal was to run TensorFlow on my Spark cluster, triggered from Apache NiFi, and get back results.

Results Returned in Success From ExecuteSparkInteractive Call

{
  "text/plain" : "273\tracer, race car, racing car\t37.4601334333%\n\n274\tsports car, sport car\t25.3520905972%\n\n267\tcab, hack, taxi, taxicab\t11.1182622612%\n\n268\tconvertible\t9.85431224108%\n\n271\tminivan\t3.22951599956%"
}

Apache Livy UI Showing Results of Runs

This is the ExecuteSparkInteractive processor. We can put the code in the Code property or pass it in.

Let's Configure a PySpark Apache Livy Controller

LogSearch

There is a technical preview of LogSearch, which is great for finding issues in HDF or HDP components. This is easier than searching logs by hand, though I could easily write NiFi code to search logs as well.

References:
https://community.hortonworks.com/articles/177663/apache-livy-apache-nifi-apache-spark-executing-sca.html
https://community.hortonworks.com/articles/171787/hdf-31-executing-apache-spark-via-executesparkinte.html
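Under the covers, ExecuteSparkInteractive talks to Livy's statements REST endpoint, and the result comes back as the "text/plain" payload shown above. A small sketch of building that request body and parsing the returned top-5 classification text; the payload shape follows Livy's REST API, and the parsing matches the sample output:

```python
# Sketch: what the NiFi processor sends to Livy, and how the tab-separated
# TensorFlow top-5 output in "text/plain" can be parsed downstream.
import json

def livy_statement_payload(pyspark_code):
    # Request body for POST /sessions/{id}/statements on a Livy server.
    return {"code": pyspark_code}

def parse_classifications(text_plain):
    """Parse tab-separated lines of 'id<TAB>label<TAB>pct%' into tuples."""
    rows = []
    for line in text_plain.split("\n"):
        if not line.strip():
            continue  # skip the blank lines between records
        ident, label, pct = line.split("\t")
        rows.append((int(ident), label, float(pct.rstrip("%"))))
    return rows

# Parsing the first two records from the sample response above:
result = {"text/plain": "273\tracer, race car, racing car\t37.4601334333%\n\n"
                        "274\tsports car, sport car\t25.3520905972%"}
rows = parse_classifications(result["text/plain"])
```

In the flow itself NiFi handles the HTTP call; this just makes the request/response shapes concrete.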
06-11-2018
02:35 PM
1 Kudo
This is trivial in NiFi. https://blogs.apache.org/nifi/entry/integrating_apache_nifi_with_apache https://community.hortonworks.com/articles/122077/ingesting-csv-data-and-pushing-it-as-avro-to-kafka.html
... View more
05-26-2018
10:55 PM
3 Kudos
Integrating Keras (TensorFlow) YOLOv3 Into Apache NiFi Workflows

For this article I wanted to try the new YOLOv3 that's running in Keras. Out of the box with video streaming, pretty cool:

git clone https://github.com/qqwweee/keras-yolo3
wget https://pjreddie.com/media/files/yolov3.weights
python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5
python yolo.py

See: https://github.com/qqwweee/keras-yolo3

My article on the original Darknet YOLOv3 is here: https://community.hortonworks.com/articles/191259/integrating-darknet-yolov3-into-apache-nifi-workfl.html

Example JSON Output

{
"boxes" : "Found 8 boxes for img",
"class7" : "diningtable",
"score7" : "0.7484486",
"left7" : "636",
"right7" : "1096",
"top7" : "210",
"bottom7" : "693",
"class6" : "sofa",
"score6" : "0.31372178",
"left6" : "1114",
"right6" : "1276",
"top6" : "172",
"bottom6" : "381",
"class5" : "chair",
"score5" : "0.3455438",
"left5" : "990",
"right5" : "1048",
"top5" : "183",
"bottom5" : "246",
"class4" : "chair",
"score4" : "0.34554565",
"left4" : "858",
"right4" : "1000",
"top4" : "186",
"bottom4" : "244",
"class3" : "chair",
"score3" : "0.87056005",
"left3" : "1114",
"right3" : "1276",
"top3" : "172",
"bottom3" : "381",
"class2" : "chair",
"score2" : "0.9683409",
"left2" : "958",
"right2" : "1151",
"top2" : "203",
"bottom2" : "482",
"class1" : "cup",
"score1" : "0.49115792",
"left1" : "691",
"right1" : "770",
"top1" : "84",
"bottom1" : "229",
"class0" : "person",
"score0" : "0.9980049",
"left0" : "187",
"right0" : "709",
"top0" : "2",
"bottom0" : "720",
"host" : "HW13125.local.fios-router.home",
"end" : "1527443620.644303",
"te" : "4.380625247955322",
"battery" : 100,
"systemtime" : "05/27/2018 13:53:40",
"cpu" : 19.7,
"diskusage" : "140607.7 MB",
"memory" : 65.8,
"yoloid" : "20180527175341_284c6051-a45d-4757-977a-fe1dd76f295b"
}
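The JSON above flattens each detection into numbered classN/scoreN/leftN/rightN/topN/bottomN keys. A small sketch of regrouping those keys into per-box records downstream; the field names are taken from the sample output and nothing else is assumed:

```python
# Regroup the flattened classN/scoreN/... keys into a list of box records.
import json

def extract_boxes(record):
    boxes = []
    i = 0
    while "class%d" % i in record:
        boxes.append({
            "class": record["class%d" % i],
            "score": float(record["score%d" % i]),
            # (left, top, right, bottom) pixel coordinates as integers
            "box": tuple(int(record[k % i])
                         for k in ("left%d", "top%d", "right%d", "bottom%d")),
        })
        i += 1
    return boxes

# Two detections from the sample output above:
sample = json.loads(
    '{"class0": "person", "score0": "0.9980049", "left0": "187",'
    ' "top0": "2", "right0": "709", "bottom0": "720",'
    ' "class1": "cup", "score1": "0.49115792", "left1": "691",'
    ' "top1": "84", "right1": "770", "bottom1": "229"}')
boxes = extract_boxes(sample)
```

This shape also maps cleanly onto the yolo:classN columns stored in HBase later in the flow.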
I created a YOLO ID so that the JSON, the ID, and the image would all carry that ID, keeping them in sync as we send them across various distributed networks into a cluster for storage. I also include some other metadata I thought would be helpful, such as date/time, run time, battery available, disk usage, CPU, and memory. Those are quick and easy to grab with Python and could be useful.

This is running on a Mac laptop. This could be ported to the NVIDIA Jetson TX1. It may work on the RPi 3 with Movidius, but I think it may be a touch slow. Even on a Mac with no GPU and other things running, I am getting an image produced every 2-3 seconds. It's very choppy; I would like to try this on an Ubuntu workstation with a few high-end NVIDIA GPUs and TensorFlow compiled for GPU.

NiFi Flow for Ingest of JSON and Images

NiFi Server to process and store data in Hive and HBase

Read JSON from the Logs Directory Written by Python 3

Ingest Images from the Images Directory Written by YOLOv3

TensorFlow Analysis of Captured Image

You can watch the data arrive.

Writing data to HBase is easy (just create an HBase table with a column family).

Apache NiFi generates my Hive table. Once we have the generated table, it is populated with our data and we can query it in Apache Zeppelin (or any JDBC/ODBC tool). Using InferAvroSchema we had a schema created; we store it in Hortonworks Schema Registry for use.

Sample of Data Stored in HBase

20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:boxes, timestamp=1527456044351, value=Found 9 boxes for img
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class0, timestamp=1527456044351, value=person
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class1, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class2, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class3, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class4, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class5, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class6, timestamp=1527456044351, value=chair
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class7, timestamp=1527456044351, value=sofa
20180527175609_579164b3-031c-4993-bec1-50ff4cf58d1c column=yolo:class8, timestamp=1527456044351, value=diningtable

Flow (Local, Server) and Zeppelin Notebook:
yolo-keras-json-server-save.xml
keras-tensorflow-yolo-v3-osx.xml
yolo-copy.json

Forked Python Code For Saving JSON and Images: https://github.com/tspannhw/yolo3-keras-tensorflow

Coming Soon: a live recording of the YOLOv3 Keras/TensorFlow capture stream.