Gluon and Apache MXNet have been great for deep learning, especially for newbies like me. It just got even better! They added a deep learning toolkit that is easy to use and comes with a number of great pre-trained models for common computer vision use cases. I took a simple, well-documented example and tweaked it to save the final image and send some JSON details via MQTT to Apache NiFi.
GluonCV makes this even easier! Let's check it out. Again, let's take a simple Python example, tweak it, run it via a shell script, and send the results over MQTT.
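As a rough idea of the MQTT side, here is a minimal sketch rather than the exact script. The field names mirror the columns of the Hive table further down. The topic name `gluoncv`, the broker on `localhost:1883`, the value formats, and the helper names `build_payload` and `send` are all assumptions for illustration. Publishing relies on the third-party paho-mqtt package.

```python
import json
import socket
import time
import uuid


def build_payload(imgname, shape, start, cpu=0.0, memory=0.0,
                  battery=0, diskusage="0 MB"):
    """Build the JSON message; keys mirror the Hive table columns."""
    end = time.time()
    row = {
        "imgname": imgname,
        "host": socket.gethostname(),
        "shape": str(shape),
        "end": str(end),
        "te": str(end - start),  # elapsed seconds since `start`
        "battery": int(battery),
        "systemtime": time.strftime("%m/%d/%Y %H:%M:%S"),  # assumed format
        "cpu": float(cpu),
        "diskusage": diskusage,
        "memory": float(memory),
        "id": str(uuid.uuid4()),
    }
    return json.dumps(row)


def send(payload, topic="gluoncv", hostname="localhost", port=1883):
    """Publish one message; needs paho-mqtt installed and a reachable broker."""
    # Imported lazily so build_payload works even without paho-mqtt.
    import paho.mqtt.publish as publish
    publish.single(topic, payload=payload, hostname=hostname, port=port)
```

With a broker running, calling `send(build_payload("out.jpg", (1, 3, 512, 683), start))` after the model has run would put one JSON record on the topic for NiFi to consume. The CPU, memory, battery, and disk values are plain parameters here; the real script would have to collect them itself.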
A simple Apache NiFi flow to ingest MQTT data from the GluonCV example Python script and store it to Hive, Parquet, and HBase.
A simple flow:
ConsumeMQTT
InferAvroSchema
RouteOnContent
MergeRecord (convert batches of JSON to a single Avro file)
ConvertAvroToORC
PutHDFS
PutParquet
PutHBaseRecord
Again, Apache NiFi generates a schema for us by examining the data. There's a really cool project coming out of New Jersey that does advanced schema generation by looking at tables; I'll report on that later. We take the generated schema, save it to the Schema Registry, and are ready to merge records. One thing you may want to do is make fields nullable by changing regular types from "type": "string" to "type": ["string","null"].
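If you would rather not edit every field by hand, the change to nullable types can be scripted. This is a small sketch, assuming the inferred schema is a flat Avro record; the helper name `make_nullable` and the three-field sample schema are made up for illustration.

```python
import json

# Avro primitive types that we wrap in a union with "null".
PRIMITIVES = {"string", "int", "long", "float", "double", "boolean", "bytes"}


def make_nullable(schema):
    """Return a copy of a flat Avro record schema with primitive fields made nullable."""
    out = dict(schema)
    out["fields"] = []
    for field in schema["fields"]:
        field = dict(field)  # copy so the input schema is left untouched
        if isinstance(field["type"], str) and field["type"] in PRIMITIVES:
            field["type"] = [field["type"], "null"]
        out["fields"].append(field)
    return out


# Hypothetical fragment of an inferred schema.
inferred = {
    "type": "record",
    "name": "gluoncv",
    "fields": [
        {"name": "imgname", "type": "string"},
        {"name": "battery", "type": "int"},
        {"name": "cpu", "type": "double"},
    ],
}

print(json.dumps(make_nullable(inferred), indent=2))
```

Fields that are already unions or nested types are left alone, so it should be safe to run this on a schema you have partly edited by hand.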
CREATE EXTERNAL TABLE IF NOT EXISTS gluoncv (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS ORC
LOCATION '/gluoncv';
Parquet Table
create external table gluoncv_parquet (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)