Created on 06-16-2018 02:38 PM - edited 08-17-2019 07:09 AM
Using Apache MXNet GluonCV with Apache NiFi for Deep Learning Computer Vision
Source: https://github.com/tspannhw/OpenSourceComputerVision/
Gluon and Apache MXNet have been great for deep learning, especially for newbies like me. It got even better! They added a deep learning toolkit, GluonCV, that is easy to use and ships with a number of pre-trained models for common computer vision use cases. I took a simple, well-documented example and tweaked it to save the final image and send some JSON details via MQTT to Apache NiFi.
This may sound familiar: https://community.hortonworks.com/articles/198912/ingesting-apache-mxnet-gluon-deep-learning-results...
GluonCV makes this even easier! Let's check it out. Again, let's take a simple Python example, tweak it, run it via a shell script, and send the results over MQTT.
Python Code: https://github.com/tspannhw/UsingGluonCV/tree/master
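The linked script is not reproduced here, so the following is only a minimal sketch of the reporting step, assuming the paho-mqtt client library. The function names, the topic name "gluoncv", and the broker host and port are hypothetical placeholders, and the system metrics from the real record (battery, cpu, memory, diskusage) are left out to keep the sketch on the standard library.

```python
import json
import time
import uuid


def build_payload(image_shape, started_at, host):
    """Assemble a dict shaped like the example JSON record (system metrics omitted)."""
    end = time.time()
    # Timestamp plus UUID, mirroring the id format seen in the example record.
    unique_id = "{}_{}".format(
        time.strftime("%Y%m%d%H%M%S", time.gmtime(end)), uuid.uuid4()
    )
    return {
        "imgname": "images/gluoncv_image_{}.jpg".format(unique_id),
        "host": host,
        "shape": str(image_shape),
        "end": str(end),
        "te": str(end - started_at),  # elapsed seconds, sent as a string
        "systemtime": time.strftime("%m/%d/%Y %H:%M:%S", time.localtime(end)),
        "id": unique_id,
    }


def publish_payload(payload, topic="gluoncv", hostname="localhost", port=1883):
    """Send one JSON message to an MQTT broker (topic and broker are assumptions)."""
    # Imported here so build_payload works even when paho-mqtt is not installed.
    import paho.mqtt.publish as publish

    publish.single(topic, payload=json.dumps(payload), hostname=hostname, port=port)
```

A caller would record `started_at = time.time()` before running detection, then call `publish_payload(build_payload(x.shape, started_at, host))` once the annotated image is saved.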
This is the Saved Annotated Figure
A simple Apache NiFi flow to ingest MQTT data from the GluonCV example Python script and store it in Hive, Parquet, and HBase.
A simple flow:
- ConsumeMQTT
- InferAvroSchema
- RouteOnContent
- MergeRecord (convert batches of JSON to a single Avro file)
- ConvertAvroToORC
- PutHDFS
- PutParquet
- PutHBaseRecord
Again, Apache NiFi generates a schema for us by examining the data. There's a really cool project coming out of New Jersey with advanced schema generation that looks at tables; I'll report on that later. We take the generated schema, save it to the Schema Registry, and are ready to merge records. One thing you may want to do is make the fields nullable by changing plain types from "type": "string" to "type": ["string","null"].
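If the schema has many fields, that nullable tweak can be scripted rather than edited by hand. This is a small sketch using only the Python standard library; the helper name `make_fields_nullable` is hypothetical, and the three-field schema is an abbreviated stand-in for the full inferred schema.

```python
import copy
import json


def make_fields_nullable(schema):
    """Return a copy of an Avro record schema whose fields also accept null."""
    result = copy.deepcopy(schema)
    for field in result.get("fields", []):
        field_type = field["type"]
        if isinstance(field_type, str) and field_type != "null":
            # Plain type such as "string" becomes the union ["string", "null"].
            field["type"] = [field_type, "null"]
        elif isinstance(field_type, list) and "null" not in field_type:
            # Existing union without null gets null appended.
            field["type"] = field_type + ["null"]
    return result


# Abbreviated version of the inferred gluoncv schema, for illustration only.
schema = {
    "type": "record",
    "name": "gluoncv",
    "fields": [
        {"name": "imgname", "type": "string"},
        {"name": "battery", "type": "int"},
        {"name": "cpu", "type": "double"},
    ],
}

nullable = make_fields_nullable(schema)
print(json.dumps(nullable, indent=2))
```

The original schema is left untouched, and running the helper a second time changes nothing, so it is safe to apply to a schema that has already been partly edited.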
Schema
{
  "type": "record",
  "name": "gluoncv",
  "fields": [
    { "name": "imgname", "type": "string", "doc": "Type inferred from '\"images/gluoncv_image_20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512.jpg\"'" },
    { "name": "host", "type": "string", "doc": "Type inferred from '\"HW13125.local\"'" },
    { "name": "shape", "type": "string", "doc": "Type inferred from '\"(1, 3, 512, 910)\"'" },
    { "name": "end", "type": "string", "doc": "Type inferred from '\"1529094800.88097\"'" },
    { "name": "te", "type": "string", "doc": "Type inferred from '\"2.4256367683410645\"'" },
    { "name": "battery", "type": "int", "doc": "Type inferred from '100'" },
    { "name": "systemtime", "type": "string", "doc": "Type inferred from '\"06/15/2018 16:33:20\"'" },
    { "name": "cpu", "type": "double", "doc": "Type inferred from '23.2'" },
    { "name": "diskusage", "type": "string", "doc": "Type inferred from '\"112000.8 MB\"'" },
    { "name": "memory", "type": "double", "doc": "Type inferred from '65.8'" },
    { "name": "id", "type": "string", "doc": "Type inferred from '\"20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512\"'" }
  ]
}
Example JSON
{"imgname": "images/gluoncv_image_20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad.jpg", "host": "HW13125.local", "shape": "(1, 3, 512, 910)", "end": "1529094976.237143", "te": "1.8907802104949951", "battery": 100, "systemtime": "06/15/2018 16:36:16", "cpu": 29.3, "diskusage": "112008.6 MB", "memory": 66.5, "id": "20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad"}
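Before wiring up the flow, it can help to confirm that a sample message really matches the inferred types. The following is a rough standard-library sketch, not a full Avro validator; the mapping table and the helper name `mismatched_fields` are hypothetical, and only the three Avro types that appear in this schema are covered.

```python
import json

# Field name -> Avro type, copied from the inferred schema above.
FIELDS = {
    "imgname": "string", "host": "string", "shape": "string", "end": "string",
    "te": "string", "battery": "int", "systemtime": "string", "cpu": "double",
    "diskusage": "string", "memory": "double", "id": "string",
}

# Python types accepted for each Avro type (JSON may deliver a double as an int).
AVRO_TO_PYTHON = {"string": (str,), "int": (int,), "double": (float, int)}


def mismatched_fields(record, fields):
    """Return names of fields that are missing or whose value has the wrong type."""
    problems = []
    for name, avro_type in fields.items():
        value = record.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, AVRO_TO_PYTHON[avro_type]):
            problems.append(name)
    return problems


record = json.loads(
    '{"imgname": "images/gluoncv_image_20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad.jpg", '
    '"host": "HW13125.local", "shape": "(1, 3, 512, 910)", "end": "1529094976.237143", '
    '"te": "1.8907802104949951", "battery": 100, "systemtime": "06/15/2018 16:36:16", '
    '"cpu": 29.3, "diskusage": "112008.6 MB", "memory": 66.5, '
    '"id": "20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad"}'
)
print(mismatched_fields(record, FIELDS))
```

Note that "end" and "te" arrive as quoted strings, which is why the schema infers them as string rather than double.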
Table Generated
CREATE EXTERNAL TABLE IF NOT EXISTS gluoncv (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS ORC
LOCATION '/gluoncv';
Parquet Table
CREATE EXTERNAL TABLE gluoncv_parquet (imgname STRING, host STRING, shape STRING, `end` STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS PARQUET
LOCATION '/gluoncvpar';
Reference:
https://gluon-cv.mxnet.io/build/examples_detection/index.html
https://medium.com/apache-mxnet/gluoncv-deep-learning-toolkit-for-computer-vision-9218a907e8da