Using Apache MXNet GluonCV with Apache NiFi for Deep Learning Computer Vision

Source: https://github.com/tspannhw/OpenSourceComputerVision/

Gluon and Apache MXNet have been great for deep learning, especially for newbies like me. It got even better! They added GluonCV, a deep learning toolkit that is easy to use and ships a number of great pre-trained models that you can easily apply to general computer vision use cases. So I took a simple, well-documented example and tweaked it to save the final image and send some JSON details via MQTT to Apache NiFi.

This may sound familiar: https://community.hortonworks.com/articles/198912/ingesting-apache-mxnet-gluon-deep-learning-results...

GluonCV makes this even easier! Let's check it out. Again, let's take a simple Python example, tweak it, run it via a shell script, and send the results over MQTT.

See: https://gluon-cv.mxnet.io/build/examples_detection/demo_ssd.html#sphx-glr-build-examples-detection-d...

Python Code: https://github.com/tspannhw/UsingGluonCV/tree/master
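As a rough sketch of the "send some JSON details via MQTT" step, something like the following would work. This is not the exact script from the repository: the function names, the topic name "gluoncv", and the broker host and port are assumptions for illustration, and the cpu, memory, battery and disk values are passed in rather than measured. The publish call uses publish.single from the paho-mqtt package, which is imported lazily so the payload builder runs without it.

```python
import json
import socket
import time
import uuid
from datetime import datetime


def build_payload(shape, started, cpu, memory, battery, diskusage):
    """Build a dict with the same keys as the JSON records shown in this article."""
    now = datetime.now()
    # Hypothetical id format: timestamp plus a random UUID
    unique_id = "{0}_{1}".format(now.strftime("%Y%m%d%H%M%S"), uuid.uuid4())
    end = time.time()
    return {
        "imgname": "images/gluoncv_image_{0}.jpg".format(unique_id),
        "host": socket.gethostname(),
        "shape": str(shape),          # e.g. "(1, 3, 512, 910)"
        "end": str(end),              # epoch seconds when processing finished
        "te": str(end - started),     # elapsed seconds, sent as a string
        "battery": battery,
        "systemtime": now.strftime("%m/%d/%Y %H:%M:%S"),
        "cpu": cpu,
        "diskusage": diskusage,
        "memory": memory,
        "id": unique_id,
    }


def send_payload(payload, topic="gluoncv", hostname="localhost", port=1883):
    """Publish one JSON message. Requires paho-mqtt; topic and broker are assumptions."""
    import paho.mqtt.publish as publish  # lazy import: only needed when sending
    publish.single(topic, payload=json.dumps(payload), hostname=hostname, port=port)


started = time.time()
payload = build_payload((1, 3, 512, 910), started, 23.2, 65.8, 100, "112000.8 MB")
print(json.dumps(payload))
```

Calling send_payload(payload) after the image is saved would push the record to the broker that the ConsumeMQTT processor listens on.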


This is the Saved Annotated Figure

78466-gluonpic.jpg

A simple Apache NiFi flow to ingest MQTT data from the GluonCV example Python script and store it to Hive, Parquet, and HBase.

78467-gluoncvflow1.png

A simple flow:

  1. ConsumeMQTT
  2. InferAvroSchema
  3. RouteOnContent
  4. MergeRecord (convert batches of JSON to a single Avro file)
  5. ConvertAvroToORC
  6. PutHDFS
  7. PutParquet
  8. PutHBaseRecord

Again, Apache NiFi generates a schema for us by examining the data. There's a really cool project coming out of New Jersey that does advanced schema generation by looking at tables; I'll report on that later. We take the generated schema, save it to the Schema Registry, and are ready to merge records. One thing you may want to change is to turn plain types such as "type": "string" into "type": ["string","null"] so that fields can be null.
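If you would rather not edit every field by hand, the nullable-type change can be scripted. Here is a minimal sketch, assuming the inferred schema has been saved as JSON and that only plain primitive types need converting. The function name make_fields_nullable is made up for this example, and the two-field schema is a trimmed stand-in for the full one.

```python
import copy
import json


def make_fields_nullable(schema):
    """Return a copy of an Avro record schema whose plain field types also accept null."""
    result = copy.deepcopy(schema)
    for field in result["fields"]:
        # Only rewrite plain types such as "string"; unions and nested types are left alone
        if isinstance(field["type"], str):
            field["type"] = [field["type"], "null"]
    return result


# Trimmed stand-in for the inferred gluoncv schema
schema = {
    "type": "record",
    "name": "gluoncv",
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "battery", "type": "int"},
    ],
}

print(json.dumps(make_fields_nullable(schema), indent=1))
```

The output can then be pasted into the Schema Registry in place of the inferred schema.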

Schema

{
 "type": "record",
 "name": "gluoncv",
 "fields": [
  {
   "name": "imgname",
   "type": "string",
   "doc": "Type inferred from '\"images/gluoncv_image_20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512.jpg\"'"
  },
  {
   "name": "host",
   "type": "string",
   "doc": "Type inferred from '\"HW13125.local\"'"
  },
  {
   "name": "shape",
   "type": "string",
   "doc": "Type inferred from '\"(1, 3, 512, 910)\"'"
  },
  {
   "name": "end",
   "type": "string",
   "doc": "Type inferred from '\"1529094800.88097\"'"
  },
  {
   "name": "te",
   "type": "string",
   "doc": "Type inferred from '\"2.4256367683410645\"'"
  },
  {
   "name": "battery",
   "type": "int",
   "doc": "Type inferred from '100'"
  },
  {
   "name": "systemtime",
   "type": "string",
   "doc": "Type inferred from '\"06/15/2018 16:33:20\"'"
  },
  {
   "name": "cpu",
   "type": "double",
   "doc": "Type inferred from '23.2'"
  },
  {
   "name": "diskusage",
   "type": "string",
   "doc": "Type inferred from '\"112000.8 MB\"'"
  },
  {
   "name": "memory",
   "type": "double",
   "doc": "Type inferred from '65.8'"
  },
  {
   "name": "id",
   "type": "string",
   "doc": "Type inferred from '\"20180615203319_6e0e5f0b-d2aa-4e94-b7e9-8bb7f29c9512\"'"
  }
 ]
}

Example JSON

{"imgname": "images/gluoncv_image_20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad.jpg", "host": "HW13125.local", "shape": "(1, 3, 512, 910)", "end": "1529094976.237143", "te": "1.8907802104949951", "battery": 100, "systemtime": "06/15/2018 16:36:16", "cpu": 29.3, "diskusage": "112008.6 MB", "memory": 66.5, "id": "20180615203615_c83fed6f-2ec8-4841-97e3-40985f7859ad"}
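Before sending records into the flow, it can help to check on the Python side that a record matches the schema. The sketch below is one way to do that, not part of the original script. It only handles plain primitive types (not the nullable unions), and the names AVRO_TO_PYTHON and mismatched_fields are made up for illustration. The schema and record are trimmed versions of the ones above.

```python
import json

# Avro primitive type -> Python types that json.loads produces for it
AVRO_TO_PYTHON = {
    "string": (str,),
    "int": (int,),
    "long": (int,),
    "float": (int, float),
    "double": (int, float),
    "boolean": (bool,),
}


def mismatched_fields(record, schema):
    """Return names of schema fields that are missing from the record or have the wrong type."""
    bad = []
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        value = record.get(name)
        # bool is a subclass of int in Python, so reject it for non-boolean fields
        if isinstance(value, bool) and avro_type != "boolean":
            bad.append(name)
        elif not isinstance(value, AVRO_TO_PYTHON[avro_type]):
            bad.append(name)
    return bad


schema = {
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "battery", "type": "int"},
        {"name": "cpu", "type": "double"},
        {"name": "te", "type": "string"},
    ]
}
record = json.loads(
    '{"host": "HW13125.local", "te": "1.8907802104949951", "battery": 100, "cpu": 29.3}'
)
print(mismatched_fields(record, schema))
```

An empty list means every field in the schema is present with a compatible type.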

Table Generated

CREATE EXTERNAL TABLE IF NOT EXISTS gluoncv (imgname STRING, host STRING, shape STRING, end STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS ORC
LOCATION '/gluoncv'

Parquet Table

CREATE EXTERNAL TABLE gluoncv_parquet (imgname STRING, host STRING, shape STRING, end STRING, te STRING, battery INT, systemtime STRING, cpu DOUBLE, diskusage STRING, memory DOUBLE, id STRING)
STORED AS PARQUET
LOCATION '/gluoncvpar'
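The column lists in both tables follow directly from the Avro schema, so the same kind of DDL can be derived with a few lines of Python. This is only an illustration of the mapping, not NiFi's own code. The AVRO_TO_HIVE table covers just the primitive types seen here, the function name hive_ddl is made up, and the three-field schema is a trimmed stand-in.

```python
# Avro primitive type -> Hive column type (only the simple cases)
AVRO_TO_HIVE = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def hive_ddl(schema, location, stored_as="ORC", table=None):
    """Build a CREATE EXTERNAL TABLE statement from a flat Avro record schema."""
    columns = ", ".join(
        "{0} {1}".format(field["name"], AVRO_TO_HIVE[field["type"]])
        for field in schema["fields"]
    )
    return "CREATE EXTERNAL TABLE IF NOT EXISTS {0} ({1}) STORED AS {2} LOCATION '{3}'".format(
        table or schema["name"], columns, stored_as, location
    )


# Trimmed stand-in for the gluoncv schema
schema = {
    "name": "gluoncv",
    "fields": [
        {"name": "host", "type": "string"},
        {"name": "battery", "type": "int"},
        {"name": "cpu", "type": "double"},
    ],
}

print(hive_ddl(schema, "/gluoncv"))
print(hive_ddl(schema, "/gluoncvpar", stored_as="PARQUET", table="gluoncv_parquet"))
```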

References:

https://gluon-cv.mxnet.io/

https://gluon-cv.mxnet.io/build/examples_detection/index.html

https://medium.com/apache-mxnet/gluoncv-deep-learning-toolkit-for-computer-vision-9218a907e8da
