Created on 02-23-2018 06:47 PM - edited 08-17-2019 08:46 AM
This is for people preparing to attend my talk on Deep Learning at DataWorks Summit Berlin 2018 (https://dataworkssummit.com/berlin-2018/#agenda) on Thursday, April 19, 2018 at 11:50 AM Berlin time.
This is for running Apache MXNet on an HDF 3.1 node with CentOS 7.
This first flow retrieves images from the picsum.photos API, stores them locally, and then runs some basic processing. The first branch extracts all the metadata we can. The second branch calls our example Inception Apache MXNet Python script for image recognition. The script returns a JSON file that we process with the same code used by the local version of this program.
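To make the JSON hand-off concrete, here is a minimal sketch of how a recognition script could flatten its top predictions into one JSON record for the flow to pick up. It is not the actual Inception script: the function name `build_record`, the field names (`uuid`, `imagefilename`, `runtimestamp`, `top1`, `top1pct`, ...) and the sample predictions are all assumptions made for illustration.

```python
import json
import time
import uuid


def build_record(image_path, predictions, top_n=3):
    """Flatten the top-N (label, probability) pairs into one JSON-ready dict.

    All field names here are hypothetical; match them to your own schema.
    """
    # Highest probability first, then keep only the top N.
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)[:top_n]
    record = {
        "uuid": str(uuid.uuid4()),
        "imagefilename": image_path,
        "runtimestamp": int(time.time()),
    }
    for rank, (label, prob) in enumerate(ranked, start=1):
        record["top{}".format(rank)] = label
        # Store the probability as a percentage rounded to two decimals.
        record["top{}pct".format(rank)] = round(prob * 100.0, 2)
    return record


if __name__ == "__main__":
    # Made-up predictions standing in for real model output.
    sample = [("tabby cat", 0.5), ("tiger cat", 0.25), ("lynx", 0.125)]
    # One JSON document on stdout, so a calling processor can capture it.
    print(json.dumps(build_record("/tmp/images/example.jpg", sample)))
```

Keeping the record flat (no nested arrays) is a design choice that makes the later conversion to a tabular format simpler.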
Once we funnel that out of our process group, we send it to the MXNet processing group, which converts the JSON to Apache Avro and then to Apache ORC for storage in HDFS, where it is used as an external Apache Hive table.
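The JSON-to-Avro step depends on the record matching a schema. As a rough illustration, the sketch below defines an Avro-style record schema as a Python dict and checks that a flat JSON record has every field with a compatible type. The schema name, the field names, and the helper `conforms` are hypothetical, and this is only a stdlib approximation of the idea, not how NiFi or the Schema Registry validates records.

```python
import json

# Hypothetical Avro-style schema; the real one lives in the Schema Registry.
SCHEMA = {
    "type": "record",
    "name": "inception",
    "fields": [
        {"name": "uuid", "type": "string"},
        {"name": "imagefilename", "type": "string"},
        {"name": "top1", "type": "string"},
        {"name": "top1pct", "type": "double"},
    ],
}

# Maps a few Avro primitive type names to the Python types json.loads yields.
_PY_TYPES = {"string": str, "double": float, "int": int, "long": int}


def conforms(record, schema):
    """Return True if every schema field is present with a matching type."""
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if name not in record:
            return False
        if not isinstance(record[name], _PY_TYPES[avro_type]):
            return False
    return True


if __name__ == "__main__":
    raw = ('{"uuid": "abc", "imagefilename": "/tmp/images/example.jpg", '
           '"top1": "tabby cat", "top1pct": 50.0}')
    print(conforms(json.loads(raw), SCHEMA))
```

A check like this can be handy for testing script output locally before wiring it into the flow.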
Our Schema hosted in Hortonworks Schema Registry
Examining The Picture with ExtractMedia...
To Execute Apache MXNet Installed on HDF Node
An Example Image Loaded From the API
Exploring the data with Apache Hive SQL in Apache Zeppelin on HDP 2.6.4