Community Articles

TimothySpann · ‎02-27-2018

This is for people preparing to attend my talk on Deep Learning at DataWorks Summit Berlin 2018 (https://dataworkssummit.com/berlin-2018/#agenda) on Thursday, April 19, 2018 at 11:50 AM Berlin time.

See: https://community.hortonworks.com/content/kbentry/174399/apache-deep-learning-101-using-apache-mxnet...

To run analytics and provide fast SQL access to the Inception results that Apache MXNet generates from our images, we need to land them in Apache Hive transactional tables. We will use the Apache NiFi PutHiveStreaming processor to insert data into our ACID table at a rapid rate. This only works if the table is transactional and stored as Apache ORC; see the DDL below. You must also be running HDP 2.6 or later with ACID turned on.

Tip: In HDP 2.6.4, you need to create and work with Apache Hive ACID tables through Hive itself. In Apache Zeppelin, do not use %sql, since that is Apache Spark; use %jdbc(hive), which is Apache Hive. The configuration below also shows the Hive CBO and Tez enabled.

Ambari View of Hive

SQL DDL

%jdbc(hive) 

CREATE TABLE `inception`(
  uuid STRING,
  top1pct STRING, top1 STRING,
  top2pct STRING, top2 STRING,
  top3pct STRING, top3 STRING,
  top4pct STRING, top4 STRING,
  top5pct STRING, top5 STRING,
  imagefilename STRING,
  runtime STRING)
CLUSTERED BY (top1)
INTO 3 BUCKETS
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
TBLPROPERTIES ('transactional'='true')


%jdbc(hive)
select * from inception
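
Since every column in the table is declared as STRING, numeric analytics need a cast. Here is a minimal sketch of a follow-up query, run in a %jdbc(hive) paragraph like the one above. It assumes that top1pct holds a numeric confidence value stored as text; the article does not show sample rows, so that is a guess. The aliases images and avg_top1pct are illustrative names, not part of the original flow.

```sql
-- Hypothetical follow-up query: count images per top-1 label and
-- average the top-1 confidence for each label.
-- Assumption: top1pct is a number stored as a string, so it is cast first.
SELECT top1,
       COUNT(*) AS images,
       AVG(CAST(top1pct AS DOUBLE)) AS avg_top1pct
FROM inception
GROUP BY top1
ORDER BY images DESC
LIMIT 10
```

If top1pct contains non-numeric text, the cast yields NULL and AVG ignores those rows.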

The PutHiveStreaming processor requires a table that is bucketed and stored as Apache ORC, and you must have permissions on it. See the example DDL above. You also need ACID and LLAP enabled on your Apache Hive cluster.
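
As a rough checklist for these prerequisites, the Apache Hive transactions documentation lists settings along the following lines. This is a sketch, not the author's configuration. On HDP these values are normally managed through Ambari, and some of them may not be settable per session on a given cluster.

```sql
-- Client-side settings described in the Hive transactions documentation.
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;  -- relevant on Hive 1.x; always enforced from Hive 2.0
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;

-- Metastore-side settings (hive-site.xml or Ambari, not per session):
--   hive.compactor.initiator.on=true
--   hive.compactor.worker.threads=1   (any positive number)

-- Check that the table was created as transactional.
SHOW TBLPROPERTIES inception;
DESCRIBE FORMATTED inception;
```

The last two statements should list transactional=true among the table parameters if the DDL above was applied as written.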

Details for PutHiveStreaming Processor

An Example Apache MXNet to Hive Streaming View

The Hive View 2.0 of the Data

Apache Zeppelin Table DDL and Query


Apache Deep Learning 101: Using Apache MXNet with Hive Streaming ACID Tables
