Community Articles

TimothySpann · ‎03-12-2017

There are two great additions you can make to your existing Hive installation. The first is HPL/SQL, which brings stored procedure programming to the Hadoop world. The second is HiveMall, which brings advanced functions and machine learning to your Hive queries.

HPL/SQL

HPL/SQL (Hybrid Procedural SQL on Hadoop) is included in Hive 2.0 and will be included in Hive 2.1 on HDP 2.6. You can also download and install it manually now. For developers coming from Oracle and SQL Server, its procedures will feel very familiar, and it lets you port much of your existing PL/SQL and T-SQL code over to Hive.

HPL/SQL gives you another interface to Hive and Hadoop. It will be included in future Hadoop releases and tied into the very fast Hive LLAP 2.1.
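To give a sense of how familiar the syntax is, here is a minimal sketch of a user-defined function and a cursor-style loop in HPL/SQL, written in the style of the HPL/SQL documentation. The function name `hello` and the table `src` with columns `s` and `d` are placeholders for illustration, not objects from this article.

```sql
-- Hypothetical example: hello, src, s and d are placeholder names.
CREATE FUNCTION hello(text STRING)
  RETURNS STRING
BEGIN
  RETURN 'Hello, ' || text || '!';
END;

-- Loop over the rows of a Hive query, much like a PL/SQL cursor FOR loop.
FOR item IN (
  SELECT s, d FROM src LIMIT 10
)
LOOP
  PRINT item.s || '|' || item.d || '|' || hello(item.s);
END LOOP;
```

Saved to a file, a script like this is run with `./hplsql -f` in the same way as any other HPL/SQL script.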

To Run A Stored Procedure

cd hplsql-0.3.17 
./hplsql -f proc.pl

HPL/SQL Stored Procedure Example

create procedure fn_test1 (VarOne char(25))
BEGIN
  SET plhql
  execute immediate 'set hive.exec.dynamic.partition.mode=nonstrict';
  execute immediate 'set hive.exec.dynamic.partition=true';
  execute immediate 'SET hive.execution.engine=tez';
  print VarOne;
  set VarOne = Upper(VarOne);
  if (VarOne not in ('STUFF', 'STUFF2'))
  BEGIN
    print 'Bad Data';
    RETURN -1;
  END
  print 'Good Data';
END;

print call fn_test1('STUFF');

./hplsql -f proc.pl
Call
17/03/09 20:04:03 INFO jdbc.Utils: Supplied authorities: localhost:10000
17/03/09 20:04:03 INFO jdbc.Utils: Resolved authority: localhost:10000
Open connection: jdbc:hive2://localhost:10000 (266 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (1 ms)
*STUFF*
Good Data

Apache HiveMall

HiveMall was built by developers from Treasure Data, NTT and Hortonworks.

set hivevar:hivemall_jar=hdfs:///apps/hivemall/hivemall-with-dependencies.jar; 
source /opt/demo/define-all-as-permanent.hive;

HiveMall is a scalable machine learning library built as a collection of Hive UDFs that you can run through Hive, Spark and Pig. It brings very cool processing to your Hive queries and to your Zeppelin, Pig and Spark code. You will be able to combine HiveMall machine learning with stored procedures on fast, in-memory Hive LLAP. This is revolutionary. You can also run all of this from near real-time Apache NiFi streams.
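As a rough sketch of what this looks like in practice, once the functions are registered a model can be trained with an ordinary Hive query. The example assumes a hypothetical table `training` with a `features` array column and a numeric `label` column, and a hypothetical output table `lr_model`. `hivemall_version`, `logress` and `add_bias` are HiveMall functions, but function names have changed between releases, so check them against the version you installed.

```sql
-- Confirm that the HiveMall functions are registered in the current database.
SELECT hivemall_version();

-- Train a logistic regression model.
-- training and lr_model are placeholder names, not tables from this article.
CREATE TABLE lr_model AS
SELECT feature, AVG(weight) AS weight
FROM (
  -- logress emits (feature, weight) rows; add_bias appends a bias term.
  SELECT logress(add_bias(features), label) AS (feature, weight)
  FROM training
) t
GROUP BY feature;
```

Because training is expressed as a plain query, the same statement could be issued from an HPL/SQL procedure with `execute immediate`.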

Cloudera Community

Adding and Using HPL/SQL and HiveMall with Hive: Machine Learning, Functions and Stored Procedures
