There are two great additions you can make to your current Hive installation. The first is HPL/SQL, which brings stored procedure programming to the Hadoop world. The second is HiveMall, which brings advanced functions and machine learning to your Hive queries.

HPL/SQL

HPL/SQL (Hybrid Procedural SQL on Hadoop) is included in Hive 2.0 and will be included in Hive 2.1 on HDP 2.6. Until then, you can download and install it manually. For developers coming from Oracle and SQL Server, these procedures will feel very familiar and will let you port much of your existing PL/SQL and T-SQL code over to Hive.

This gives you another interface to Hive and Hadoop. It will be included in future Hadoop releases and tied into the very fast Hive 2.1 LLAP.

To Run A Stored Procedure

cd hplsql-0.3.17 
./hplsql -f proc.pl 
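
As a minimal sketch of the two common ways to invoke the command-line tool, the snippet below writes a one-statement script and runs it with -f, then runs the same statement inline with -e. The file name hello.sql and the HPLSQL variable are hypothetical names used only for illustration; the sketch assumes the hplsql launcher sits in the current directory, as in the commands above, and it skips the run if the launcher is not found.

```shell
#!/bin/sh
# Assumed location of the launcher; override with HPLSQL=/path/to/hplsql.
HPLSQL="${HPLSQL:-./hplsql}"

# Hypothetical script name (the article itself uses proc.pl).
script="hello.sql"

# Write a one-statement HPL/SQL script.
printf "%s\n" "PRINT 'Hello from HPL/SQL';" > "$script"

if [ -x "$HPLSQL" ]; then
  # -f runs the statements in a script file.
  "$HPLSQL" -f "$script"
  # -e runs a statement passed inline on the command line.
  "$HPLSQL" -e "PRINT 'Hello from HPL/SQL';"
else
  echo "hplsql launcher not found at $HPLSQL; wrote $script without running it"
fi
```

The -f form suits procedures kept under version control, while -e is handy for quick one-off checks.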

HPL/SQL Stored Procedure Example

create procedure fn_test1 (VarOne char(25))
BEGIN
  SET plhql
  -- Hive session settings applied before any work is done
  execute immediate 'set hive.exec.dynamic.partition.mode=nonstrict';
  execute immediate 'set hive.exec.dynamic.partition=true';
  execute immediate 'SET hive.execution.engine=tez';
  print VarOne;
  -- Normalize the input, then reject anything outside the allowed values
  set VarOne = Upper(VarOne);
  if (VarOne not in ('STUFF', 'STUFF2'))
  BEGIN
    print 'Bad Data';
    RETURN -1;
  END
  print 'Good Data';
END;
print call fn_test1('STUFF');

Running the script:

./hplsql -f proc.pl
Call
17/03/09 20:04:03 INFO jdbc.Utils: Supplied authorities: localhost:10000
17/03/09 20:04:03 INFO jdbc.Utils: Resolved authority: localhost:10000
Open connection: jdbc:hive2://localhost:10000 (266 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (1 ms)
*STUFF*
Good Data 
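
The run above only exercises the accepted-value path. As a hedged sketch, the snippet below writes a trimmed-down variant of the procedure that uses PL/SQL-style IF ... THEN ... ELSE and calls it once with an allowed value and once with a value outside the list, so both branches can be checked. The procedure name check_value, the file name check_value.sql and the HPLSQL variable are hypothetical; the Hive session settings from the original are left out to keep the example short, and the run is skipped if the launcher is not found.

```shell
#!/bin/sh
# Assumed location of the launcher; override with HPLSQL=/path/to/hplsql.
HPLSQL="${HPLSQL:-./hplsql}"

# Hypothetical script name for this sketch.
script="check_value.sql"

# Write the procedure and two calls: one allowed value, one rejected value.
cat > "$script" <<'EOF'
CREATE PROCEDURE check_value (val CHAR(25))
BEGIN
  -- Normalize the input before comparing it
  SET val = UPPER(val);
  IF val NOT IN ('STUFF', 'STUFF2') THEN
    PRINT 'Bad Data';
  ELSE
    PRINT 'Good Data';
  END IF;
END;

CALL check_value('stuff');
CALL check_value('other');
EOF

if [ -x "$HPLSQL" ]; then
  "$HPLSQL" -f "$script"
else
  echo "hplsql launcher not found at $HPLSQL; wrote $script without running it"
fi
```

The first call is intended to reach the 'Good Data' branch and the second the 'Bad Data' branch.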

Apache HiveMall

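HiveMall was developed by engineers from Treasure Data, NTT and Hortonworks. To register its functions in Hive, point the hivemall_jar variable at the HiveMall JAR and source the definition script: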

set hivevar:hivemall_jar=hdfs:///apps/hivemall/hivemall-with-dependencies.jar; 
source /opt/demo/define-all-as-permanent.hive; 

HiveMall is a scalable machine learning library built as a collection of Hive UDFs that you can run through Hive, Spark and Pig. It brings very cool processing to your Hive queries and to your Zeppelin, Pig and Spark code. You will be able to combine HiveMall machine learning with stored procedures on fast, in-memory Hive LLAP. This is revolutionary. You can also run it from near real-time Apache NiFi streams.
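
Because HiveMall functions are ordinary Hive UDFs, a quick way to confirm they are registered is to call one from a query. The sketch below writes a small query file and runs it with the Hive command-line client. It assumes the permanent functions were created in the database the session connects to, and that the client is available as hive on the PATH; the file name hivemall_check.sql and the HIVE_CLI variable are hypothetical. hivemall_version() and sigmoid() are two of the functions that the HiveMall definition script registers.

```shell
#!/bin/sh
# Assumed Hive client; override with HIVE_CLI=/path/to/hive.
HIVE_CLI="${HIVE_CLI:-hive}"

# Hypothetical query file name for this sketch.
query_file="hivemall_check.sql"

cat > "$query_file" <<'EOF'
-- Confirm that the HiveMall functions are registered
SELECT hivemall_version();
-- Call a HiveMall math UDF like any other Hive function
SELECT sigmoid(CAST(0 AS DOUBLE));
EOF

if command -v "$HIVE_CLI" >/dev/null 2>&1; then
  # -f runs the statements in the query file.
  "$HIVE_CLI" -f "$query_file"
else
  echo "$HIVE_CLI not found on PATH; wrote $query_file without running it"
fi
```

If either query fails with an unknown-function error, the definition script was most likely sourced in a different database.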
