Created on 03-12-2017 04:53 PM - edited 08-17-2019 01:51 PM
There are two great additions you can make to your current Hive installation. The first is HPL/SQL, which brings stored procedure programming to the Hadoop world. The second is HiveMall, which brings advanced functions and machine learning to your Hive queries.
HPL/SQL
HPL/SQL (Hybrid Procedural SQL on Hadoop) is included in Hive 2.0 and will be included in Hive 2.1 on HDP 2.6. You can also download and install it manually now. For developers coming from Oracle and SQL Server, these procedures will feel very familiar and will let you port much of your existing PL/SQL and T-SQL code over to Hive.
This gives you another interface to Hive and Hadoop. It will be included in future Hadoop releases and tied into the very fast Hive LLAP 2.1.
HPL/SQL Links
- https://community.hortonworks.com/content/idea/43847/hplsql-make-sql-on-hadoop-more-dynamic.html
- http://www.hplsql.org/connections
- http://www.hplsql.org/cli
- http://www.hplsql.org/download
- http://www.hplsql.org/start
To Run A Stored Procedure
cd hplsql-0.3.17
./hplsql -f proc.pl
HPL/SQL Stored Procedure Example
create procedure fn_test1 (VarOne char(25))
BEGIN
  SET plhql
  execute immediate 'set hive.exec.dynamic.partition.mode=nonstrict';
  execute immediate 'set hive.exec.dynamic.partition=true';
  execute immediate 'SET hive.execution.engine=tez';
  print VarOne;
  set VarOne = Upper(VarOne);
  if (VarOne not in ('STUFF', 'STUFF2'))
  BEGIN
    print 'Bad Data';
    RETURN -1;
  END
  print 'Good Data';
END;
print call fn_test1('STUFF');

./hplsql -f proc.pl

Call
17/03/09 20:04:03 INFO jdbc.Utils: Supplied authorities: localhost:10000
17/03/09 20:04:03 INFO jdbc.Utils: Resolved authority: localhost:10000
Open connection: jdbc:hive2://localhost:10000 (266 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (2 ms)
Starting SQL statement
SQL statement executed successfully (1 ms)
*STUFF*
Good Data
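To try the same workflow end to end, here is a minimal sketch that writes a second, smaller procedure to a file and runs it with the same `-f` option used above. The file name `hello.sql` and the procedure `set_message` are illustrative choices, not part of the original example, and the script assumes it is started from the directory that holds the `hplsql` launcher. The procedure shows an OUT parameter, which the example above does not use.

```shell
#!/bin/sh
# Write a small HPL/SQL script to disk.
# hello.sql and set_message are hypothetical names chosen for this sketch.
cat > hello.sql <<'EOF'
CREATE PROCEDURE set_message(IN name STRING, OUT result STRING)
BEGIN
  SET result = 'Hello, ' || name || '!';
END;

-- Call the procedure and print the value returned through the OUT parameter
DECLARE str STRING;
CALL set_message('world', str);
PRINT str;
EOF

# Run the script only when the launcher is present in the current directory.
if [ -x ./hplsql ]; then
  ./hplsql -f hello.sql
else
  echo "hplsql launcher not found here; script written to hello.sql"
fi
```

Keeping the procedure in its own file makes it easy to rerun after edits and to keep under version control alongside your Hive scripts.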
Apache HiveMall
HiveMall was built by developers from Treasure Data, NTT and Hortonworks.
- https://community.hortonworks.com/articles/67983/apache-hive-with-apache-hivemall.html
- https://www.slideshare.net/HadoopSummit/hivemall-scalable-machine-learning-library-for-apache-hivesp...
- http://hivemall.incubator.apache.org/userguide/getting_started/permanent-functions.html
- http://hivemall.incubator.apache.org/userguide/getting_started/installation.html
- http://github.com/myui/hivemall
- http://hivemall.incubator.apache.org
set hivevar:hivemall_jar=hdfs:///apps/hivemall/hivemall-with-dependencies.jar;
source /opt/demo/define-all-as-permanent.hive;
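After running the two statements above, a quick way to confirm the functions are registered is to call one of them. The sketch below assumes that `beeline` is on your PATH, that HiveServer2 listens on the `localhost:10000` address seen in the HPL/SQL log output, and that the define-all script registered `hivemall_version()`. The file name `hivemall_check.hql` and the `HIVE_URL` variable are hypothetical. If the functions were created in a dedicated database, you may need to qualify the function name with that database.

```shell
#!/bin/sh
# JDBC URL assumed from the earlier log output; override HIVE_URL for your cluster.
HIVE_URL="${HIVE_URL:-jdbc:hive2://localhost:10000}"

# Write a one-line smoke-test query (hivemall_check.hql is a hypothetical name).
cat > hivemall_check.hql <<'EOF'
-- Returns the HiveMall version string if the UDFs are registered
SELECT hivemall_version();
EOF

# Run the query through beeline when it is available.
if command -v beeline >/dev/null 2>&1; then
  beeline -u "$HIVE_URL" -f hivemall_check.hql
else
  echo "beeline not found; query written to hivemall_check.hql"
fi
```

If the query fails with an unknown-function error, the most likely causes are a wrong jar path in `hivemall_jar` or a session connected to a different database than the one the functions were created in.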
HiveMall is a scalable machine learning library built as a collection of Hive UDFs that you can run through Hive, Spark and Pig. It brings machine learning to your Hive queries as well as to your Zeppelin, Pig and Spark code. You can combine HiveMall machine learning with stored procedures on fast, in-memory Hive LLAP, and you can run all of it from near-real-time Apache NiFi streams.