Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

HPL-SQL : Function creation issues in Hive 2.0

Solved Go to solution
Highlighted

HPL-SQL : Function creation issues in Hive 2.0

When running with hplsql ,below two scenarios are failing.My goal is to create a function and test it.
We are using Hive 2.0 version.
************************************************************************
Scenario 1 : 
************************************************************************
Here is the function which I'm trying to create and is failing.
dg_1 and dg_2 tables are already created in hive, by using the above function we are trying to create dg_join3 table and populate some data.
If you run these queries one by one using hplsql then they are created in Hive and are visible.
CREATE FUNCTION my_function()
RETURNS STRING
BEGIN
DROP TABLE default.dg_join3 ;
CREATE TABLE default.dg_join3 (name string, phone string) ;
INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ;
SELECT * FROM default.dg_join3 ;
END;
************************************************************************
Scenario 2 : 
************************************************************************
hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"
************************************************************************
Observations : 
If you run these queries one by one using hplsql then tables are created in Hive and are visible.
hplsql -e "DROP TABLE default.dg_join3 ;"
hplsql -e "CREATE TABLE default.dg_join3 (name string, phone string) ;"
hplsql -e "SELECT * FROM default.dg_join3 ;"
************************************************************************
Log of scneario 2 : 
************************************************************************
hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/appdata/hwx/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/appdata/hwx/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/appdata/hwx/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Open connection: jdbc:hive2://I-P:10000 (189 ms)
Starting SQL statement
Unhandled exception in HPL/SQL
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:247)
        at org.apache.hive.hplsql.Conn.executeSql(Conn.java:110)
        at org.apache.hive.hplsql.Exec.executeSql(Exec.java:590)
        at org.apache.hive.hplsql.Stmt.insertSelect(Stmt.java:570)
        at org.apache.hive.hplsql.Stmt.insert(Stmt.java:556)
        at org.apache.hive.hplsql.Exec.visitInsert_stmt(Exec.java:1094)
        at org.apache.hive.hplsql.Exec.visitInsert_stmt(Exec.java:51)
        at org.apache.hive.hplsql.HplsqlParser$Insert_stmtContext.accept(HplsqlParser.java:8154)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:987)
        at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:51)
        at org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:891)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
        at org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:412)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:895)
        at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:51)
       at org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:355)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
        at org.apache.hive.hplsql.Exec.run(Exec.java:753)
        at org.apache.hive.hplsql.Exec.run(Exec.java:729)
        at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:112)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:406)
        at sun.reflect.GeneratedMethodAccessor110.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
        at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10537)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10485)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3822)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3601)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8965)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8920)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9765)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9658)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10131)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:329)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10142)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
        ... 26 more
1 ACCEPTED SOLUTION

Accepted Solutions

Re: HPL-SQL : Function creation issues in Hive 2.0

Below is the code and output from function:-

CREATE FUNCTION my_function() RETURNS STRING BEGIN DROP TABLE default.dg_join3 ; CREATE TABLE default.dg_join3 (name string, phone string) ; INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ; SELECT * FROM default.dg_join3 ; END; -- Invoke the function PRINT 'world, come with me now'; PRINT my_function();

[root@sandbox hplsql-0.3.17]# ./hplsql -f my_function.txt world, come with me now 16/06/28 09:34:15 INFO jdbc.Utils: Supplied authorities: localhost:10000 16/06/28 09:34:15 INFO jdbc.Utils: Resolved authority: localhost:10000 Open connection: jdbc:hive2://localhost:10000 (414 ms) Starting SQL statement SQL statement executed successfully (387 ms) Starting SQL statement SQL statement executed successfully (1.29 sec) Starting SQL statement SQL statement executed successfully (9.85 sec) Starting query Query executed successfully (880 ms) null

12 REPLIES 12

Re: HPL-SQL : Function creation issues in Hive 2.0

Could you please verify your create table statement of dg_1 and dg_2?

As below statement that executed successfully at my end without any issue.

[root@sandbox hplsql-0.3.17]# ./hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"

16/06/28 09:01:13 INFO jdbc.Utils: Supplied authorities: localhost:10000

16/06/28 09:01:13 INFO jdbc.Utils: Resolved authority: localhost:10000

Open connection: jdbc:hive2://localhost:10000 (438 ms) Starting

SQL statement SQL statement executed successfully (11.86 sec)

Re: HPL-SQL : Function creation issues in Hive 2.0

Below is the code and output from function:-

CREATE FUNCTION my_function() RETURNS STRING BEGIN DROP TABLE default.dg_join3 ; CREATE TABLE default.dg_join3 (name string, phone string) ; INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ; SELECT * FROM default.dg_join3 ; END; -- Invoke the function PRINT 'world, come with me now'; PRINT my_function();

[root@sandbox hplsql-0.3.17]# ./hplsql -f my_function.txt world, come with me now 16/06/28 09:34:15 INFO jdbc.Utils: Supplied authorities: localhost:10000 16/06/28 09:34:15 INFO jdbc.Utils: Resolved authority: localhost:10000 Open connection: jdbc:hive2://localhost:10000 (414 ms) Starting SQL statement SQL statement executed successfully (387 ms) Starting SQL statement SQL statement executed successfully (1.29 sec) Starting SQL statement SQL statement executed successfully (9.85 sec) Starting query Query executed successfully (880 ms) null

Re: HPL-SQL : Function creation issues in Hive 2.0

@Avinash P

It seems you have installed hive 2.0 on HDP ? How have you done ?

Re: HPL-SQL : Function creation issues in Hive 2.0

I am using HDP2.4 sandbox with HPL/SQL 0.3.17.

Re: HPL-SQL : Function creation issues in Hive 2.0

Hi @Avinash P. Can you please confirm that table dg_2 does have a column called 'phone'?

Re: HPL-SQL : Function creation issues in Hive 2.0

Thank you all for responding :)

@bpreachuk- Yes there is a column called "phone" in "dg_2" table. Here are the table definitions

Current Hive Version - Hive 2.0.0

DDL of the tables dg_1 and dg_2 are as follows.

create table dg_1 (id int, name text);

create table dg_2(id int, phone text);

@Mukesh Kumar The first one which you provided working flawlessly and the query which you pasted for JOIN operation is throwing below errors for us.

hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"

Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)

And we want to know what is the procedure to make the functions permanent in Hive.

Re: HPL-SQL : Function creation issues in Hive 2.0

Hi @Avinash P. With HPL/SQL your functions do not get stored in the Hive metastore. It behaves a little different than conventional Relational databases.

There are 3 ways to execute the functions and/or make the functions permanent in Hive.

1. Add the functions and stored procedures to the global .hplsqlrc file

2. Include the functions in your code by using the INCLUDE statement (or add INCLUDE statements to the .hplsqlrc file)

3. Have the function/stored procedure code duplicated inside each of your scripts (only for testing, never the best way).

http://www.hplsql.org/udf-sproc

Re: HPL-SQL : Function creation issues in Hive 2.0

Thank you Chuk for responding...We are able to register a simple function and run successfully. When creating the function my_function with two simple tables and are running into issue as provided initially.[All the DDL and function definition are provided]

Appreciate if you could you help in recreating this function in your lab and share the results.

Re: HPL-SQL : Function creation issues in Hive 2.0

Hi @Avinash . Sorry for the delay in responding. I was able to do what you were attempting - please see below for explanation.

I created the 2 tables (dg_1 and dg_2) and loaded them with data. By the way they needed to use datatype string - text won't work.

create table dg_1 (id int, name string);
create table dg_2(id int, phone string);
insert into dg_1 values (1, 'Bill P');
insert into dg_2 values (1, '952-555-5555');

Now, you were attempting to create a function that processes these 2 tables and drops/inserts data into a 3rd table. Functions aren't really designed for this type of processing. Functions are User-Defined Functions (UDFs) - designed to do simple processing & return a value. Designed to be executed as part of a SQL expression. http://www.hplsql.org/create-function

You should accomplish your task with a stored procedure instead. Here is a code sample that I just run inline (for ease of development).

CREATE procedure load_dg_3
as 
BEGIN
  DROP TABLE if exists default.dg_join3 ;
  CREATE TABLE default.dg_join3 (name string, phone string) ;
  INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ;
  SELECT * FROM default.dg_join3 ;
  print 'Completed...'; 
END;  

execute load_dg_3;

> hplsql -f "load_dg_3"

And the output.

16/06/30 22:15:11 INFO jdbc.Utils: Supplied authorities: localhost:10000
16/06/30 22:15:11 INFO jdbc.Utils: Resolved authority: localhost:10000
Open connection: jdbc:hive2://localhost:10000 (233 ms)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting query
Query executed successfully (xx ms)
Bill P	952-555-5555
Completed...

That should be what you're looking for. Hope this helps.

Don't have an account?
Coming from Hortonworks? Activate your account here