Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

HPL-SQL : Function creation issues in Hive 2.0

avatar
When running with hplsql ,below two scenarios are failing.My goal is to create a function and test it.
We are using Hive 2.0 version.
************************************************************************
Scenario 1 : 
************************************************************************
Here is the function which I'm trying to create and is failing.
dg_1 and dg_2 tables are already created in hive, by using the above function we are trying to create dg_join3 table and populate some data.
If you run these queries one by one using hplsql then they are created in Hive and are visible.
CREATE FUNCTION my_function()
RETURNS STRING
BEGIN
DROP TABLE default.dg_join3 ;
CREATE TABLE default.dg_join3 (name string, phone string) ;
INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ;
SELECT * FROM default.dg_join3 ;
END;
************************************************************************
Scenario 2 : 
************************************************************************
hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"
************************************************************************
Observations : 
If you run these queries one by one using hplsql then tables are created in Hive and are visible.
hplsql -e "DROP TABLE default.dg_join3 ;"
hplsql -e "CREATE TABLE default.dg_join3 (name string, phone string) ;"
hplsql -e "SELECT * FROM default.dg_join3 ;"
************************************************************************
Log of scneario 2 : 
************************************************************************
hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"
WARNING: Use "yarn jar" to launch YARN applications.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/appdata/hwx/apache-hive-2.0.0-bin/lib/hive-jdbc-2.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/appdata/hwx/apache-hive-2.0.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/appdata/hwx/hdp/2.4.2.0-258/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Open connection: jdbc:hive2://I-P:10000 (189 ms)
Starting SQL statement
Unhandled exception in HPL/SQL
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:258)
        at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:244)
        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:247)
        at org.apache.hive.hplsql.Conn.executeSql(Conn.java:110)
        at org.apache.hive.hplsql.Exec.executeSql(Exec.java:590)
        at org.apache.hive.hplsql.Stmt.insertSelect(Stmt.java:570)
        at org.apache.hive.hplsql.Stmt.insert(Stmt.java:556)
        at org.apache.hive.hplsql.Exec.visitInsert_stmt(Exec.java:1094)
        at org.apache.hive.hplsql.Exec.visitInsert_stmt(Exec.java:51)
        at org.apache.hive.hplsql.HplsqlParser$Insert_stmtContext.accept(HplsqlParser.java:8154)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:987)
        at org.apache.hive.hplsql.Exec.visitStmt(Exec.java:51)
        at org.apache.hive.hplsql.HplsqlParser$StmtContext.accept(HplsqlParser.java:891)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.HplsqlBaseVisitor.visitBlock(HplsqlBaseVisitor.java:28)
        at org.apache.hive.hplsql.HplsqlParser$BlockContext.accept(HplsqlParser.java:412)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:70)
        at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:895)
        at org.apache.hive.hplsql.Exec.visitProgram(Exec.java:51)
       at org.apache.hive.hplsql.HplsqlParser$ProgramContext.accept(HplsqlParser.java:355)
        at org.antlr.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:42)
        at org.apache.hive.hplsql.Exec.run(Exec.java:753)
        at org.apache.hive.hplsql.Exec.run(Exec.java:729)
        at org.apache.hive.hplsql.Hplsql.main(Hplsql.java:23)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:112)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:181)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:419)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:406)
        at sun.reflect.GeneratedMethodAccessor110.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
        at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
        at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
        at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
        at com.sun.proxy.$Proxy20.executeStatementAsync(Unknown Source)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:276)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1317)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1302)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10537)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10485)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3822)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3601)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8965)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8920)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9765)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9658)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10131)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:329)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10142)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:211)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:316)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1189)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1183)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:110)
        ... 26 more
1 ACCEPTED SOLUTION

avatar

Below is the code and output from function:-

CREATE FUNCTION my_function() RETURNS STRING BEGIN DROP TABLE default.dg_join3 ; CREATE TABLE default.dg_join3 (name string, phone string) ; INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ; SELECT * FROM default.dg_join3 ; END; -- Invoke the function PRINT 'world, come with me now'; PRINT my_function();

[root@sandbox hplsql-0.3.17]# ./hplsql -f my_function.txt world, come with me now 16/06/28 09:34:15 INFO jdbc.Utils: Supplied authorities: localhost:10000 16/06/28 09:34:15 INFO jdbc.Utils: Resolved authority: localhost:10000 Open connection: jdbc:hive2://localhost:10000 (414 ms) Starting SQL statement SQL statement executed successfully (387 ms) Starting SQL statement SQL statement executed successfully (1.29 sec) Starting SQL statement SQL statement executed successfully (9.85 sec) Starting query Query executed successfully (880 ms) null

View solution in original post

12 REPLIES 12

avatar

Could you please verify your create table statement of dg_1 and dg_2?

As below statement that executed successfully at my end without any issue.

[root@sandbox hplsql-0.3.17]# ./hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"

16/06/28 09:01:13 INFO jdbc.Utils: Supplied authorities: localhost:10000

16/06/28 09:01:13 INFO jdbc.Utils: Resolved authority: localhost:10000

Open connection: jdbc:hive2://localhost:10000 (438 ms) Starting

SQL statement SQL statement executed successfully (11.86 sec)

avatar

Below is the code and output from function:-

CREATE FUNCTION my_function() RETURNS STRING BEGIN DROP TABLE default.dg_join3 ; CREATE TABLE default.dg_join3 (name string, phone string) ; INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ; SELECT * FROM default.dg_join3 ; END; -- Invoke the function PRINT 'world, come with me now'; PRINT my_function();

[root@sandbox hplsql-0.3.17]# ./hplsql -f my_function.txt world, come with me now 16/06/28 09:34:15 INFO jdbc.Utils: Supplied authorities: localhost:10000 16/06/28 09:34:15 INFO jdbc.Utils: Resolved authority: localhost:10000 Open connection: jdbc:hive2://localhost:10000 (414 ms) Starting SQL statement SQL statement executed successfully (387 ms) Starting SQL statement SQL statement executed successfully (1.29 sec) Starting SQL statement SQL statement executed successfully (9.85 sec) Starting query Query executed successfully (880 ms) null

avatar
Contributor

@Avinash P

It seems you have installed hive 2.0 on HDP ? How have you done ?

avatar

I am using HDP2.4 sandbox with HPL/SQL 0.3.17.

avatar

Hi @Avinash P. Can you please confirm that table dg_2 does have a column called 'phone'?

avatar

Thank you all for responding 🙂

@bpreachuk- Yes there is a column called "phone" in "dg_2" table. Here are the table definitions

Current Hive Version - Hive 2.0.0

DDL of the tables dg_1 and dg_2 are as follows.

create table dg_1 (id int, name text);

create table dg_2(id int, phone text);

@Mukesh Kumar The first one which you provided working flawlessly and the query which you pasted for JOIN operation is throwing below errors for us.

hplsql -e "INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a join default.dg_2 b on a.id = b.id ;"

Error while compiling statement: FAILED: SemanticException [Error 10004]: Line 1:50 Invalid table alias or column reference 'b': (possible column names are: id, name)

And we want to know what is the procedure to make the functions permanent in Hive.

avatar

Hi @Avinash P. With HPL/SQL your functions do not get stored in the Hive metastore. It behaves a little different than conventional Relational databases.

There are 3 ways to execute the functions and/or make the functions permanent in Hive.

1. Add the functions and stored procedures to the global .hplsqlrc file

2. Include the functions in your code by using the INCLUDE statement (or add INCLUDE statements to the .hplsqlrc file)

3. Have the function/stored procedure code duplicated inside each of your scripts (only for testing, never the best way).

http://www.hplsql.org/udf-sproc

avatar

Thank you Chuk for responding...We are able to register a simple function and run successfully. When creating the function my_function with two simple tables and are running into issue as provided initially.[All the DDL and function definition are provided]

Appreciate if you could you help in recreating this function in your lab and share the results.

avatar

Hi @Avinash . Sorry for the delay in responding. I was able to do what you were attempting - please see below for explanation.

I created the 2 tables (dg_1 and dg_2) and loaded them with data. By the way they needed to use datatype string - text won't work.

create table dg_1 (id int, name string);
create table dg_2(id int, phone string);
insert into dg_1 values (1, 'Bill P');
insert into dg_2 values (1, '952-555-5555');

Now, you were attempting to create a function that processes these 2 tables and drops/inserts data into a 3rd table. Functions aren't really designed for this type of processing. Functions are User-Defined Functions (UDFs) - designed to do simple processing & return a value. Designed to be executed as part of a SQL expression. http://www.hplsql.org/create-function

You should accomplish your task with a stored procedure instead. Here is a code sample that I just run inline (for ease of development).

CREATE procedure load_dg_3
as 
BEGIN
  DROP TABLE if exists default.dg_join3 ;
  CREATE TABLE default.dg_join3 (name string, phone string) ;
  INSERT INTO TABLE default.dg_join3 SELECT a.name, b.phone from default.dg_1 a,default.dg_2 b where a.id = b.id ;
  SELECT * FROM default.dg_join3 ;
  print 'Completed...'; 
END;  

execute load_dg_3;

> hplsql -f "load_dg_3"

And the output.

16/06/30 22:15:11 INFO jdbc.Utils: Supplied authorities: localhost:10000
16/06/30 22:15:11 INFO jdbc.Utils: Resolved authority: localhost:10000
Open connection: jdbc:hive2://localhost:10000 (233 ms)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting SQL statement
SQL statement executed successfully (x.xx sec)
Starting query
Query executed successfully (xx ms)
Bill P	952-555-5555
Completed...

That should be what you're looking for. Hope this helps.