Explorer
Posts: 22
Registered: 02-17-2015

[CDH 5.3] Hive UDF Sentry problem

I have created a simple Java class as a UDF:

 

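import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.Text;

@UDFType(deterministic = true)
@Description(
    name = "bla",
    value = "returns hello + input",
    extended = "Example usage: SELECT bla('sap');"
)
// Declared public so that Hive can access the class and evaluate() via reflection.
public class MyHelloUdf extends UDF {

    // Returns "hello" followed by the input, or null for a null input.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text("hello" + input.toString());
    }
}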

I built a jar using mvn assembly:single, which created test-1-jar-with-dependencies.jar.

 

We have a CDH 5.3 cluster with Sentry and Kerberos. So I followed the Cloudera procedure:

 

On the HiveServer2 host, created the directory /tmp/udfs/ on the local filesystem and put the jar there (permissions set to 777).

Added this directory as the Hive Auxiliary JARs Directory through Cloudera Manager (CM).

Put the jar in my home folder on HDFS, again with permissions 777.

Updated sentry-provider.ini, although I am an 'admin'.

 

CREATE FUNCTION bla AS 'MyHelloUdf' USING JAR 'hdfs:///user/alexanderbij/test-1-jar-with-dependencies.jar';

 

In the HiveServer2 log I see:

2015-02-26 13:43:42,077 INFO org.apache.hadoop.hive.ql.Driver: Starting command: CREATE FUNCTION bla AS 'MyHelloUdf' USING JAR 'hdfs:///user/alexanderbij/test-udfs-1-jar-with-dependencies.jar'
...
2015-02-26 13:50:36,716 INFO SessionState: converting to local hdfs:///user/alexanderbij/test-udfs-1-jar-with-dependencies.jar
2015-02-26 13:50:36,732 INFO SessionState: Added /tmp/5d79e1db-7099-453b-82c4-351d07bd4d49_resources/test-udfs-1-jar-with-dependencies.jar to class path
2015-02-26 13:50:36,732 INFO SessionState: Added resource: /tmp/5d79e1db-7099-453b-82c4-351d07bd4d49_resources/test-udfs-1-jar-with-dependencies.jar

 

When I query:

describe function extended bla;

I get the results from the @Description annotation. That works, so the jar seems to be available.

 

But when I want to use the function:

 

select bla('helpme');

 

HiveServer2 log:

2015-02-26 14:20:47,827 INFO hive.ql.Context: New scratch dir is hdfs://master01.paymentslab.int:8020/tmp/hive-hive/hive_2015-02-26_14-20-47_680_3568001753207544409-1
2015-02-26 14:20:47,830 INFO org.apache.hadoop.hive.ql.parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
2015-02-26 14:20:47,831 INFO hive.ql.Context: New scratch dir is hdfs://master01.paymentslab.int:8020/tmp/hive-hive/hive_2015-02-26_14-20-47_680_3568001753207544409-1
2015-02-26 14:20:48,100 ERROR org.apache.hadoop.hive.ql.Driver: FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments ''helpme'': The UDF implementation class 'MyHelloUdf' is not present in the class path
org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:7 Wrong arguments ''helpme'': The UDF implementation class 'MyHelloUdf' is not present in the class path
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:1136)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:132)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory.genExprNode(TypeCheckProcFactory.java:184)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:9752)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:9708)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3348)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3144)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:8338)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8293)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9124)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9377)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:206)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:437)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1026)
at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1019)
at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:100)
at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:173)
at org.apache.hive.service.cli.session.HiveSessionImpl.runOperationWithLogCapture(HiveSessionImpl.java:715)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:370)
at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:357)
at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:237)
at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:392)
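
The error above says that the class 'MyHelloUdf' could not be found on the class path. To rule out a packaging problem independently of Hive, one option is to check the jar with plain JDK classes. The following is only a minimal sketch: UdfJarCheck and its two methods are hypothetical names chosen for illustration and are not part of Hive or CDH. It first checks that the jar contains the expected .class entry, then tries to load the class with an isolated class loader. The load step can report false even for a correct jar if Hive's UDF base class is neither bundled in the jar nor on the JVM class path.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

public class UdfJarCheck {

    /** Returns true if the jar has a .class entry for the given fully qualified class name. */
    public static boolean jarContainsClass(Path jar, String className) throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entryName) != null;
        }
    }

    /**
     * Tries to load the class from the jar without initializing it.
     * Returns false if the class is missing, malformed, or if a class it
     * depends on (for example its superclass) cannot be resolved.
     */
    public static boolean canLoadFromJar(Path jar, String className) throws IOException {
        URL[] urls = { jar.toUri().toURL() };
        try (URLClassLoader loader =
                 new URLClassLoader(urls, UdfJarCheck.class.getClassLoader())) {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java UdfJarCheck <path-to-jar> <class-name>");
            return;
        }
        Path jar = Paths.get(args[0]);
        String className = args[1];
        System.out.println("entry present: " + jarContainsClass(jar, className));
        System.out.println("loadable:      " + canLoadFromJar(jar, className));
    }
}
```

It could be run against a local copy of the jar, for example: java UdfJarCheck /tmp/udfs/test-1-jar-with-dependencies.jar MyHelloUdf. If both checks pass, the jar itself is probably fine and the problem is more likely in how HiveServer2 picks it up.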

 

I did not copy the jar to any worker nodes, only to the HiveServer2 host and to HDFS.

 

What did I miss?

Cloudera Employee
Posts: 16
Registered: 01-07-2014

Re: [CDH 5.3] Hive UDF Sentry problem

Hi,

 

Does it work without Sentry enabled?

 

Thanks,

Mohit

 

Explorer
Posts: 22
Registered: 02-17-2015

Re: [CDH 5.3] Hive UDF Sentry problem

I'm not in a position to try that at the moment, but I'll ask my colleagues.

Explorer
Posts: 10
Registered: 07-04-2016

Re: [CDH 5.3] Hive UDF Sentry problem

Hi,

 

After copying the jar to the HS2 local directory (the Aux dir path) and granting 777 to the local and HDFS URIs from Sentry, UDF creation still didn't work. A full Hive service restart was then attempted, and after that the UDF worked.

Explorer
Posts: 22
Registered: 02-17-2015

Re: [CDH 5.3] Hive UDF Sentry problem

Thanks, although I posted this more than a year ago. I assume it's working properly now.

The docs explain the process very clearly:
http://www.cloudera.com/documentation/enterprise/5-6-x/topics/cm_mc_hive_udf.html

As noted at the bottom of that page, a restart of HS2 is mandatory.