Created 01-05-2016 03:29 PM
I'm experimenting with Groovy scripts as custom UDFs in Hive, and I noticed that the syntax that works in the hive shell for compiling custom UDFs does not work in beeline. Is this feature supported in beeline with a different syntax, or is it not supported at all?
The following works as-is in the hive shell; in beeline it throws an error:
compile `import org.apache.hadoop.hive.ql.exec.UDF \;
import groovy.json.JsonSlurper \;
import org.apache.hadoop.io.Text \;
public class JsonExtract extends UDF {
  public int evaluate(Text a){
    def jsonSlurper = new JsonSlurper() \;
    def obj = jsonSlurper.parseText(a.toString())\;
    return obj.val1\;
  }
} ` AS GROOVY NAMED json_extract.groovy;
hive> CREATE TEMPORARY FUNCTION json_extract as 'JsonExtract';
hive> select json_extract('{"val1": 2}') from date_dim limit 1;
select json_extract('{"val1": 2}') from date_dim limit 1
OK
2
Created 01-06-2016 12:30 AM
This is definitely a Beeline quoting bug in the hive-1.2.x branch - https://github.com/apache/hive/commit/36f7ed781271... I tried this with the latest builds and it worked (though the code has to be collapsed into a single-line compile command), but it does not work with an old Beeline against a new HS2.
Beeline version 2.1.0-SNAPSHOT by Apache Hive
0: jdbc:hive2://localhost:10003> compile `import org.apache.hadoop.hive.ql.exec.UDF \; import groovy.json.JsonSlurper \; import org.apache.hadoop.io.Text \; public class JsonExtract extends UDF { public int evaluate(Text a){ def jsonSlurper = new JsonSlurper() \; def obj = jsonSlurper.parseText(a.toString())\; return obj.val1\; } } ` AS GROOVY NAMED json_extract.groovy;
No rows affected (1.092 seconds)
0: jdbc:hive2://localhost:10003> CREATE TEMPORARY FUNCTION json_extract as 'JsonExtract';
No rows affected (1.421 seconds)
0: jdbc:hive2://localhost:10003> 0: jdbc:hive2://localhost:10003>
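If you keep the Groovy source in a multi-line file, collapsing it by hand for Beeline gets tedious. Below is a rough Python sketch of one way to generate the single-line compile command. The helper name collapse_compile_command is made up for illustration and is not part of Hive or Beeline. It assumes every Groovy statement already ends with a semicolon and that the source contains no backticks or // line comments, since either would break once the lines are joined.

```python
import re


def collapse_compile_command(groovy_source: str, name: str) -> str:
    """Build a one-line Hive `compile` command from multi-line Groovy source.

    Hypothetical helper (not part of Hive). Assumptions: statements end
    with ';', and the source has no backticks or '//' line comments.
    """
    # Join the non-empty lines with single spaces so the command fits on one line.
    one_line = " ".join(
        line.strip() for line in groovy_source.splitlines() if line.strip()
    )
    # Escape each semicolon that is not already escaped, matching the
    # `\;` form used in the compile commands in this thread.
    escaped = re.sub(r"(?<!\\);", r"\\;", one_line)
    return f"compile `{escaped}` AS GROOVY NAMED {name};"


source = """
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
public class Echo extends UDF {
  public Text evaluate(Text a) { return a; }
}
"""
print(collapse_compile_command(source, "echo.groovy"))
```

The printed line can then be pasted at the Beeline prompt. Whether it is accepted still depends on running a Beeline build that includes the quoting fix mentioned above.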
Created 01-06-2016 12:40 AM
Glad that it's not an abandoned feature. Are there more examples and/or docs available? I created a few of my own, but I think we need better examples. Thank you @gopal