Created on 10-02-2015 04:47 PM
Sharing the steps to make Hive UDFs, UDAFs and UDTFs work natively with SparkSQL.
1- Open spark-shell with the Hive UDF jar as a parameter:
spark-shell --jars path-to-your-hive-udf.jar
2- From spark-shell, declare a Hive context and create the functions:
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("""create temporary function balance as 'com.github.gbraccialli.hive.udf.BalanceFromRechargesAndOrders'""")
3- From spark-shell, use your UDFs directly in SparkSQL:
sqlContext.sql("""
  create table recharges_with_balance_array as
  select reseller_id, phone_number, phone_credit_id, date_recharge, phone_credit_value,
         balance(orders, 'date_order', 'order_value', reseller_id, date_recharge, phone_credit_value) as balance
  from orders
""")
PS: I found some issues using UDTFs with Spark 1.3, which were fixed in Spark 1.4.1. I tested UDFs, UDAFs and UDTFs; all of them worked properly, with the same SQL statements and the same results.
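When the jar carries more than one function, step 2 can be scripted instead of typed once per function. The following is only a minimal sketch: the helper `createTempFunctionSql` and the `udfs` list are hypothetical names introduced here for illustration, and only the `balance` entry comes from the steps above.

```scala
// Sketch only: createTempFunctionSql and udfs are hypothetical names,
// not part of the original steps or of any Spark API.

// Build the Hive DDL statement that registers one temporary function.
def createTempFunctionSql(name: String, className: String): String =
  s"create temporary function $name as '$className'"

// Function name -> implementing class inside the jar passed with --jars.
// Only this first entry is taken from the post; add your own below it.
val udfs = Seq(
  "balance" -> "com.github.gbraccialli.hive.udf.BalanceFromRechargesAndOrders"
)

// One DDL statement per function.
val statements = udfs.map { case (name, cls) => createTempFunctionSql(name, cls) }

// Print the statements so they can be reviewed before running them.
statements.foreach(println)

// Inside spark-shell, with sqlContext declared as in step 2, run them with:
// statements.foreach(stmt => sqlContext.sql(stmt))
```

The last line is left commented out so the snippet also runs outside spark-shell; uncomment it once `sqlContext` exists in your session.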
Created on 11-11-2015 03:09 PM
@Guilherme Braccialli thanks for trying this. I am just starting my Spark journey, but it seems that any time I try something in Zeppelin or Jupyter I keep hitting different issues, so I guess I should just stick with the CLI for now. I will still give demos in Zeppelin at customer sites, but I will know the limitations of the product for now.
Created on 11-12-2015 08:11 PM
@Guilherme Braccialli Just tried your posted steps, and everything worked great. I had problems doing the mvn build using your repo, but that's not an issue for me. I now have the template idea of how to interact with Hive and Spark. Thanks for the post, very useful!
Created on 04-08-2016 08:14 AM
Well, I tried create temporary function in beeline and it failed. With create function, the function was created and I can run desc function on it, but when I try to use it, it says the function cannot be found. Do you have the same problem? I'm trying to let everyone connected to my thriftserver have access to a UDF that I deployed. Do you have any suggestions?
Created on 05-25-2017 07:15 PM
Hi @Guilherme Braccialli, so you did not run into this issue?
https://issues.apache.org/jira/browse/SPARK-20033
Thank you.