We are struggling to get SparkR to work together with Hive LLAP in HDP 3.0. The https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/integrating-hive/content/hive_hivewarehouse... documentation only covers Python and Java; it says nothing about R. Is R no longer supported? If so, what other options do we have, given that we have a lot of SparkR code accessing Hive today?
The problem I get is that the SparkR code tries to access the HDFS files for the Hive databases and tables directly, as if the LLAP connector were not there at all. So basically, we get the following error:
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table addresstype. java.security.AccessControlException: Permission denied: user=<username>, access=EXECUTE, inode="/apps/hive":hdfs:hdfs:drwx------
And that is correct: the user I'm running as doesn't have, and shouldn't have, permissions on that folder. Is anybody using SparkR together with Hive LLAP in HDP 3.0?