
Sparklyr with Hive LLAP - Only see 'default' database



For a long time, we have been running Hive LLAP together with sparklyr and it has been working fine. Today we upgraded to HDP 2.6.5, and since then we can't connect to Hive through sparklyr. SparkR works fine with Hive LLAP. The easiest way to see the problem is to list all databases in Hive: after the upgrade, we only see the "default" database, and listing its tables returns an empty list. There are no errors, stack traces, or anything else that points at the problem.

The code we are running, which worked before the upgrade to HDP 2.6.5, is the following:

Sys.setenv(HADOOP_CONF_DIR = "/etc/hadoop/conf")
Sys.setenv(HIVE_CONF_DIR = "/etc/hive/conf")
.config <- spark_config()
.config <- c(.config, list("spark.executor.memory" = "2688M"))
sc <- spark_connect(master = "yarn-client",
                    app_name = "sparklyr-test",
                    config = .config)
DBI::dbGetQuery(sc, "show databases")
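A quick way to narrow a symptom like this down is to ask the Spark session which catalog it is actually using. If it reports "in-memory" instead of "hive", Spark never picked up hive-site.xml, which would explain why only the default database (with no tables) is visible. A minimal diagnostic sketch, assuming an already-open sparklyr connection `sc` (not part of the original post):

```r
# Check which catalog implementation the Spark session is using.
# "in-memory" means Spark is not talking to the Hive metastore at all;
# "hive" means hive-site.xml was picked up correctly.
library(sparklyr)
session <- spark_session(sc)
conf <- invoke(session, "conf")
invoke(conf, "get", "spark.sql.catalogImplementation")
```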

Does anybody have any information that could help us solve this problem?


@Berry Österlund Sounds like authorization may be getting in your way after the upgrade. Is Hive LLAP configured with Ranger?

@Berry Österlund please let me know if you have any updates on this one.


Yes, authorization is handled by Ranger.

Everything else in the cluster is working fine. Ranger with Spark + LLAP works fine. Zeppelin with R/Python + Spark + Livy + LLAP + Ranger works fine. The only thing not working after the upgrade is this sparklyr problem, so I don't think it is related to authorization.
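Since Spark only shows the default database when it falls back to its in-memory catalog, one hedged thing to try (a standard Spark 2.x setting, not something confirmed in this thread) is to request the Hive catalog explicitly in the sparklyr config:

```r
# Sketch: force the Hive catalog when connecting, assuming hive-site.xml
# is readable via HIVE_CONF_DIR as in the original snippet.
library(sparklyr)
config <- spark_config()
config$spark.sql.catalogImplementation <- "hive"
sc <- spark_connect(master = "yarn-client",
                    app_name = "sparklyr-hive-check",
                    config = config)
DBI::dbGetQuery(sc, "SHOW DATABASES")
```

If this makes the other databases appear, the upgrade likely changed which configuration directory Spark reads hive-site.xml from.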


@Berry Österlund I think you should check the Ranger Admin UI > Audit page for the user's access entries and see which policy is granting permission only to the default database.

@Berry Österlund

Did you figure this out?
