
Config settings set in Hive (via Ambari) are applied in Beeline but not in Zeppelin/PySpark?



I set the hive.mapred.supports.subdirectories property to true in the Hive config in Ambari. When I go to Beeline and run a SELECT on a table whose data lives in HDFS subdirectories, it works as expected, but the same query run in Zeppelin with the Spark interpreter fails because Spark still sees hive.mapred.supports.subdirectories = false.

Why isn't the Spark (SQL) interpreter in Zeppelin picking up the Hive config set in Ambari?
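A quick way to see what Spark actually reads is to print the effective property from inside the session; a minimal sketch, assuming a spark-shell or Zeppelin Spark interpreter session built with Hive support:

```scala
// Print the Hive property as seen by this Spark session.
// A value of false means Spark did not pick up the setting made via Ambari.
sql("SET hive.mapred.supports.subdirectories").show(false)
```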

Any insights much appreciated,



@MPH Can you try using spark-shell and see if this is working?


@Kshitij Badani I've tried in the spark-shell and it also fails because hive.mapred.supports.subdirectories = false. So the question now is: why does Beeline pick up the config from Ambari, but not the spark-shell?


Could you try the following?

spark-sql --hiveconf hive.mapred.supports.subdirectories=true
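The same override can also be applied per session from inside the shell; a sketch, assuming a spark-shell session with Hive support (the setting does not persist across sessions):

```scala
// Session-scoped override of the Hive property; lost when the session ends.
sql("SET hive.mapred.supports.subdirectories=true")
```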


In Ambari, go to `Spark` -> `Configs` -> `Custom spark-hive-site-override` and add the property there so that Spark picks it up.
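The entry in `Custom spark-hive-site-override` is a plain key/value pair; a sketch of the expected setting, using the property from the question:

```
hive.mapred.supports.subdirectories=true
```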

After that, the spark-shell shows:

```
scala> sql("set hive.mapred.supports.subdirectories").show(false)
+-----------------------------------+-----+
|key                                |value|
+-----------------------------------+-----+
|hive.mapred.supports.subdirectories|true |
+-----------------------------------+-----+
```


When you connected via Beeline, did you connect to HiveServer2 or to the Spark Thrift Server?
