When I run a PySpark command that accesses a Hive table, I have to explicitly set my Hive configuration properties first (e.g. mapreduce.input.fileinputformat.input.dir.recursive=true), otherwise the command fails. But I have already set these in hive-site.xml through Ambari.
It's as if Spark is reading an old version of the file?
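For reference, this is the kind of workaround I mean (a minimal sketch for Spark 1.x; the database and table names are placeholders):

```python
from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="hive-config-workaround")
sqlContext = HiveContext(sc)

# Without this explicit setConf, the query below fails, even though the
# same property is already set in hive-site.xml via Ambari.
sqlContext.setConf("mapreduce.input.fileinputformat.input.dir.recursive", "true")

# "my_db.my_table" is a placeholder for the Hive table I am querying.
df = sqlContext.sql("SELECT * FROM my_db.my_table LIMIT 10")
df.show()
```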
The file /etc/spark/conf/hive-site.xml only specifies the following:
Yet in the hive-site.xml configured through the Hive Ambari UI, I have many other properties defined. Why does Ambari not propagate those hive-site.xml settings to /etc/spark/conf/hive-site.xml, so that PySpark can pick them up when creating the sqlContext (i.e. the HiveContext)?
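In case it helps with diagnosis, this is how I have been checking which value Spark actually resolved (getConf is a standard SQLContext method; the property name is the one from my example above):

```python
# Prints the value the HiveContext resolved for the property, to confirm
# whether the Ambari-managed hive-site.xml was actually picked up.
print(sqlContext.getConf("mapreduce.input.fileinputformat.input.dir.recursive",
                         "<not set>"))
```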