<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Set hive parameter in sparksql? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt; - Below are some sections from working PySpark code.

Notice how I set SparkConf with specific settings, and then later in the code I execute Hive statements.
In those Hive statements you could do:

sql = "set mapred.input.dir.recursive=true"&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;&lt;P&gt;
Here is my SparkConf:
&lt;/P&gt;&lt;P&gt;conf = (SparkConf()&lt;/P&gt;&lt;P&gt;        .setAppName("ucs_data_profiling")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.instances", "50")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.cores", 4)&lt;/P&gt;&lt;P&gt;        .set("spark.driver.memory", "2g")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.memory", "6g")&lt;/P&gt;&lt;P&gt;        .set("spark.dynamicAllocation.enabled", "false")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.service.enabled", "true")&lt;/P&gt;&lt;P&gt;        .set("spark.io.compression.codec", "snappy")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.compress", "true"))&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;/P&gt;&lt;P&gt;sqlContext = HiveContext(sc)&lt;/P&gt;&lt;P&gt;## the rest of the code parses files and converts them to a SchemaRDD&lt;/P&gt;&lt;P&gt;## lines of code etc........&lt;/P&gt;&lt;P&gt;## lines of code etc........

&lt;/P&gt;&lt;P&gt;## here I set some Hive properties before I load my data into a Hive table
## I have more HiveQL statements; I show just one here to demonstrate that this works&lt;/P&gt;&lt;P&gt;sql = """&lt;/P&gt;&lt;P&gt;set hive.exec.dynamic.partition.mode=nonstrict&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;</description>
    <pubDate>Thu, 26 May 2016 23:07:33 GMT</pubDate>
    <dc:creator>bmathew</dc:creator>
    <dc:date>2016-05-26T23:07:33Z</dc:date>
    <item>
      <title>Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103310#M29851</link>
      <description>&lt;P&gt;How do I set parameters for Hive in a SparkSQL context? For example, I have a Hive table that I want to query from SparkSQL, and I want to set the following parameter:&lt;/P&gt;&lt;P&gt;mapred.input.dir.recursive=true&lt;/P&gt;&lt;P&gt;to read all directories recursively. How do I set this in the Spark context?&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 22:03:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103310#M29851</guid>
      <dc:creator>sunile_manjee</dc:creator>
      <dc:date>2016-05-26T22:03:14Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103311#M29852</link>
      <description>&lt;P&gt;Try setting it on the SparkContext as below. This works for file loads, and I believe it should work for Hive table loads as well.&lt;/P&gt;&lt;PRE&gt;sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive","true")&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 May 2016 22:07:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103311#M29852</guid>
      <dc:creator>ravi1</dc:creator>
      <dc:date>2016-05-26T22:07:58Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103312#M29853</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;P&gt;Can you please try this?&lt;/P&gt;&lt;PRE&gt;sqlContext.setConf("mapred.input.dir.recursive","true")&lt;/PRE&gt;&lt;P&gt;OR&lt;/P&gt;&lt;PRE&gt;sqlContext.setConf("mapreduce.input.fileinputformat.input.dir.recursive","true")&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 May 2016 22:10:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103312#M29853</guid>
      <dc:creator>jyadav</dc:creator>
      <dc:date>2016-05-26T22:10:38Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt; - Below are some sections from working PySpark code.

Notice how I set SparkConf with specific settings, and then later in the code I execute Hive statements.
In those Hive statements you could do:

sql = "set mapred.input.dir.recursive=true"&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;&lt;P&gt;
Here is my SparkConf:
&lt;/P&gt;&lt;P&gt;conf = (SparkConf()&lt;/P&gt;&lt;P&gt;        .setAppName("ucs_data_profiling")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.instances", "50")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.cores", 4)&lt;/P&gt;&lt;P&gt;        .set("spark.driver.memory", "2g")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.memory", "6g")&lt;/P&gt;&lt;P&gt;        .set("spark.dynamicAllocation.enabled", "false")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.service.enabled", "true")&lt;/P&gt;&lt;P&gt;        .set("spark.io.compression.codec", "snappy")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.compress", "true"))&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;/P&gt;&lt;P&gt;sqlContext = HiveContext(sc)&lt;/P&gt;&lt;P&gt;## the rest of the code parses files and converts them to a SchemaRDD&lt;/P&gt;&lt;P&gt;## lines of code etc........&lt;/P&gt;&lt;P&gt;## lines of code etc........

&lt;/P&gt;&lt;P&gt;## here I set some Hive properties before I load my data into a Hive table
## I have more HiveQL statements; I show just one here to demonstrate that this works&lt;/P&gt;&lt;P&gt;sql = """&lt;/P&gt;&lt;P&gt;set hive.exec.dynamic.partition.mode=nonstrict&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 23:07:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</guid>
      <dc:creator>bmathew</dc:creator>
      <dc:date>2016-05-26T23:07:33Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103314#M29855</link>
      <description>&lt;P&gt;I'm still facing the issue. Can anyone help?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Dec 2017 17:25:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103314#M29855</guid>
      <dc:creator>raghavcomp3</dc:creator>
      <dc:date>2017-12-04T17:25:54Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103315#M29856</link>
      <description>&lt;P&gt;I am also facing the same issue.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jan 2018 16:06:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103315#M29856</guid>
      <dc:creator>dhavalmodi24</dc:creator>
      <dc:date>2018-01-03T16:06:59Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103316#M29857</link>
      <description>&lt;P&gt;&lt;A href="https://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration" target="_blank"&gt;https://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Use the spark.hadoop.* prefix on the Spark config key (e.g. spark.hadoop.mapred.input.dir.recursive).&lt;/P&gt;</description>
      <pubDate>Thu, 05 Jul 2018 07:01:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103316#M29857</guid>
      <dc:creator>hrushikesh_iitb</dc:creator>
      <dc:date>2018-07-05T07:01:50Z</dc:date>
    </item>
  </channel>
</rss>
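The answers in this thread amount to three ways of passing a Hive/Hadoop property through to SparkSQL. A minimal PySpark sketch combining them, using the Spark 1.x-era HiveContext API that the thread itself uses (the app name is illustrative, and note that from PySpark the Hadoop configuration is reached via `sc._jsc.hadoopConfiguration()` rather than the Scala-side `sc.hadoopConfiguration` shown above); running it requires a Spark installation with Hive support:

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        .setAppName("set_hive_params_demo")  # illustrative name
        # Option 1: prefix any Hadoop/Hive property with "spark.hadoop."
        # so Spark forwards it to the underlying Hadoop configuration.
        .set("spark.hadoop.mapred.input.dir.recursive", "true"))

sc = SparkContext(conf=conf)

# Option 2: set it directly on the SparkContext's Hadoop configuration.
sc._jsc.hadoopConfiguration().set(
    "mapreduce.input.fileinputformat.input.dir.recursive", "true")

sqlContext = HiveContext(sc)

# Option 3: issue a Hive SET statement, or use setConf, on the HiveContext.
sqlContext.sql("set hive.exec.dynamic.partition.mode=nonstrict")
sqlContext.setConf("mapred.input.dir.recursive", "true")
```

Options 1 and 2 apply the setting at the Hadoop layer (and so also affect plain file loads), while option 3 applies it per SQL session; for the recursive-directory question in this thread, any of the three should work.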

