<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Set hive parameter in sparksql? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt; - Below are some sections from working PySpark code.

Notice how I set SparkConf with specific settings, and then later in the code I execute Hive statements.
In those Hive statements you could do:

sql = "set mapred.input.dir.recursive=true"&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;&lt;P&gt;
Here is my SparkConf:
&lt;/P&gt;&lt;P&gt;conf = (SparkConf()&lt;/P&gt;&lt;P&gt;        .setAppName("ucs_data_profiling")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.instances", "50")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.cores", 4)&lt;/P&gt;&lt;P&gt;        .set("spark.driver.memory", "2g")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.memory", "6g")&lt;/P&gt;&lt;P&gt;        .set("spark.dynamicAllocation.enabled", "false")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.service.enabled", "true")&lt;/P&gt;&lt;P&gt;        .set("spark.io.compression.codec", "snappy")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.compress", "true"))&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;/P&gt;&lt;P&gt;sqlContext = HiveContext(sc)&lt;/P&gt;&lt;P&gt;## the rest of the code parses files and converts them to a SchemaRDD&lt;/P&gt;&lt;P&gt;## lines of code etc........&lt;/P&gt;&lt;P&gt;## lines of code etc........

&lt;/P&gt;&lt;P&gt;## here I set some Hive properties before I load my data into a Hive table
## I have more HiveQL statements; I show just one here to demonstrate that this works&lt;/P&gt;&lt;P&gt;sql = """&lt;/P&gt;&lt;P&gt;set hive.exec.dynamic.partition.mode=nonstrict&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;</description>
    <pubDate>Thu, 26 May 2016 23:07:33 GMT</pubDate>
    <dc:creator>bmathew</dc:creator>
    <dc:date>2016-05-26T23:07:33Z</dc:date>
    <item>
      <title>Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103310#M29851</link>
      <description>&lt;P&gt;How do I set parameters for Hive in a SparkSQL context? For example, I have a Hive table that I want to query from SparkSQL, and I want to set the following parameter:&lt;/P&gt;&lt;P&gt;mapred.input.dir.recursive=true&lt;/P&gt;&lt;P&gt;to read all directories recursively. How do I set this in the Spark context?&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 22:03:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103310#M29851</guid>
      <dc:creator>sunile_manjee</dc:creator>
      <dc:date>2016-05-26T22:03:14Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103311#M29852</link>
      <description>&lt;P&gt;Try setting it on the SparkContext as below. This works for file loads, and I believe it should work for Hive table loads as well.&lt;/P&gt;&lt;PRE&gt;sc.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive","true")&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 May 2016 22:07:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103311#M29852</guid>
      <dc:creator>ravi1</dc:creator>
      <dc:date>2016-05-26T22:07:58Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103312#M29853</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt;&lt;P&gt;Can you please try this?&lt;/P&gt;&lt;PRE&gt;sqlContext.setConf("mapred.input.dir.recursive","true")&lt;/PRE&gt;&lt;P&gt;OR&lt;/P&gt;&lt;PRE&gt;sqlContext.setConf("mapreduce.input.fileinputformat.input.dir.recursive","true")&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 May 2016 22:10:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103312#M29853</guid>
      <dc:creator>jyadav</dc:creator>
      <dc:date>2016-05-26T22:10:38Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1486/smanjee.html" nodeid="1486"&gt;@Sunile Manjee&lt;/A&gt; - Below are some sections from working PySpark code.

Notice how I set SparkConf with specific settings, and then later in the code I execute Hive statements.
In those Hive statements you could do:

sql = "set mapred.input.dir.recursive=true"&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;&lt;P&gt;
Here is my SparkConf:
&lt;/P&gt;&lt;P&gt;conf = (SparkConf()&lt;/P&gt;&lt;P&gt;        .setAppName("ucs_data_profiling")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.instances", "50")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.cores", 4)&lt;/P&gt;&lt;P&gt;        .set("spark.driver.memory", "2g")&lt;/P&gt;&lt;P&gt;        .set("spark.executor.memory", "6g")&lt;/P&gt;&lt;P&gt;        .set("spark.dynamicAllocation.enabled", "false")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.service.enabled", "true")&lt;/P&gt;&lt;P&gt;        .set("spark.io.compression.codec", "snappy")&lt;/P&gt;&lt;P&gt;        .set("spark.shuffle.compress", "true"))&lt;/P&gt;&lt;P&gt;sc = SparkContext(conf=conf)&lt;/P&gt;&lt;P&gt;sqlContext = HiveContext(sc)&lt;/P&gt;&lt;P&gt;## the rest of the code parses files and converts them to a SchemaRDD&lt;/P&gt;&lt;P&gt;## lines of code etc........&lt;/P&gt;&lt;P&gt;## lines of code etc........

&lt;/P&gt;&lt;P&gt;## here I set some Hive properties before I load my data into a Hive table
## I have more HiveQL statements; I show just one here to demonstrate that this works&lt;/P&gt;&lt;P&gt;sql = """&lt;/P&gt;&lt;P&gt;set hive.exec.dynamic.partition.mode=nonstrict&lt;/P&gt;&lt;P&gt;"""&lt;/P&gt;&lt;P&gt;sqlContext.sql(sql)&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2016 23:07:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103313#M29854</guid>
      <dc:creator>bmathew</dc:creator>
      <dc:date>2016-05-26T23:07:33Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103314#M29855</link>
      <description>&lt;P&gt;I'm still facing the issue. Can anyone help?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Dec 2017 17:25:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103314#M29855</guid>
      <dc:creator>raghavcomp3</dc:creator>
      <dc:date>2017-12-04T17:25:54Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103315#M29856</link>
      <description>&lt;P&gt;I am also facing the same issue.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jan 2018 16:06:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103315#M29856</guid>
      <dc:creator>dhavalmodi24</dc:creator>
      <dc:date>2018-01-03T16:06:59Z</dc:date>
    </item>
    <item>
      <title>Re: Set hive parameter in sparksql?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103316#M29857</link>
      <description>&lt;P&gt;&lt;A href="https://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration" target="_blank"&gt;https://spark.apache.org/docs/latest/configuration.html#custom-hadoophive-configuration&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Use the spark.hadoop.* prefix on the Spark config key (e.g. spark.hadoop.mapred.input.dir.recursive).&lt;/P&gt;</description>
      <pubDate>Thu, 05 Jul 2018 07:01:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Set-hive-parameter-in-sparksql/m-p/103316#M29857</guid>
      <dc:creator>hrushikesh_iitb</dc:creator>
      <dc:date>2018-07-05T07:01:50Z</dc:date>
    </item>
  </channel>
</rss>
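The answers in this thread amount to three ways of passing a Hive/Hadoop property through to SparkSQL. A minimal PySpark sketch combining them, using the Spark 1.x-era HiveContext API that the thread itself uses (the app name is illustrative, and note that from PySpark the Hadoop configuration is reached via `sc._jsc.hadoopConfiguration()` rather than the Scala-side `sc.hadoopConfiguration` shown above); running it requires a Spark installation with Hive support:

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        .setAppName("set_hive_params_demo")  # illustrative name
        # Option 1: prefix any Hadoop/Hive property with "spark.hadoop."
        # so Spark forwards it to the underlying Hadoop configuration.
        .set("spark.hadoop.mapred.input.dir.recursive", "true"))

sc = SparkContext(conf=conf)

# Option 2: set it directly on the SparkContext's Hadoop configuration.
sc._jsc.hadoopConfiguration().set(
    "mapreduce.input.fileinputformat.input.dir.recursive", "true")

sqlContext = HiveContext(sc)

# Option 3: issue a Hive SET statement, or use setConf, on the HiveContext.
sqlContext.sql("set hive.exec.dynamic.partition.mode=nonstrict")
sqlContext.setConf("mapred.input.dir.recursive", "true")
```

Options 1 and 2 apply the setting at the Hadoop layer (and so also affect plain file loads), while option 3 applies it per SQL session; for the recursive-directory question in this thread, any of the three should work.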

