<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4 in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/305609#M222519</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We use &lt;FONT size="2"&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/FONT&gt; in our PySpark scripts on a CDH 6.3.2 cluster running Spark 2.4.0. When we try the same setting on CDP 7.1.4, also running Spark 2.4.0, it does not work. Is there any way to make &lt;FONT size="2"&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/FONT&gt; work in CDP, or is there an alternative?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To be clear, both our CDP 7.1.4 and CDH 6.3.2 clusters run the same Spark version, 2.4.0.&lt;/P&gt;</description>
    <pubDate>Mon, 09 Nov 2020 08:03:44 GMT</pubDate>
    <dc:creator>Mondi</dc:creator>
    <dc:date>2020-11-09T08:03:44Z</dc:date>
    <item>
      <title>spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/305609#M222519</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We use &lt;FONT size="2"&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/FONT&gt; in our PySpark scripts on a CDH 6.3.2 cluster running Spark 2.4.0. When we try the same setting on CDP 7.1.4, also running Spark 2.4.0, it does not work. Is there any way to make &lt;FONT size="2"&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/FONT&gt; work in CDP, or is there an alternative?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;To be clear, both our CDP 7.1.4 and CDH 6.3.2 clusters run the same Spark version, 2.4.0.&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 08:03:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/305609#M222519</guid>
      <dc:creator>Mondi</dc:creator>
      <dc:date>2020-11-09T08:03:44Z</dc:date>
    </item>
    <item>
      <title>Re: spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/306414#M222834</link>
      <description>&lt;P&gt;I'm seeing the same issue whether I set this config in spark-defaults.conf via Cloudera Manager for CDP 7.1.4 or inline in my write call with .option(&lt;SPAN&gt;"partitionOverwriteMode", "dynamic"&lt;/SPAN&gt;).&lt;/P&gt;&lt;P&gt;The error message is:&lt;/P&gt;&lt;PRE&gt;java.io.IOException: PathOutputCommitProtocol does not support dynamicPartitionOverwrite&lt;/PRE&gt;</description>
      <pubDate>Tue, 24 Nov 2020 17:18:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/306414#M222834</guid>
      <dc:creator>LoganATX</dc:creator>
      <dc:date>2020-11-24T17:18:05Z</dc:date>
    </item>
    <item>
      <title>Re: spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/306422#M222838</link>
      <description>&lt;P&gt;I was able to fix this on our CDP 7.1.4 cluster today by disabling&lt;BR /&gt;Enable Optimized S3 Committers - spark.cloudera.s3_committers.enabled&lt;/P&gt;&lt;P&gt;in the Spark Service Configuration.&lt;BR /&gt;This works for me because we use HDFS on premises. If you are using S3, my guess is that this committer is put in place because of S3 eventual consistency issues.&lt;/P&gt;&lt;P&gt;I then also added the &lt;STRONG&gt;&lt;SPAN&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/SPAN&gt;&lt;/STRONG&gt; setting to spark-defaults.conf, also in the Spark Service Configuration, via the Safety Valve settings.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Nov 2020 21:39:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/306422#M222838</guid>
      <dc:creator>LoganATX</dc:creator>
      <dc:date>2020-11-24T21:39:35Z</dc:date>
    </item>
    <item>
      <title>Re: spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/332324#M231133</link>
      <description>&lt;P&gt;This also works for me on CDP 7.1.7.&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Wed, 15 Dec 2021 11:24:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/332324#M231133</guid>
      <dc:creator>Enri</dc:creator>
      <dc:date>2021-12-15T11:24:22Z</dc:date>
    </item>
    <item>
      <title>Re: spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/343483#M233945</link>
      <description>&lt;P&gt;I was getting the same error after a Cloudera upgrade when using insert overwrite with the config &lt;STRONG&gt;&lt;SPAN&gt;spark.sql.sources.partitionOverwriteMode=dynamic&lt;/SPAN&gt;&lt;/STRONG&gt;. &lt;SPAN&gt;For me, the config property below resolved the issue.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;"spark.sql.sources.commitProtocolClass=org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol"&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;After the upgrade, the default value was "spark.sql.sources.commitProtocolClass=org.apache.spark.internal.io.cloud.PathOutputCommitProtocol", which was causing the issue.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 09 May 2022 16:56:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/343483#M233945</guid>
      <dc:creator>Mukund023</dc:creator>
      <dc:date>2022-05-09T16:56:34Z</dc:date>
    </item>
    <item>
      <title>Re: spark.sql.sources.partitionOverwriteMode=dynamic" not working in CDP 7.1.4</title>
      <link>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/348807#M235455</link>
      <description>&lt;P&gt;Hi Team,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;CDP uses the "&lt;STRONG&gt;org.apache.spark.internal.io.cloud.PathOutputCommitProtocol&lt;/STRONG&gt;" commit protocol, which &lt;STRONG&gt;does not support dynamicPartitionOverwrite&lt;/STRONG&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You can set the following parameters in your Spark job.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Code level:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
spark.conf.set("spark.sql.parquet.output.committer.class", "org.apache.parquet.hadoop.ParquetOutputCommitter")
spark.conf.set("spark.sql.sources.commitProtocolClass", "org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol")&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;spark-submit/spark-shell:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;--conf spark.sql.sources.partitionOverwriteMode=dynamic&lt;BR /&gt;--conf spark.sql.parquet.output.committer.class=org.apache.parquet.hadoop.ParquetOutputCommitter&lt;BR /&gt;--conf spark.sql.sources.commitProtocolClass=org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; If you are using S3, you can disable the optimized S3 committers by setting the &lt;STRONG&gt;spark.cloudera.s3_committers.enabled&lt;/STRONG&gt; parameter to false.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;--conf spark.cloudera.s3_committers.enabled=false&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Jul 2022 11:24:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/spark-sql-sources-partitionOverwriteMode-dynamic-quot-not/m-p/348807#M235455</guid>
      <dc:creator>RangaReddy</dc:creator>
      <dc:date>2022-07-26T11:24:06Z</dc:date>
    </item>
  </channel>
</rss>

