<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: [Hive] table partitioned in parquet giving error that it stored in HiveFileFormat in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/355226#M237078</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100867"&gt;@ditmarh&lt;/a&gt;&amp;nbsp;saveAsTable with format("parquet") might not work when the table &lt;STRONG&gt;schema.table&lt;/STRONG&gt; was created from Hive and we are appending to it from Spark.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You may try the following command, which replaces saveAsTable with insertInto:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;df.write.mode("append").format("parquet").insertInto("schema.table")&lt;/LI-CODE&gt;</description>
    <pubDate>Tue, 18 Oct 2022 19:56:32 GMT</pubDate>
    <dc:creator>smruti</dc:creator>
    <dc:date>2022-10-18T19:56:32Z</dc:date>
    <item>
      <title>[Hive] table partitioned in parquet giving error that it stored in HiveFileFormat</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/353952#M236828</link>
      <description>&lt;P&gt;table is `HiveFileFormat`. It doesn't match the specified format `ParquetFileFormat`&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I get this error when I try to write from PySpark with the following command:&lt;/P&gt;&lt;P&gt;df.write.mode("append").format("parquet").saveAsTable("schema.table")&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Before you suggest changing the format from parquet to hive: I know that works. But the table is partitioned and stored as Parquet, and I really don't know why the write has stopped working. The same command has run correctly 5 times over the past month, but today it fails with this error.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I check the table metadata, it also shows everything as Parquet:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;SerDe Library:&lt;/TD&gt;&lt;TD&gt;org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;InputFormat:&lt;/TD&gt;&lt;TD&gt;org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;OutputFormat:&lt;/TD&gt;&lt;TD&gt;org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;</description>
      <pubDate>Tue, 04 Oct 2022 09:49:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/353952#M236828</guid>
      <dc:creator>ditmarh</dc:creator>
      <dc:date>2022-10-04T09:49:43Z</dc:date>
    </item>
    <item>
      <title>Re: [Hive] table partitioned in parquet giving error that it stored in HiveFileFormat</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/355226#M237078</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100867"&gt;@ditmarh&lt;/a&gt;&amp;nbsp;saveAsTable with format("parquet") might not work when the table &lt;STRONG&gt;schema.table&lt;/STRONG&gt; was created from Hive and we are appending to it from Spark.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You may try the following command, which replaces saveAsTable with insertInto:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;df.write.mode("append").format("parquet").insertInto("schema.table")&lt;/LI-CODE&gt;</description>
      <pubDate>Tue, 18 Oct 2022 19:56:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/355226#M237078</guid>
      <dc:creator>smruti</dc:creator>
      <dc:date>2022-10-18T19:56:32Z</dc:date>
    </item>
    <item>
      <title>Re: [Hive] table partitioned in parquet giving error that it stored in HiveFileFormat</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/364751#M239223</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/82698"&gt;@smruti&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the reply. I forgot to post this, but I also figured out that what you mentioned above is the actual problem: the table was created in Hive and, as a result, cannot be modified this way from a Spark instance.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Feb 2023 15:29:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Hive-table-partitioned-in-parquet-giving-error-that-it/m-p/364751#M239223</guid>
      <dc:creator>ditmarh</dc:creator>
      <dc:date>2023-02-27T15:29:42Z</dc:date>
    </item>
  </channel>
</rss>

