<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown. in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156141#M20774</link>
    <description>&lt;P&gt;Why not use ORC? What's the use case that requires Parquet?&lt;/P&gt;</description>
    <pubDate>Wed, 24 Feb 2016 20:06:18 GMT</pubDate>
    <dc:creator>aervits</dc:creator>
    <dc:date>2016-02-24T20:06:18Z</dc:date>
    <item>
      <title>when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156140#M20773</link>
      <description>&lt;P&gt;While inserting from Hive external table P1, stored as Parquet and partitioned on one column (e.g. col A), into another table P2, stored as Parquet with the same columns as P1 but partitioned on a different column (e.g. col B), Hive throws a Premature EOF exception.&lt;/P&gt;&lt;P&gt;exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available.&lt;/P&gt;&lt;P&gt;Any idea what causes this issue?&lt;/P&gt;&lt;P&gt;HDP 2.3 cluster with 4 datanodes. The process is running with sufficient map memory and container size.&lt;/P&gt;&lt;P&gt;I have tried running with Tez as well as MapReduce, but get the same error.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Harshal&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 19:49:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156140#M20773</guid>
      <dc:creator>harshaldj</dc:creator>
      <dc:date>2016-02-24T19:49:09Z</dc:date>
    </item>
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156141#M20774</link>
      <description>&lt;P&gt;Why not use ORC? What's the use case that requires Parquet?&lt;/P&gt;</description>
      <pubDate>Wed, 24 Feb 2016 20:06:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156141#M20774</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-24T20:06:18Z</dc:date>
    </item>
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156142#M20775</link>
      <description>&lt;P&gt;So you mean this has something to do with Parquet? Parquet has good integration with Spark.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 13:26:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156142#M20775</guid>
      <dc:creator>harshaldj</dc:creator>
      <dc:date>2016-02-25T13:26:40Z</dc:date>
    </item>
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156143#M20776</link>
      <description>&lt;P&gt;This might be a Parquet problem, but it could also be something else. I have seen some performance and job issues when using Parquet instead of ORC. Have you seen this: &lt;A href="https://issues.apache.org/jira/browse/HDFS-8475"&gt;https://issues.apache.org/jira/browse/HDFS-8475&lt;/A&gt;?&lt;/P&gt;&lt;P&gt;What features are you missing regarding Spark with ORC?&lt;/P&gt;&lt;P&gt;I have seen your error before, but in a different context (a query on an ORC table was failing).&lt;/P&gt;&lt;P&gt;Make sure your HDFS services (especially the DataNodes) are running and healthy. It might be related to some bad blocks, so make sure the blocks related to your job are OK.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Feb 2016 14:28:23 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156143#M20776</guid>
      <dc:creator>jstraub</dc:creator>
      <dc:date>2016-02-25T14:28:23Z</dc:date>
    </item>
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156144#M20777</link>
      <description>&lt;P&gt;Thanks for your reply, Jonas!&lt;/P&gt;&lt;P&gt;I have verified datanode health and it is all fine; there are no corrupt blocks across the filesystem. I will check by changing the format to ORC.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Feb 2016 13:42:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156144#M20777</guid>
      <dc:creator>harshaldj</dc:creator>
      <dc:date>2016-02-26T13:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156145#M20778</link>
      <description>&lt;P&gt;Same exception for an ORC Hive table as well.&lt;/P&gt;&lt;P&gt;Looks like this is a generic issue for the case below:&lt;/P&gt;&lt;P&gt;1. Create external table T1 (cols A, B, C) partitioned on col A, stored as ORC. Load the table with substantial data; in my case around 85 GB.&lt;/P&gt;&lt;P&gt;2. Create external table T2 (cols A, B, C) partitioned on col B, stored as ORC. Load T2 from T1 with dynamic partitioning.&lt;/P&gt;&lt;P&gt;Output: Premature EOF exception.&lt;/P&gt;&lt;P&gt;Please try it out!&lt;/P&gt;</description>
      <pubDate>Fri, 26 Feb 2016 15:06:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156145#M20778</guid>
      <dc:creator>harshaldj</dc:creator>
      <dc:date>2016-02-26T15:06:46Z</dc:date>
    </item>
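    The repro steps described in the post above can be sketched in HiveQL. Table names, column types, and locations here are illustrative assumptions, not values taken from the thread:

    ```sql
    -- T1 is partitioned on col_a; T2 has the same logical columns but is
    -- partitioned on col_b instead (names and types are hypothetical).
    CREATE EXTERNAL TABLE t1 (col_b STRING, col_c STRING)
    PARTITIONED BY (col_a STRING)
    STORED AS ORC
    LOCATION '/data/t1';

    CREATE EXTERNAL TABLE t2 (col_a STRING, col_c STRING)
    PARTITIONED BY (col_b STRING)
    STORED AS ORC
    LOCATION '/data/t2';

    -- Dynamic-partition insert that repartitions T1's data on a different
    -- column; at ~85 GB this is where the Premature EOF was reported.
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    INSERT OVERWRITE TABLE t2 PARTITION (col_b)
    SELECT col_a, col_c, col_b FROM t1;
    ```

    Repartitioning on a different column means each writer task may open files for many target partitions at once, multiplying concurrent HDFS write streams, which is consistent with the DataNode transfer-thread limit identified later in the thread.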
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156146#M20779</link>
      <description>&lt;P&gt;In the case of Hive on Tez, decreasing tez.grouping.max-size might help. I faced almost the same problem before; after I decreased tez.grouping.max-size from 1 GB to 256 MB, the problem was mostly (though not completely) resolved.&lt;/P&gt;</description>
      <pubDate>Mon, 07 Nov 2016 16:22:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156146#M20779</guid>
      <dc:creator>takefumi_ohide</dc:creator>
      <dc:date>2016-11-07T16:22:21Z</dc:date>
    </item>
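    The suggestion above can be applied per Hive session; the 256 MB figure is the value the poster reported, expressed in bytes:

    ```sql
    -- Shrink Tez input split grouping from the 1 GB the poster described
    -- to 256 MB (value is in bytes: 256 * 1024 * 1024 = 268435456),
    -- so each task processes, and writes, less data at a time.
    SET tez.grouping.max-size=268435456;
    ```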
    <item>
      <title>Re: when inserting data from hive parquet table with partition to another parquet table with partition, exception : hdfs.DFSClient: Exception in createBlockOutputStream java.io.EOFException: Premature EOF: no length prefix available, is thrown.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156147#M20780</link>
      <description>&lt;P&gt;Thanks for the reply!&lt;/P&gt;&lt;P&gt;The issue was resolved by increasing dfs.datanode.max.transfer.threads (to 16000 in my case) and by increasing the ulimit value on each worker node.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Harshal&lt;/P&gt;</description>
      <pubDate>Tue, 08 Nov 2016 14:53:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/when-inserting-data-from-hive-parquet-table-with-partition/m-p/156147#M20780</guid>
      <dc:creator>harshaldj</dc:creator>
      <dc:date>2016-11-08T14:53:00Z</dc:date>
    </item>
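    The fix in the final post is a DataNode-side setting in hdfs-site.xml; 16000 is the poster's value, and (as an assumption about how HDFS daemon settings take effect) the DataNodes would need a restart afterwards:

    ```xml
    <property>
      <name>dfs.datanode.max.transfer.threads</name>
      <value>16000</value>
      <description>Cap on concurrent send/receive streams per DataNode;
        raising it from the default resolved the Premature EOF here.</description>
    </property>
    ```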
  </channel>
</rss>

