<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question malformed ORC file format in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/malformed-ORC-file-format/m-p/201955#M83756</link>
    <description>&lt;P&gt;Here is my Sqoop command:&lt;/P&gt;&lt;PRE&gt;sqoop job -Dmapreduce.job.user.classpath.first=true --create incjob2  -- import --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=patronQA)(port=1526))(connect_data=(service_name=patron)))" --username PATRON  --incremental append --check-column INSERT_TIME --table PATRON.UFM -split-by UFM.UFMID  --hcatalog-storage-stanza "stored as orcfile" --compression-codec snappy  --target-dir /user/sami
&lt;/PRE&gt;&lt;P&gt;Here is my CREATE EXTERNAL TABLE command:&lt;/P&gt;&lt;PRE&gt;CREATE EXTERNAL TABLE IF NOT EXISTS ufm_orc (
..
..
 )
STORED AS ORC location '/user/sami'
&lt;/PRE&gt;&lt;P&gt;Here is the error. As you can see, the table's input and output formats are both ORC:&lt;/P&gt;&lt;PRE&gt;SerDe Library:          org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.495 seconds, Fetched: 217 row(s)

    &amp;gt; select ufmid,insert_time from ufm_orc limit 10;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/sami/part-m-00000.snappy. Invalid postscript.
Time taken: 0.328 seconds
&lt;/PRE&gt;</description>
    <pubDate>Sat, 22 Sep 2018 08:11:15 GMT</pubDate>
    <dc:creator>aliyesami</dc:creator>
    <dc:date>2018-09-22T08:11:15Z</dc:date>
    <item>
      <title>malformed ORC file format</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/malformed-ORC-file-format/m-p/201955#M83756</link>
      <description>&lt;P&gt;Here is my Sqoop command:&lt;/P&gt;&lt;PRE&gt;sqoop job -Dmapreduce.job.user.classpath.first=true --create incjob2  -- import --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=patronQA)(port=1526))(connect_data=(service_name=patron)))" --username PATRON  --incremental append --check-column INSERT_TIME --table PATRON.UFM -split-by UFM.UFMID  --hcatalog-storage-stanza "stored as orcfile" --compression-codec snappy  --target-dir /user/sami
&lt;/PRE&gt;&lt;P&gt;Here is my CREATE EXTERNAL TABLE command:&lt;/P&gt;&lt;PRE&gt;CREATE EXTERNAL TABLE IF NOT EXISTS ufm_orc (
..
..
 )
STORED AS ORC location '/user/sami'
&lt;/PRE&gt;&lt;P&gt;Here is the error. As you can see, the table's input and output formats are both ORC:&lt;/P&gt;&lt;PRE&gt;SerDe Library:          org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.495 seconds, Fetched: 217 row(s)

    &amp;gt; select ufmid,insert_time from ufm_orc limit 10;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/sami/part-m-00000.snappy. Invalid postscript.
Time taken: 0.328 seconds
&lt;/PRE&gt;</description>
      <pubDate>Sat, 22 Sep 2018 08:11:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/malformed-ORC-file-format/m-p/201955#M83756</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-22T08:11:15Z</dc:date>
    </item>
    <item>
      <title>Re: malformed ORC file format</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/malformed-ORC-file-format/m-p/201956#M83757</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/10115/sahmad43.html" nodeid="10115"&gt;@Sami Ahmad&lt;/A&gt;&lt;P&gt;The Sqoop job is writing a Snappy-compressed ORC file, but the Hive table you created is an ORC table without any compression.&lt;/P&gt;&lt;P&gt;Create the table with the compression type set to Snappy:&lt;/P&gt;&lt;PRE&gt;CREATE TABLE mytable (...) STORED AS orc tblproperties ("orc.compress"="SNAPPY");
&lt;/PRE&gt;</description>
      <pubDate>Sat, 22 Sep 2018 12:28:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/malformed-ORC-file-format/m-p/201956#M83757</guid>
      <dc:creator>sandyy006</dc:creator>
      <dc:date>2018-09-22T12:28:04Z</dc:date>
    </item>
  </channel>
</rss>

