Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

malformed ORC file format

Here is my Sqoop command:

sqoop job -Dmapreduce.job.user.classpath.first=true --create incjob2  -- import --connect "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(host=patronQA)(port=1526))(connect_data=(service_name=patron)))" --username PATRON  --incremental append --check-column INSERT_TIME --table PATRON.UFM -split-by UFM.UFMID  --hcatalog-storage-stanza "stored as orcfile" --compression-codec snappy  --target-dir /user/sami

Here is my CREATE EXTERNAL TABLE command:

CREATE EXTERNAL TABLE IF NOT EXISTS ufm_orc (
..
..
 )
STORED AS ORC location '/user/sami'

Here is the error. As you can see, both the table's input and output formats are ORC:

SerDe Library:          org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed:             No
Num Buckets:            -1
Bucket Columns:         []
Sort Columns:           []
Storage Desc Params:
        serialization.format    1
Time taken: 0.495 seconds, Fetched: 217 row(s)

    > select ufmid,insert_time from ufm_orc limit 10;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.io.FileFormatException: Malformed ORC file hdfs://hadoop1.tolls.dot.state.fl.us:8020/user/sami/part-m-00000.snappy. Invalid postscript.
Time taken: 0.328 seconds
1 ACCEPTED SOLUTION

@Sami Ahmad

The Sqoop output is generating an ORC Snappy file, and the Hive table you have created is an ORC table without any compression.

Create the table with Snappy as the compression type:

CREATE TABLE mytable (...) STORED AS orc tblproperties ("orc.compress"="SNAPPY");
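
Because the error complains about an invalid postscript, it may also be worth confirming that the part file Sqoop wrote is really ORC before recreating the table. ORC files begin with the three-byte magic string "ORC", so a quick check is to look at the first bytes of the file. The sketch below is only an illustration: has_orc_magic is a hypothetical helper name, not part of Hadoop or Hive, and the example in the comments assumes the hdfs client is on the PATH.

```shell
# Hypothetical helper (not part of Hadoop or Hive): reads a file on stdin
# and succeeds only if it starts with the 3-byte ORC magic string "ORC".
# tr strips NUL bytes so binary input does not trigger shell warnings.
has_orc_magic() {
    [ "$(head -c 3 | tr -d '\0')" = "ORC" ]
}

# Example use against the part file named in the error message
# (assumes the hdfs client is on the PATH):
#   hdfs dfs -cat /user/sami/part-m-00000.snappy | has_orc_magic \
#       && echo "starts with the ORC magic" \
#       || echo "does not look like an ORC file"
```

If the check fails, the file is most likely not ORC at all, and in that case changing the table's compression property alone would probably not make it readable.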
