Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant
Announcements
This board is archived and read-only for historical reference. To ask a new question, please post a new topic on the appropriate active board.

Sqoop : Compression not working for teradata manager

avatar
Expert Contributor

I am trying to import data from teradata to HDFS using both teradata manager and jdbc driver . Using jdbc driver it is working fine but for teradata manager it is not working as expected. I am not getting any error. Below is the sqoop commands.

Using JDBC Driver:

sqoop import --driver com.teradata.jdbc.TeraDriver --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test87 --compress -m 1

Output:

-rw-r--r-- 3 ***** hdfs 0 2016-09-15 13:45 /user/aps/test87/_SUCCESS

-rw-r--r-- 3 ***** hdfs 38 2016-09-15 13:45 /user/aps/test87/part-m-00000.gz

Using Teradata Manager :

sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test88 --compress -m 1

Output:

-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:46 /user/aps/test88/_SUCCESS

-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:46 /user/aps/test88/part-m-00000

For Teradata Manager output should be .gz file. Am I doing something wrong. Please help.

I am facing same issue for snappy, parquet, BZip2, avro . Please help asap.

1 ACCEPTED SOLUTION

avatar
Rising Star

@Arkaprova, Please use,

--compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

in the command, you will get the result in proper format.

View solution in original post

5 REPLIES 5

avatar
Rising Star

@Arkaprova, Please use,

--compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

in the command, you will get the result in proper format.

avatar
Expert Contributor

@Nitin Shelke

I have already checked with org.apache.hadoop.io.compress.SnappyCodec. This is not working for me.

Sqoop command:

sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test85 --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec -m 1

Output:

-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:39 /user/aps/test85/_SUCCESS

-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:39 /user/aps/test85/part-m-00000

Please help.

avatar
Rising Star

Add these configurations in command as well,

-D mapreduce.output.fileoutputformat.compress=true

-D mapreduce.output.fileoutputformat.compress.type=BLOCK

-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec

avatar
Expert Contributor

@Nitin Shelke This is working after adding this configuration. Thanks a lot.

avatar
Expert Contributor

Below Sqoop commands are working for me.

For Snappy:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test95 -m 1

For BZip2:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test96 -m 1 

For lzo:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test98 -m 1