Support Questions

Find answers, ask questions, and share your expertise

Sqoop: Compression not working with the Teradata connection manager

Expert Contributor

I am trying to import data from Teradata to HDFS using both the Teradata connection manager and the JDBC driver. With the JDBC driver it works fine, but with the Teradata connection manager compression is not applied as expected, and I am not getting any error. Below are the Sqoop commands.

Using JDBC Driver:

sqoop import --driver com.teradata.jdbc.TeraDriver --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test87 --compress -m 1

Output:

-rw-r--r-- 3 ***** hdfs 0 2016-09-15 13:45 /user/aps/test87/_SUCCESS

-rw-r--r-- 3 ***** hdfs 38 2016-09-15 13:45 /user/aps/test87/part-m-00000.gz

Using Teradata Manager :

sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test88 --compress -m 1

Output:

-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:46 /user/aps/test88/_SUCCESS

-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:46 /user/aps/test88/part-m-00000

With the Teradata connection manager the output should also be a .gz file. Am I doing something wrong? Please help.

I am facing the same issue with Snappy, Parquet, BZip2, and Avro. Please help ASAP.

1 ACCEPTED SOLUTION

Rising Star

@Arkaprova, please use

--compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec

in the command; you will then get the output in the expected compressed format.


5 REPLIES


Expert Contributor

@Nitin Shelke

I have already tried org.apache.hadoop.io.compress.SnappyCodec; it is not working for me.

Sqoop command:

sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test85 --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec -m 1

Output:

-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:39 /user/aps/test85/_SUCCESS

-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:39 /user/aps/test85/part-m-00000

Please help.

Rising Star

Add these configurations to the command as well (generic -D options must be placed immediately after sqoop import, before the tool-specific arguments):

-D mapreduce.output.fileoutputformat.compress=true

-D mapreduce.output.fileoutputformat.compress.type=BLOCK

-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
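Putting the properties above together with the earlier command, a full invocation might look like the sketch below. Host, database, and credentials are masked placeholders as elsewhere in the thread; the compress.type=BLOCK property is the one optional addition beyond what the later working commands use.

```shell
# Sketch only -- replace the masked placeholders with real values.
# The -D properties must come immediately after "sqoop import".
sqoop import \
  -Dmapreduce.output.fileoutputformat.compress=true \
  -Dmapreduce.output.fileoutputformat.compress.type=BLOCK \
  -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --connect jdbc:teradata://**.***.***.***/DATABASE=****** \
  --username ****** --password **** \
  --table mytable \
  --target-dir /user/aps/test88 \
  -m 1
```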

Expert Contributor

@Nitin Shelke It is working after adding these configurations. Thanks a lot.

Expert Contributor

The Sqoop commands below are working for me.

For Snappy:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test95 -m 1

For BZip2:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test96 -m 1 

For LZO:

sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test98 -m 1
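Since the Teradata connection manager writes part files without a codec suffix (part-m-00000 rather than part-m-00000.gz), it can be hard to tell whether compression was actually applied. A minimal sketch, assuming a POSIX shell with head, od, and gzip available, that guesses the codec from a file's magic bytes:

```shell
# Sketch: guess the compression codec of a Sqoop output file from its
# leading magic bytes. Raw Snappy output has no reliable magic number,
# so it falls through to "unknown" here -- an assumption, not a rule.
detect_codec() {
  magic=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
  case "$magic" in
    1f8b*)  echo "gzip" ;;                                  # 1f 8b = gzip
    425a68) echo "bzip2" ;;                                 # "BZh" = bzip2
    *)      echo "unknown (possibly snappy or uncompressed)" ;;
  esac
}

# Usage example with a locally created gzip file:
printf 'hello from sqoop' | gzip -c > /tmp/part-m-00000
detect_codec /tmp/part-m-00000   # prints: gzip
```

Running detect_codec against the 18-byte part file from the failing import should print "unknown", confirming the codec property was ignored rather than silently applied.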