Created 09-15-2016 08:28 AM
I am trying to import data from Teradata to HDFS using both the Teradata connection manager and the plain JDBC driver. With the JDBC driver it works fine, but with the Teradata connection manager it does not work as expected, and I am not getting any error. Below are the Sqoop commands.
Using JDBC Driver:
sqoop import --driver com.teradata.jdbc.TeraDriver --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test87 --compress -m 1
Output:
-rw-r--r-- 3 ***** hdfs 0 2016-09-15 13:45 /user/aps/test87/_SUCCESS
-rw-r--r-- 3 ***** hdfs 38 2016-09-15 13:45 /user/aps/test87/part-m-00000.gz
Using Teradata Manager :
sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test88 --compress -m 1
Output:
-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:46 /user/aps/test88/_SUCCESS
-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:46 /user/aps/test88/part-m-00000
With the Teradata connection manager the output should also be a .gz file. Am I doing something wrong? Please help.
I am facing the same issue with Snappy, Parquet, BZip2, and Avro. Please help as soon as possible.
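A quick way to confirm whether an import really produced compressed output is to inspect the file's magic bytes rather than trusting the extension. A minimal sketch (the local path is hypothetical, standing in for a part file pulled down with `hdfs dfs -get`):

```shell
# Sketch only: /tmp/part-m-00000.gz here is a stand-in for a real part file
# fetched from HDFS. Gzip files start with the magic bytes 0x1f 0x8b, so
# checking the first two bytes tells you whether the output is actually
# gzip-compressed, regardless of the file name.
printf '\037\213\010' > /tmp/part-m-00000.gz   # fake gzip header for the demo
magic=$(head -c 2 /tmp/part-m-00000.gz | od -An -tx1 | tr -d ' ')
if [ "$magic" = "1f8b" ]; then
  echo "looks gzip-compressed"
else
  echo "NOT gzip-compressed"
fi
```

The same idea works for other codecs (for example, BZip2 output starts with the ASCII bytes `BZh`).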
Created 09-15-2016 09:54 AM
@Arkaprova, please add
--compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec
to the command; you should then get the result in the proper format.
Created 09-15-2016 09:59 AM
I have already tried org.apache.hadoop.io.compress.SnappyCodec; it is not working for me.
Sqoop command:
sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test85 --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec -m 1
Output:
-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:39 /user/aps/test85/_SUCCESS
-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:39 /user/aps/test85/part-m-00000
Please help.
Created 09-15-2016 10:07 AM
Add these configuration properties to the command as well:
-D mapreduce.output.fileoutputformat.compress=true
-D mapreduce.output.fileoutputformat.compress.type=BLOCK
-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
Created 09-15-2016 10:54 AM
@Nitin Shelke It is working after adding these configuration properties. Thanks a lot.
Created 09-15-2016 11:12 AM
The Sqoop commands below are working for me.
For Snappy:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test95 -m 1
For BZip2:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test96 -m 1
For LZO:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test98 -m 1
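One detail worth noting in the working commands above: the -D properties are generic Hadoop arguments, so they must come immediately after `import`, before connector-specific options such as --connection-manager or --connect. For completeness, the original goal (gzip output) follows the same pattern; this is a template only, with the masked host, database, and credentials replaced by placeholders:

```shell
# Template, not a verified command: <host>, <db>, <user>, <password> are
# placeholders for the values masked in the thread. The -D options must
# precede the tool-specific flags or Sqoop will not pick them up.
sqoop import \
  -Dmapreduce.output.fileoutputformat.compress=true \
  -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
  --connection-manager org.apache.sqoop.teradata.TeradataConnManager \
  --connect jdbc:teradata://<host>/DATABASE=<db> \
  --username <user> --password <password> \
  --table mytable \
  --target-dir /user/aps/gzip_out \
  -m 1
```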