Sqoop: Compression not working with the Teradata connection manager
Labels: Apache Sqoop
Created ‎09-15-2016 08:28 AM
I am trying to import data from Teradata to HDFS using both the Teradata connection manager and the JDBC driver. With the JDBC driver it works fine, but with the Teradata connection manager compression is not applied as expected, and I do not get any error. Below are the Sqoop commands.
Using JDBC Driver:
sqoop import --driver com.teradata.jdbc.TeraDriver --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test87 --compress -m 1
Output:
-rw-r--r-- 3 ***** hdfs 0 2016-09-15 13:45 /user/aps/test87/_SUCCESS
-rw-r--r-- 3 ***** hdfs 38 2016-09-15 13:45 /user/aps/test87/part-m-00000.gz
Using the Teradata Connection Manager:
sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test88 --compress -m 1
Output:
-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:46 /user/aps/test88/_SUCCESS
-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:46 /user/aps/test88/part-m-00000
With the Teradata connection manager the output should also be a .gz file. Am I doing something wrong? Please help.
I am facing the same issue with Snappy, Parquet, BZip2, and Avro. Please help.
Created ‎09-15-2016 09:54 AM
@Arkaprova, please add
--compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec
to the command; you should then get the output in the compressed format.
Created ‎09-15-2016 09:59 AM
I have already tried org.apache.hadoop.io.compress.SnappyCodec; it is not working for me.
Sqoop command:
sqoop import --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test85 --compress --compression-codec org.apache.hadoop.io.compress.SnappyCodec -m 1
Output:
-rw-r--r-- 3 ****** hdfs 0 2016-09-15 13:39 /user/aps/test85/_SUCCESS
-rw-r--r-- 3 ****** hdfs 18 2016-09-15 13:39 /user/aps/test85/part-m-00000
Please help.
Created ‎09-15-2016 10:07 AM
Add these configurations to the command as well:
-D mapreduce.output.fileoutputformat.compress=true
-D mapreduce.output.fileoutputformat.compress.type=BLOCK
-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
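For reference, the -D generic options have to come right after "sqoop import", before any tool-specific arguments. A minimal sketch of the combined command, reusing the masked connection details from this thread and a hypothetical target directory (/user/aps/test_snappy):
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.type=BLOCK -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test_snappy -m 1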
Created ‎09-15-2016 10:54 AM
@Nitin Shelke It is working after adding these configurations. Thanks a lot.
Created ‎09-15-2016 11:12 AM
The Sqoop commands below are working for me.
For Snappy:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test95 -m 1
For BZip2:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test96 -m 1
For LZO:
sqoop import -Dmapreduce.output.fileoutputformat.compress=true -Dmapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://**.***.***.***/DATABASE=****** --username ****** --password **** --table mytable --target-dir /user/aps/test98 -m 1
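A quick way to confirm that compression was actually applied (a sketch, assuming the Snappy import above and the default part-file naming) is to list the target directory and read one part file back; hdfs dfs -text decompresses files written with a known codec:
hdfs dfs -ls /user/aps/test95
hdfs dfs -text /user/aps/test95/part-m-00000.snappy | head
The part files should now carry the codec's extension (.snappy, .bz2, .lzo) instead of the plain part-m-00000 seen earlier.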
