New Contributor
Posts: 2
Registered: 05-02-2017

Sqoop complains about "incompatible parameters" when using Teradata CDH connector

Sqoop version: 1.4.6-cdh5.5.1

TDCH connector: 1.5c5

 

The following sqoop command errors out:

 

sqoop import --connect 'jdbc:teradata://<hostname>/dbs_port=1025,database=<dbname>' --username 'infauser' --password '*****' --table 'CUST' --columns "CLIENT_ID" --target-dir /user/unifi/tmp/schedule_10_2017-02-01_22_03_30_766189/default.10_7_5_BIW_T_CUST_1485986611_3516 --split-by 'CLIENT_ID' --hive-import --hive-overwrite --hive-table default.10_7_5_BIW_T_CUST_1485986611_3516 --null-string '\\N' --null-non-string '\\N' --hive-delims-replacement ' ' --bindir /tmp/unifi/7_teradata-test-one-col/schedule_10_2017-02-01_22_03_30_766189 --outdir /tmp/unifi/7_teradata-test-one-col/schedule_10_2017-02-01_22_03_30_766189 -- --schema BIW_T --input-method 'split.by.amp' > /tmp/unifi/7_teradata-test-one-col/schedule_10_2017-02-01_22_03_30_766189/default.10_7_5_BIW_T_CUST_1485986611_3516.log 2>&1

17/02/01 22:03:33 INFO teradata.TeradataManagerFactory: Loaded connector factory for 'Cloudera Connector Powered by Teradata' on version 1.5c5
17/02/01 22:03:33 INFO manager.SqlManager: Using default fetchSize of 1000
17/02/01 22:03:33 INFO options.ExtraOptions: Parsing extra arguments
17/02/01 22:03:33 INFO options.OptionsCompatibility: Checking options compatibility
17/02/01 22:03:33 ERROR tool.BaseSqoopTool: Got error creating database manager: java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-delims-replacement
at com.cloudera.connector.teradata.options.OptionsCompatibility.throwIllegalArgumentException(OptionsCompatibility.java:266)
at com.cloudera.connector.teradata.options.OptionsCompatibility.unsupportedArgument(OptionsCompatibility.java:262)
at com.cloudera.connector.teradata.options.OptionsCompatibility.check(OptionsCompatibility.java:87)
at com.cloudera.connector.teradata.TeradataManager.<init>(TeradataManager.java:78)
at com.cloudera.connector.teradata.TeradataManagerFactory.accept(TeradataManagerFactory.java:34)
at org.apache.sqoop.manager.ManagerFactory.accept(ManagerFactory.java:50)
at org.apache.sqoop.ConnFactory.getManager(ConnFactory.java:184)
at org.apache.sqoop.tool.BaseSqoopTool.init(BaseSqoopTool.java:258)
at org.apache.sqoop.tool.ImportTool.init(ImportTool.java:89)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:593)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

 

The same command works fine if the '--hive-delims-replacement' option is removed:

 

sqoop import --connect 'jdbc:teradata://<hostname>/dbs_port=1025,database=<dbname>' --username 'infauser' --password 'infauser' --table 'CUST' --columns "CLIENT_ID" --target-dir /user/unifi/tmp/schedule_10_2017-02-01_22_03_30_766189/default.10_7_5_BIW_T_CUST_1485986611_3516 --split-by 'CLIENT_ID' --hive-import --hive-overwrite --hive-table default.10_7_5_BIW_T_CUST_1485986611_3516 --null-string '\\N' --null-non-string '\\N' --bindir /tmp/unifi/7_teradata-test-one-col/schedule_10_2017-02-01_22_03_30_766189 --outdir /tmp/unifi/7_teradata-test-one-col/schedule_10_2017-02-01_22_03_30_766189 -- --schema BIW_T --input-method 'split.by.amp'
Warning: /opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/02/01 16:09:39 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.1
17/02/01 16:09:39 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/02/01 16:09:39 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/02/01 16:09:39 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/02/01 16:09:39 INFO teradata.TeradataManagerFactory: Loaded connector factory for 'Cloudera Connector Powered by Teradata' on version 1.5c5
17/02/01 16:09:39 INFO manager.SqlManager: Using default fetchSize of 1000
17/02/01 16:09:39 INFO options.ExtraOptions: Parsing extra arguments
17/02/01 16:09:39 INFO options.OptionsCompatibility: Checking options compatibility
17/02/01 16:09:40 INFO tool.CodeGenTool: Beginning code generation
17/02/01 16:09:40 INFO teradata.TeradataManager: Converting table import to query: SELECT * FROM "CUST"
...

...

...

...

17/02/01 16:09:43 INFO processor.TeradataInputProcessor: the number of mappers are 4
17/02/01 16:09:43 INFO processor.TeradataInputProcessor: input preprocessor com.teradata.connector.teradata.processor.TeradataSplitByValueProcessor ends at: 1485986983838
17/02/01 16:09:43 INFO processor.TeradataInputProcessor: the total elapsed time of input preprocessor com.teradata.connector.teradata.processor.TeradataSplitByValueProcessor is: 0s
17/02/01 16:09:43 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
17/02/01 16:09:46 WARN mapred.ResourceMgrDelegate: getBlacklistedTrackers - Not implemented yet
17/02/01 16:09:46 INFO teradata.TeradataSplitByValueInputFormat: SELECT MIN( CLIENT_ID ), MAX( CLIENT_ID ) FROM "CUST"
17/02/01 16:09:47 INFO mapreduce.JobSubmitter: number of splits:4
17/02/01 16:09:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1484854734117_0024
17/02/01 16:09:47 INFO impl.YarnClientImpl: Submitted application application_1484854734117_0024
17/02/01 16:09:47 INFO mapreduce.Job: The url to track the job: http://stluhdpncnb01.monsanto.com:8088/proxy/application_1484854734117_0024/
17/02/01 16:09:47 INFO mapreduce.Job: Running job: job_1484854734117_0024
17/02/01 16:09:54 INFO mapreduce.Job: Job job_1484854734117_0024 running in uber mode : false
17/02/01 16:09:54 INFO mapreduce.Job: map 0% reduce 0%
17/02/01 16:10:03 INFO mapreduce.Job: map 25% reduce 0%
17/02/01 16:10:04 INFO mapreduce.Job: map 100% reduce 0%
17/02/01 16:10:04 INFO mapreduce.Job: Job job_1484854734117_0024 completed successfully
17/02/01 16:10:04 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=617992
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=964
HDFS: Number of bytes written=354520
HDFS: Number of read operations=16
HDFS: Number of large read operations=0
HDFS: Number of write operations=8
Job Counters
Launched map tasks=4
Data-local map tasks=4
Total time spent by all maps in occupied slots (ms)=24858
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=24858
Total vcore-seconds taken by all map tasks=24858
Total megabyte-seconds taken by all map tasks=25454592
Map-Reduce Framework
Map input records=88630
Map output records=88630
Input split bytes=964
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=112
CPU time spent (ms)=15900
Physical memory (bytes) snapshot=1323118592
Virtual memory (bytes) snapshot=6535929856
Total committed heap usage (bytes)=3032481792
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=0
17/02/01 16:10:04 INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByValueProcessor starts at: 1485987004184
17/02/01 16:10:04 INFO processor.TeradataInputProcessor: input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByValueProcessor ends at: 1485987004184
17/02/01 16:10:04 INFO processor.TeradataInputProcessor: the total elapsed time of input postprocessor com.teradata.connector.teradata.processor.TeradataSplitByValueProcessor is: 0s
17/02/01 16:10:04 INFO mapreduce.ImportJobBase: Transferred 346.2109 KB in 21.4619 seconds (16.1314 KB/sec)
17/02/01 16:10:04 INFO mapreduce.ImportJobBase: Retrieved 88630 records.
17/02/01 16:10:04 INFO teradata.TeradataManager: Converting table import to query: SELECT * FROM "CUST"
17/02/01 16:10:04 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/jars/hive-common-1.1.0-cdh5.5.1.jar!/hive-log4j.properties
OK
Time taken: 1.922 seconds

 

The problem only happens when the TDCH connector is used. With other connectors (Oracle, Postgres, etc.) we are able to use Sqoop's '--hive-delims-replacement' option without any issue. How does one guard against the presence of Hive delimiter characters in the column data?
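To illustrate the concern with a made-up example (nothing to do with this particular job): if a character column contains an embedded newline, the delimited text file that Sqoop writes splits that record across two physical lines, and a TEXTFILE-backed Hive table reads it back as two rows:

# \001 (Ctrl-A) is Hive's default field delimiter; the \n inside the value ends the record early
printf '1\001first half of a comment\nsecond half of the same comment\n' > /tmp/delim_example.txt
# A Hive table stored as TEXTFILE over this file reports 2 rows for what was 1 source record,
# which is exactly the corruption that --hive-delims-replacement ' ' is meant to prevent.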

Posts: 336
Topics: 11
Kudos: 48
Solutions: 27
Registered: 09-02-2016

Re: Sqoop complains about "incompatible parameters" when using Teradata CDH connector

@superselector

 

The reason might be one of the following:
1. You have mentioned '--' twice before the schema, as follows: -- --schema BIW_T
2. --hive-delims-replacement should only be used with Hive's default delimiters; it should not be used if different delimiters are specified.

https://community.hortonworks.com/questions/11238/hive-drop-import-delims-or-hive-delims-replacement...
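For reference, these are the two Sqoop options compared in that link; both operate on string columns during a Hive import:

--hive-drop-import-delims        # drop \n, \r and \001 from string fields
--hive-delims-replacement ' '    # replace \n, \r and \001 with the supplied string (here, a space)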

New Contributor
Posts: 2
Registered: 05-02-2017

Re: Sqoop complains about "incompatible parameters" when using Teradata CDH connector


@saranvisa Thanks for your response.

 

Regarding point 1, the '--' before '--schema' is legitimate Sqoop syntax. From Sqoop's documentation: "If the argument -- is given on the command-line, then subsequent arguments are sent directly to the underlying tool". The second run, which drops the --hive-delims-replacement option, also has the '--' before '--schema' and executes successfully, so I don't think this is the issue.

 

Regarding point 2, I only intend to replace the default Hive delimiters with a space character (not custom delimiters). The intent is to replace any newline characters (which are Hive row delimiters) that might appear inside a column's data with a space, so that the row doesn't break in the wrong place when the data is loaded into a Hive table. The issue here is that the tool immediately refuses to honor the --hive-delims-replacement param when the Teradata CDH connector is used. Like I mentioned, other connectors that come pre-packaged with Sqoop do not have this problem and honor the '--hive-delims-replacement' param as expected.
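The only workaround I can think of for now is to strip the delimiter characters on the Teradata side, before Sqoop ever writes the text file (once the newline is in the delimited file the row is already broken, so post-processing in Hive can't fix it). Just a sketch, untested: it assumes the connector accepts a free-form --query import (which I believe needs --input-method split.by.partition), a Teradata release that has OREPLACE/CHR, and COMMENT_TXT is a made-up column name standing in for whatever free-text column needs cleaning:

sqoop import --connect 'jdbc:teradata://<hostname>/dbs_port=1025,database=<dbname>' --username 'infauser' -P \
  --query "SELECT CLIENT_ID, OREPLACE(OREPLACE(COMMENT_TXT, CHR(13), ' '), CHR(10), ' ') AS COMMENT_TXT FROM BIW_T.CUST" \
  --target-dir /user/unifi/tmp/cust_sanitized --split-by 'CLIENT_ID' \
  --hive-import --hive-overwrite --hive-table default.cust_sanitized \
  -- --input-method 'split.by.partition'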

Explorer
Posts: 20
Registered: 05-02-2017

Re: Sqoop complains about "incompatible parameters" when using Teradata CDH connector


Looks like a compatibility issue; the error stack trace clearly tells us that it is an illegal argument:

 

java.lang.IllegalArgumentException: Detected incompatible parameters: Unsupported parameter: --hive-delims-replacement

 

I didn't try it myself, but I'm curious to know whether you had a chance to try the Teradata connector 1.6c5.
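I'm not sure of the exact paths in your environment (they depend on whether the connector was installed as a parcel or by hand), but something like the following usually shows which connector jar and version Sqoop is actually picking up, to compare against the "Loaded connector factory ... on version 1.5c5" line in your log:

grep -ri teradata /etc/sqoop/conf/managers.d/ 2>/dev/null                 # connector registration entries
ls /var/lib/sqoop /opt/cloudera/parcels 2>/dev/null | grep -i teradata    # connector jars / parcel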

 

 

 
