Support Questions
Find answers, ask questions, and share your expertise

WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Solved

WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Expert Contributor

While trying to import data from MySQL into HDFS, the delimiter '||' is not supported; only a single '|' character works. Is there any way to achieve this? We have data in the columns containing the '|' character, so we want to use '||' as the delimiter.

1 ACCEPTED SOLUTION


Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Expert Contributor

@Krishna Srinivas

Sqoop doesn't support multi-character delimiters. You have to use a single character.

I would suggest using a character that is native to the Hive text file format - ^A (\001).

#!/bin/bash

# ...
# Build the Ctrl-A (\001) character, Hive's default field delimiter
delim_char=$( printf "\x01" )

# --fields-terminated-by sets the field delimiter of the files written to HDFS
sqoop import ...  --fields-terminated-by "${delim_char}"  ...
# ...
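
For context, a fuller invocation along those lines might look like the sketch below; the connection string, credentials, table, and paths are hypothetical placeholders, not taken from this thread.

# Hypothetical example: import the orders table using Ctrl-A (\001) as the field delimiter
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username sqoop_user -P \
  --table orders \
  --target-dir /data/raw/orders \
  --fields-terminated-by "$( printf "\x01" )" \
  --num-mappers 4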
6 REPLIES

Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Expert Contributor

@Ed Berezitsky Thank you. We are currently using '\001' as the delimiter in place of '||'.

Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Rising Star

If HDFS is just an intermediate destination before loading into Hive, you can skip that step and load directly into Hive using the --hcatalog-table option in Sqoop. This gives better data fidelity, removes one step, and supports all Hive data types as well.

Please see https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_sqoop_hcatalog_integration
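
A minimal sketch of such an HCatalog import (the connection details and table names below are hypothetical, not from this thread):

# Hypothetical: load directly into an existing Hive/HCatalog table,
# so no HDFS field delimiters need to be chosen at all
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username sqoop_user -P \
  --table orders \
  --hcatalog-database default \
  --hcatalog-table orders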

Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Expert Contributor

@Venkat Ranganathan,

Small correction: if you use HCatalog but your table is still in textfile format with a "|" field delimiter, you'll still have the same issue. You probably mean an HCat import into an ORC-formatted table - that will definitely work.
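
As a rough sketch (connection details and table names here are made up, not from this thread), Sqoop can also create the ORC-backed table itself via --create-hcatalog-table and --hcatalog-storage-stanza:

# Hypothetical: create an ORC-backed Hive table and load it in one step
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username sqoop_user -P \
  --table orders \
  --hcatalog-database default \
  --hcatalog-table orders_orc \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "stored as orc"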

Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Expert Contributor

We are not looking at HDFS as intermediate storage, as we will be processing the files using Spark SQL.

@Venkat Ranganathan

Re: WARN sqoop.SqoopOptions: Character argument || has multiple characters; only the first will be used.

Rising Star

@Ed Berezitsky

>> Small correction: if you use hcatalog, but your table is still textfile format with "|" field delimiter, you'll still have the same issue

The output file field delimiters are only needed for HDFS imports. In the case of HCatalog imports, you specify the text file format properties as part of the storage stanza, and the Hive defaults are used otherwise. Essentially, the default storage format should be fine to handle this. By the way, HCatalog import works with most storage formats, not just ORC.
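
If you do want an explicitly text-backed table, my understanding is that the storage stanza is simply appended to the generated CREATE TABLE statement, so something like the following sketch (with made-up names) should keep the \001 delimiter explicit:

# Hypothetical: text-backed table created by Sqoop with an explicit \001 field delimiter
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username sqoop_user -P \
  --table orders \
  --hcatalog-database default \
  --hcatalog-table orders_txt \
  --create-hcatalog-table \
  --hcatalog-storage-stanza "row format delimited fields terminated by '\001' stored as textfile"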

@Krishna Srinivas

You should be able to use a Hive table from Spark SQL as well - but maybe you have other requirements. Glad to see that @Ed Berezitsky's solution worked for you.
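
For example, a quick sanity check of such a table from the Spark SQL command-line shell could look like this (the table name is hypothetical):

# Hypothetical: query the Hive table directly from the Spark SQL CLI
spark-sql -e "SELECT COUNT(*) FROM default.orders"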
