Support Questions

What is the best way to parse a 200-column-wide Sqoop file with field delimiter '\001'? I need to find the bad characters/data that are breaking my external tables pointing to the Sqoop file.


New Contributor
 
2 REPLIES

@Brian Twidt

Are you getting the correct record count in HDFS, matching the source database, when you import the data with Sqoop?


Expert Contributor
@Brian Twidt

You can use sed as shown below to produce a comma-delimited CSV file that you can open in Excel. Your delimiter is most probably \001 (Ctrl-A). Note that any field that already contains a comma will shift the columns in the CSV.

sed 's/yourdelimiter/,/g' < input > output.csv
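
If scanning 200 columns in Excel is impractical, another option is to count the fields on each line and list only the rows that do not match. The following is a minimal sketch, not a definitive method. It assumes the file has been copied out of HDFS to local disk, that records are newline-terminated, and that every good row has exactly the expected number of \001-separated fields. The name find_bad_rows is a hypothetical helper, not part of Sqoop.

```shell
# Hypothetical helper (assumption: not part of Sqoop or Hadoop).
# Prints "<line number><TAB><field count>" for every line whose number of
# \001-separated fields differs from the expected count.
#   $1 = expected number of fields (e.g. 200)
#   $2 = path to a local copy of the Sqoop output file
find_bad_rows() {
    expected="$1"
    file="$2"
    awk -F'\001' -v expected="$expected" \
        'NF != expected { printf "%d\t%d\n", NR, NF }' "$file"
}
```

For example, find_bad_rows 200 yourfile (where yourfile is a placeholder for your local copy) lists the suspect line numbers. Two consecutive short rows often suggest a newline embedded in a text column, while a row with too many fields may indicate the delimiter character appearing inside the data.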
