How to handle the file delimiter char(31), i.e. Unit Separator, after placing my source file into HDFS

We receive source files that use ASCII char(31), the Unit Separator, as the field delimiter. That is how it appears on Windows; on Linux it shows up as the combination char(94), char(95), i.e. ^_. I am able to split the lines on both the Windows and Linux file systems. However, after placing the file in HDFS, reading it and splitting the lines does not work as expected.

Can anyone tell me how to handle this scenario? We could replace the special ASCII character with some other printable character before placing the file into HDFS, but is there any other solution that lets us handle such non-typeable characters?

Thank you.
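
For illustration, here is a minimal sketch of splitting on the Unit Separator by its escape code rather than by typing the character. It assumes the file content has already been read from HDFS into bytes by whatever client is in use, and that the data is UTF-8; the names `split_record` and `parse_bytes` are hypothetical helpers, not part of any Hadoop API. The same character can be written as `\x1f` (hex), `\037` (octal), or `\u001f` (Unicode escape), depending on what the tool accepts.

```python
from typing import List

# Unit Separator: one byte, decimal 31 / hex 0x1F / octal 037.
UNIT_SEPARATOR = "\x1f"


def split_record(line: str, delimiter: str = UNIT_SEPARATOR) -> List[str]:
    """Split one line into fields on the delimiter.

    Only the line ending is stripped explicitly. A bare strip() or split()
    would treat "\x1f" as whitespace in Python and silently drop empty
    leading or trailing fields.
    """
    return line.rstrip("\r\n").split(delimiter)


def parse_bytes(data: bytes, encoding: str = "utf-8") -> List[List[str]]:
    """Decode raw file bytes (assumed encoding) and split every non-empty line."""
    text = data.decode(encoding)
    return [split_record(line) for line in text.split("\n") if line.strip("\r")]


# Hypothetical sample standing in for bytes read from HDFS.
sample = b"1\x1fAlice\x1fNY\r\n2\x1fBob\x1fLA\n"
rows = parse_bytes(sample)
print(rows)
```

If the split works locally but not after the upload, it may be worth checking that the byte 0x1F is still present in the HDFS copy, since what looks like two characters (^_) on a Linux terminal is normally just the way a single control character is displayed.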