Created 05-17-2016 03:29 AM
Hi,
I need to load Netezza exported data file into Hive.
The data fields are delimited by the ASCII ESC character. I tried the statement below, but it didn't work.
create external table foo ( bar string ) row format delimited fields terminated by '\01B' stored as textfile;
How can I load ESCAPE delimited data into Hive?
Thanks,
Created 05-17-2016 03:44 PM
Thanks @Sunile Manjee
I wrote a simple Pig script to convert the ESC character to '\t' and it worked.
raw = LOAD '/tmp/mydata' USING PigStorage('\x1B');
STORE raw INTO '/tmp/output' USING PigStorage('\t');
Created 05-17-2016 03:34 AM
The ASCII "escape" character (octal: \033, hexadecimal: \x1B, or ^[, or, in decimal, 27) is used in many output devices to start a series of characters called a control sequence or escape sequence.
Besides that, how about replacing the escape character with something more familiar, like ',', and then loading into Hive? This can be done with Pig or a simple sed command.
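For illustration, here is a minimal sed sketch of that idea (the file paths and sample data are hypothetical, not from the original post):

```shell
# Hypothetical sample: three fields separated by the ASCII ESC character (octal 033).
printf 'a\033b\033c\n' > /tmp/sample.txt

# Capture the ESC character in a variable so the sed expression stays portable,
# then replace every ESC with a comma.
ESC=$(printf '\033')
sed "s/${ESC}/,/g" /tmp/sample.txt
```

This prints `a,b,c`; run sed with `-i` (or redirect to a new file) to rewrite the data in place before loading it into Hive.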
Created 05-17-2016 03:37 AM
The data size is pretty big. It would be ideal to load into Hive directly and convert to ORC.
I will try using Pig to convert ESCAPE to something else.
Created 05-17-2016 03:43 AM
Try the octal representation, either \073 or \033.
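For reference, a sketch of the DDL from the question rewritten with the octal escape for ESC (octal 033 = hex 1B = decimal 27); the table and column names are taken from the original post:

```sql
CREATE EXTERNAL TABLE foo (bar STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\033'
STORED AS TEXTFILE;
```

Hive interprets the quoted delimiter as an octal escape here, the same convention as the common '\001' (Ctrl-A) default delimiter.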