How to load ESCAPE delimited data into Hive
Labels: Apache Hive
Created ‎05-17-2016 03:29 AM
Hi,
I need to load a data file exported from Netezza into Hive.
The fields are delimited by the ESCAPE character. I tried the statement below, but it didn't work.
create external table foo ( bar string ) row format delimited fields terminated by '\01B' stored as textfile;
How can I load ESCAPE-delimited data into Hive?
Thanks,
Created ‎05-17-2016 03:34 AM
The ASCII "escape" character (octal \033, hexadecimal \x1B, caret notation ^[, decimal 27) is used by many output devices to start a series of characters called a control sequence or escape sequence.
Besides that, how about replacing the escape character with something more familiar, like ',', and then loading the file into Hive? This can be done with Pig or a simple sed command.
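As a rough illustration of the replace-then-load idea, here is a minimal shell sketch. It uses tr rather than sed, because tr's octal escapes are portable, and it converts to a tab rather than a comma. The sample rows and temporary files are made up for the example; a real export would need a target delimiter that never occurs in the data.

```shell
# Build a tiny ESC-delimited sample (hypothetical data, for illustration only).
sample=$(mktemp)
converted=$(mktemp)
printf 'a\033b\033c\n1\0332\0333\n' > "$sample"

# Translate every ESC byte (octal 033) into a tab. tr works byte by byte
# and streams its input, so the file is never held in memory as a whole.
tr '\033' '\t' < "$sample" > "$converted"

# Show the tab-delimited result.
cat "$converted"
```

The converted file could then back a table declared with fields terminated by '\t'.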
Created ‎05-17-2016 03:37 AM
The data size is pretty big. It would be ideal to load into Hive directly and convert to ORC.
I will try using Pig to convert ESCAPE to something else.
Created ‎05-17-2016 03:43 AM
Try the octal representation, either \073 or \033.
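To see which byte each of those octal values stands for, a quick check along these lines may help. It is only a sketch and assumes a POSIX shell with printf and od available; the variable names are illustrative.

```shell
# Emit the byte for each octal escape and print its decimal value.
esc_code=$(printf '\033' | od -An -tu1 | tr -d ' \n')
semi_code=$(printf '\073' | od -An -tu1 | tr -d ' \n')

echo "octal 033 -> decimal $esc_code"   # 27, the ESC character (hex 1B)
echo "octal 073 -> decimal $semi_code"  # 59, the semicolon ';'
```

Octal 033 matches the decimal 27 and hexadecimal 1B quoted earlier in the thread for ESC, while octal 073 is the semicolon.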
Created ‎05-17-2016 03:44 PM
Thanks @Sunile Manjee
I wrote a simple Pig script to convert the escape character to '\t', and it worked.
raw = load '/tmp/mydata' using PigStorage('\x1B');
store raw into '/tmp/output' using PigStorage('\t');
