
Processing Fixed Width Files in Hive Using Native (Non-UTF8) Character Sets

Hi,

I need to load a fixed-width file into a Hive table, and the input file is not always UTF-8 encoded.

I found two classes that each cover part of this: 'org.apache.hadoop.hive.serde2.RegexSerDe', which can read a fixed-width file at defined offsets, and 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe', which handles non-UTF-8 encodings. However, I am unable to use them together when creating an external table.

Can someone please help me with a solution? Thanks in advance!

2 REPLIES


Thank you, Shawn, for your prompt response. I found an alternative: I converted the files to UTF-8 with iconv before reading them through an external table that uses RegexSerDe. Hive supports the UTF-8 character set by default, so that was enough in my case.
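
For anyone landing here later, a minimal sketch of that workaround might look like the following. The source charset (ISO-8859-1), the two-column layout (6 characters, then 5), the table name, and the HDFS location are all assumptions made up for illustration, not details from my actual job. The script only writes the DDL to a file; running it in Hive and copying the converted file to the table location are left out.

```shell
#!/bin/sh
# Sketch only: the charset, column widths, table name and paths are hypothetical.
set -eu

workdir=$(mktemp -d)
src="$workdir/input_latin1.txt"   # original file in its native encoding
dst="$workdir/input_utf8.txt"     # converted copy that Hive will read
ddl="$workdir/create_table.hql"   # DDL for the external table

# Build a tiny ISO-8859-1 sample record: a 6-character name field ("José" plus
# two spaces, with é stored as the single byte 0xE9, octal 351) followed by a
# 5-character amount field.
printf 'Jos\351  00042\n' > "$src"

# Re-encode to UTF-8 before the file goes anywhere near Hive.
# -f names the source charset, -t the target charset.
iconv -f ISO-8859-1 -t UTF-8 "$src" > "$dst"

# External table over the converted file. Each capture group in input.regex
# becomes one column, so the group widths act as the fixed-width offsets.
cat > "$ddl" <<'EOF'
CREATE EXTERNAL TABLE fixed_width_demo (
  name   STRING,
  amount STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.{6})(.{5})")
LOCATION '/data/fixed_width_demo';
EOF

echo "converted file: $dst"
echo "table DDL:      $ddl"
```

One thing to check on your own data: after conversion an accented character takes more than one byte, so the widths in the regex need to be counted in characters, not bytes.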