<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Processing Fixed Width Files in Hive Using Native (Non-UTF8) Character Sets in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196232#M80961</link>
    <description>&lt;P&gt;Hi, &lt;/P&gt;&lt;P&gt;I need to load a fixed-width file into a Hive table, and the input file is not always UTF-8 encoded. &lt;/P&gt;&lt;P&gt;
I found two classes that each cover part of this: 'org.apache.hadoop.hive.serde2.RegexSerDe' can read a fixed-width file at defined offsets, and 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' can handle non-UTF-8 encodings. However, I am unable to use them together when creating an external table. &lt;/P&gt;&lt;P&gt;
Can someone please help me with a solution? Thanks in advance!&lt;/P&gt;</description>
    <pubDate>Sat, 21 Jul 2018 16:09:21 GMT</pubDate>
    <dc:creator>aniruddha_ghosh</dc:creator>
    <dc:date>2018-07-21T16:09:21Z</dc:date>
    <item>
      <title>Processing Fixed Width Files in Hive Using Native (Non-UTF8) Character Sets</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196232#M80961</link>
      <description>&lt;P&gt;Hi, &lt;/P&gt;&lt;P&gt;I need to load a fixed-width file into a Hive table, and the input file is not always UTF-8 encoded. &lt;/P&gt;&lt;P&gt;
I found two classes that each cover part of this: 'org.apache.hadoop.hive.serde2.RegexSerDe' can read a fixed-width file at defined offsets, and 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' can handle non-UTF-8 encodings. However, I am unable to use them together when creating an external table. &lt;/P&gt;&lt;P&gt;
Can someone please help me with a solution? Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Sat, 21 Jul 2018 16:09:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196232#M80961</guid>
      <dc:creator>aniruddha_ghosh</dc:creator>
      <dc:date>2018-07-21T16:09:21Z</dc:date>
    </item>
    <item>
      <title>Re: Processing Fixed Width Files in Hive Using Native (Non-UTF8) Character Sets</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196233#M80962</link>
      <description>&lt;P&gt;I would just read the file with the LazySimpleSerDe, treating each line as a single string column, and use the substr() function to extract the individual columns. I've found that to perform better than the RegexSerDe, and it's clearer to read. You can either run the substr() query directly or put it in a view.&lt;/P&gt;</description>
      <pubDate>Mon, 23 Jul 2018 20:35:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196233#M80962</guid>
      <dc:creator>sweeks</dc:creator>
      <dc:date>2018-07-23T20:35:17Z</dc:date>
    </item>
    <item>
      <title>Re: Processing Fixed Width Files in Hive Using Native (Non-UTF8) Character Sets</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196234#M80963</link>
      <description>&lt;P&gt;Thank you, Shawn, for your prompt response. I found an alternative: I converted the files to UTF-8 using &lt;STRONG&gt;iconv&lt;/STRONG&gt; before reading them through an external table with RegexSerDe. This works in my case because Hive supports UTF-8 character sets by default.&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jul 2018 02:25:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Processing-Fixed-Width-Files-in-Hive-Using-Native-Non-UTF8/m-p/196234#M80963</guid>
      <dc:creator>aniruddha_ghosh</dc:creator>
      <dc:date>2018-07-27T02:25:18Z</dc:date>
    </item>
  </channel>
</rss>

