<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: comma in between data of csv mapped to external table in hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220195#M74696</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;There are no quote characters in the CSV.&lt;/P&gt;&lt;P&gt;The input is:&lt;/P&gt;&lt;P&gt;a, quick, &lt;STRONG&gt;brown,fox jumps&lt;/STRONG&gt;, over, &lt;STRONG&gt;the, lazy&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;not&lt;/P&gt;&lt;P&gt;a, quick,"brown,fox jumps",over,"the, lazy"&lt;/P&gt;</description>
    <pubDate>Tue, 20 Feb 2018 05:28:54 GMT</pubDate>
    <dc:creator>mark_hadoop</dc:creator>
    <dc:date>2018-02-20T05:28:54Z</dc:date>
    <item>
      <title>comma in between data of csv mapped to external table in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220193#M74694</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am getting a huge CSV file ingested into NiFi, which writes it to a location.&lt;/P&gt;&lt;P&gt;That location backs a Hive external table, and from there the data is loaded into ORC tables.&lt;/P&gt;&lt;P&gt;Some fields contain commas (,) inside the data. Can you please help me handle this?&lt;/P&gt;&lt;P&gt;Example:&lt;/P&gt;&lt;P&gt;File (the line below has 5 fields; "brown,fox jumps" and "the, lazy" are single fields):&lt;/P&gt;&lt;P&gt;a, quick, &lt;STRONG&gt;brown,fox jumps&lt;/STRONG&gt;, over, &lt;STRONG&gt;the, lazy&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;When an external table is placed on the file, only "a, quick, brown, fox, jumps" are shown.&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Sun, 18 Feb 2018 22:11:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220193#M74694</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-02-18T22:11:19Z</dc:date>
    </item>
    <item>
      <title>Re: comma in between data of csv mapped to external table in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220194#M74695</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/23208/hadoopuserhadoop.html" nodeid="23208"&gt;@Mark&lt;/A&gt;&lt;P&gt;Use the &lt;STRONG&gt;CSV SerDe to handle quoted fields in the CSV file&lt;/STRONG&gt;:&lt;/P&gt;&lt;PRE&gt;ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = ",",
   "quoteChar"     = "\"")&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Example:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input data:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;a, quick,"brown,fox jumps",over,"the, lazy"&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Create table statement:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;create table hcc(field1 string,
field2 string,
field3 string,
field4 string,
field5 string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = ",",
   "quoteChar"     = "\"");&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Select the data from the table:&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;hive&amp;gt; select * from hcc;
+---------+---------+------------------+---------+------------+--+
| field1  | field2  |      field3      | field4  |   field5   |
+---------+---------+------------------+---------+------------+--+
| a       |  quick  | brown,fox jumps  | over    | the, lazy  |
+---------+---------+------------------+---------+------------+--+
1 row selected (0.055 seconds)&lt;/PRE&gt;&lt;P&gt;In the create table statement we set the quote character to " and the separator to ,. When we query the table, Hive treats all the data enclosed in quotes as one field.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Feb 2018 22:50:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220194#M74695</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2018-02-18T22:50:36Z</dc:date>
    </item>
    <item>
      <title>Re: comma in between data of csv mapped to external table in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220195#M74696</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; &lt;/P&gt;&lt;P&gt;There are no quote characters in the CSV.&lt;/P&gt;&lt;P&gt;The input is:&lt;/P&gt;&lt;P&gt;a, quick, &lt;STRONG&gt;brown,fox jumps&lt;/STRONG&gt;, over, &lt;STRONG&gt;the, lazy&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;not&lt;/P&gt;&lt;P&gt;a, quick,"brown,fox jumps",over,"the, lazy"&lt;/P&gt;</description>
      <pubDate>Tue, 20 Feb 2018 05:28:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220195#M74696</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-02-20T05:28:54Z</dc:date>
    </item>
    <item>
      <title>Re: comma in between data of csv mapped to external table in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220196#M74697</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/23208/hadoopuserhadoop.html" nodeid="23208"&gt;@Mark&lt;/A&gt; &lt;/P&gt;&lt;P&gt;You need to use the &lt;STRONG&gt;Regex SerDe &lt;/STRONG&gt;when creating the Hive table, with a regex whose capture groups match the fields you want to keep together as a single column.&lt;/P&gt;&lt;P&gt;Some references on how to create Regex SerDe tables:&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/articles/58591/using-regular-expressions-to-extract-fields-for-hi.html" target="_blank"&gt;https://community.hortonworks.com/articles/58591/using-regular-expressions-to-extract-fields-for-hi.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/31008371/hive-using-regexserde-to-define-input-format" target="_blank"&gt;https://stackoverflow.com/questions/31008371/hive-using-regexserde-to-define-input-format&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/9102184/regex-for-access-log-in-hive-serde" target="_blank"&gt;https://stackoverflow.com/questions/9102184/regex-for-access-log-in-hive-serde&lt;/A&gt;&lt;/P&gt;</description>
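      <content:encoded>&lt;P&gt;A minimal sketch of what a Regex SerDe table could look like for the unquoted sample row from the question. The table name hcc_regex and the LOCATION path are placeholders, not names from this thread. The regex assumes every row has exactly five fields, that only the third field contains a single embedded comma, and that the fifth field takes the rest of the line. Unquoted data is otherwise ambiguous, so the pattern would have to be adapted to the real layout.&lt;/P&gt;&lt;PRE&gt;-- Hypothetical table name and location; adjust both to your environment.
CREATE EXTERNAL TABLE hcc_regex (
  field1 STRING,
  field2 STRING,
  field3 STRING,
  field4 STRING,
  field5 STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- One capture group per column, in column order.
  -- Group 3 allows exactly one comma; group 5 captures the remainder of the line.
  "input.regex" = "^([^,]*), *([^,]*), *([^,]*,[^,]*), *([^,]*), *(.*)$"
)
STORED AS TEXTFILE
LOCATION '/path/to/csv/dir';&lt;/PRE&gt;&lt;P&gt;For the row "a, quick, brown,fox jumps, over, the, lazy" the five groups would capture a, quick, "brown,fox jumps", over and "the, lazy". Declaring every column as string is the safe choice with this SerDe, and rows that do not match the regex come back as NULL in every column, which makes a quick count of NULL rows a useful check before loading into ORC.&lt;/P&gt;</content:encoded>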
      <pubDate>Wed, 21 Feb 2018 08:04:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220196#M74697</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2018-02-21T08:04:00Z</dc:date>
    </item>
    <item>
      <title>Re: comma in between data of csv mapped to external table in hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220197#M74698</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt;&lt;BR /&gt;&lt;P&gt;Thanks, that works.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Mar 2018 21:43:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/comma-in-between-data-of-csv-mapped-to-external-table-in/m-p/220197#M74698</guid>
      <dc:creator>mark_hadoop</dc:creator>
      <dc:date>2018-03-14T21:43:06Z</dc:date>
    </item>
  </channel>
</rss>

