<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Help with Hive Regex extract. in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191286#M59096</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12437/balavigneshnagamuthuvenkatesan.html" nodeid="12437"&gt;@Bala Vignesh N V&lt;/A&gt;&lt;/P&gt;&lt;P&gt;when I use the following SQL its not returning me anything..i am not getting any error.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select regexp_extract('Mar 12 04:03:01 172.16.3.1 %ASA-6-106100: access-list FW-DATA permitted tcp FW-DATA/172.16.1.4(59289) -&amp;gt; OUTSIDE/52.87.195.145(22) hit-cnt 1 first hit', '([A-Z][a-z]+ \d+\s\d+:\d+:\d+)\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\%ASA-6-106100):\s+(\w+-\w+\s+\w+-\w+)\s+(\w+)\s+(\w+)\s+(\w+-\w+)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+-&amp;gt;\s+(\w+-?\w+?)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+(\w+-\w+\s+\d+\s+\w+\s+\w+)',0)&lt;/P&gt;&lt;P&gt;How do I get my input string broken in to multiple columns based on the regex.?&lt;/P&gt;</description>
    <pubDate>Fri, 07 Apr 2017 03:53:20 GMT</pubDate>
    <dc:creator>saikrishna_tara</dc:creator>
    <dc:date>2017-04-07T03:53:20Z</dc:date>
    <item>
      <title>Help with Hive Regex extract.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191284#M59094</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;i have a firewall log with entries like this..&lt;/P&gt;&lt;P&gt;Mar 12 04:03:01 
172.16.3.1 
%ASA-6-106100
access-list FW-DATA 
permitted 
tcp 
FW-DATA
172.16.1.4
59289 
OUTSIDE
52.87.195.145
22
hit-cnt 1 first hit&lt;/P&gt;&lt;P&gt;i created an external table in hive for this log file and i am trying to use HIVE SQL and regexp_extract to extract column out of this lines by using this regular expression.&lt;/P&gt;&lt;P&gt;([A-Z][a-z]+ \d+\s\d+:\d+:\d+)\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\%ASA-6-106100):\s+(\w+-\w+\s+\w+-\w+)\s+(\w+)\s+(\w+)\s+(\w+-\w+)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+-&amp;gt;\s+(\w+-?\w+?)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+(\w+-\w+\s+\d+\s+\w+\s+\w+)&lt;/P&gt;&lt;P&gt;when i tested this at different sites (like &lt;A href="https://regex101.com/)" target="_blank"&gt;https://regex101.com/)&lt;/A&gt; ..it is working fine..breaking this line into many different groups shown below. but when i try to use that in HIVE its not working..any help is appriciated.&lt;/P&gt;&lt;P&gt;Mar 12 04:03:01 &lt;/P&gt;&lt;P&gt;
172.16.3.1 &lt;/P&gt;&lt;P&gt;
%ASA-6-106100 &lt;/P&gt;&lt;P&gt;access-list FW-DATA &lt;/P&gt;&lt;P&gt;
permitted &lt;/P&gt;&lt;P&gt;
tcp &lt;/P&gt;&lt;P&gt;
FW-DATA &lt;/P&gt;&lt;P&gt;172.16.1.4 &lt;/P&gt;&lt;P&gt;59289 &lt;/P&gt;&lt;P&gt;
OUTSIDE &lt;/P&gt;&lt;P&gt;52.87.195.145 &lt;/P&gt;&lt;P&gt;22 &lt;/P&gt;&lt;P&gt;hit-cnt 1 first hit
&lt;/P&gt;</description>
      <pubDate>Thu, 06 Apr 2017 23:30:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191284#M59094</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2017-04-06T23:30:18Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Hive Regex extract.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191285#M59095</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Could you help me with your query? Is it throwing an error when you are using it in hive or the data which you were expecting is not replicated by the code?&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2017 02:18:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191285#M59095</guid>
      <dc:creator>balavignesh_nag</dc:creator>
      <dc:date>2017-04-07T02:18:17Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Hive Regex extract.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191286#M59096</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/12437/balavigneshnagamuthuvenkatesan.html" nodeid="12437"&gt;@Bala Vignesh N V&lt;/A&gt;&lt;/P&gt;&lt;P&gt;when I use the following SQL its not returning me anything..i am not getting any error.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select regexp_extract('Mar 12 04:03:01 172.16.3.1 %ASA-6-106100: access-list FW-DATA permitted tcp FW-DATA/172.16.1.4(59289) -&amp;gt; OUTSIDE/52.87.195.145(22) hit-cnt 1 first hit', '([A-Z][a-z]+ \d+\s\d+:\d+:\d+)\s(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\%ASA-6-106100):\s+(\w+-\w+\s+\w+-\w+)\s+(\w+)\s+(\w+)\s+(\w+-\w+)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+-&amp;gt;\s+(\w+-?\w+?)\/?(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\((\d+)\)\s+(\w+-\w+\s+\d+\s+\w+\s+\w+)',0)&lt;/P&gt;&lt;P&gt;How do I get my input string broken in to multiple columns based on the regex.?&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2017 03:53:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191286#M59096</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2017-04-07T03:53:20Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Hive Regex extract.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191287#M59097</link>
      <description>&lt;P&gt;You can consider to use Hive RegexSerDe, see &lt;A href="https://community.hortonworks.com/articles/58591/using-regular-expressions-to-extract-fields-for-hi.html"&gt;here&lt;/A&gt; for details.&lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2017 04:49:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191287#M59097</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2017-04-07T04:49:30Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Hive Regex extract.</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191288#M59098</link>
      <description>&lt;P&gt;I think I found the answer..looks like we need to use double slashes in hive..&lt;/P&gt;&lt;P&gt;this is working when I replaced log_entry with the text from above..&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select 
regexp_extract(log_entry, '[A-Z][a-z]+\\s\\d+\\s\\d+:\\d+:\\d+', 0) , 
regexp_extract(log_entry, '\\d+\\.\\d+\\.\\d\\.\\d', 0) ,
regexp_extract(log_entry, '%ASA-6-106100', 0) ,
regexp_extract(log_entry, '\\w+-\\w+\\s+\\w+-\\w+', 0) &lt;/P&gt;</description>
      <pubDate>Fri, 07 Apr 2017 04:54:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Help-with-Hive-Regex-extract/m-p/191288#M59098</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2017-04-07T04:54:14Z</dc:date>
    </item>
  </channel>
</rss>

