<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Getting exception while inserting the file having multi delimiters in a Hive table in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202635#M164641</link>
    <description>&lt;P&gt;Got it right. Actually whenever I was starting my hive shell, I was getting this warning:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;So I installed Tez(version 0.8.5) and changed the execution engine of Hive to Tez. Now all Hive queries that involve MapReduce job are running. My hive version is 2.1.1, that I guess do not work with MapReduce&lt;/P&gt;&lt;P&gt;As per the regex, thanks alot &lt;A rel="user" href="https://community.cloudera.com/users/14209/eberezitsky.html" nodeid="14209"&gt;@Ed Berezitsky, 
&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14209/eberezitsky.html" nodeid="14209"&gt;Those regex worked.
&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;STRONG&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 23 Apr 2017 05:25:55 GMT</pubDate>
    <dc:creator>mrizvi</dc:creator>
    <dc:date>2017-04-23T05:25:55Z</dc:date>
    <item>
      <title>Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202630#M164636</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;I have an input file which looks like:&lt;/P&gt;&lt;PRE&gt;1:Washington Berry Juice   1356:Carrington Frozen Corn-41   446:Red Wing Plastic Knives-39   1133:Tri-State Almonds-41   1252:Skinner Strawberry Drink-39   868:Nationeel Raspberry Fruit Roll-39   360:Carlson Low Fat String Cheese-38   
2:Washington Mango Drink   233:Best Choice Avocado Dip-61   1388:Sunset Paper Plates-63   878:Thresher Semi-Sweet Chocolate Bar-63   529:Fast BBQ Potato Chips-62   382:Moms Roasted Chicken-631	191:Musial Tasty Candy Bar-62  &lt;/PRE&gt;&lt;P&gt;This is the output from user recommendation engine. The first pair is the main product ID and name. Next 6 are ProductId:Name:Count and all the 6 products are delimited by tab.&lt;/P&gt;&lt;P&gt;I want to load this data in a Hive table. As you can see here, there are multi delimeters, so I created a temporary table first having only one string column and then inserted this file. Next, i created a final table having the correct attributes and data types. Now when I am inserting the data using regular expression by running the query:&lt;/P&gt;&lt;PRE&gt;insert overwrite table recommendation SELECT  
regexp_extract(col_value, '^(?:([^,]*),?){1}', 1) productId,
regexp_extract(col_value, '^(?:([^,]*),?){2}', 1) productName,  
regexp_extract(col_value, '^(?:([^,]*),?){3}', 1) productId1,  
regexp_extract(col_value, '^(?:([^,]*),?){4}', 1) productName1,  
regexp_extract(col_value, '^(?:([^,]*),?){5}', 1) productCount1,  
regexp_extract(col_value, '^(?:([^,]*),?){6}', 1) productId2,  
regexp_extract(col_value, '^(?:([^,]*),?){7}', 1) productName2,  
regexp_extract(col_value, '^(?:([^,]*),?){8}', 1) productCount2,  
regexp_extract(col_value, '^(?:([^,]*),?){9}', 1) productId3,  
regexp_extract(col_value, '^(?:([^,]*),?){10}', 1) productName3,  
regexp_extract(col_value, '^(?:([^,]*),?){11}', 1) productCount3,  
regexp_extract(col_value, '^(?:([^,]*),?){12}', 1) productId4,  
regexp_extract(col_value, '^(?:([^,]*),?){13}', 1) productName4,  
regexp_extract(col_value, '^(?:([^,]*),?){14}', 1) productCount4,  
regexp_extract(col_value, '^(?:([^,]*),?){15}', 1) productId5,  
regexp_extract(col_value, '^(?:([^,]*),?){16}', 1) productName5,  
regexp_extract(col_value, '^(?:([^,]*),?){17}', 1) productCount5,  
regexp_extract(col_value, '^(?:([^,]*),?){18}', 1) productId6,  
regexp_extract(col_value, '^(?:([^,]*),?){19}', 1) productName6,  
regexp_extract(col_value, '^(?:([^,]*),?){20}', 1) productCount6
from temp_recommendation;&lt;/PRE&gt;&lt;P&gt;I am getting this exception:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org.apache.hadoop.mapreduce.v2.util.MRApps.addLog4jSystemProperties(Lorg/apache/hadoop/mapred/Task;Ljava/util/List;Lorg/apache/hadoop/conf/Configuration;)V
&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;There are no logs generated and this is a pseudo distributed machine. Is this method wrong for handling multi delimiters or is there any other other better way? Thanks in advance&lt;/P&gt;&lt;P&gt;
&lt;STRONG&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Apr 2017 08:50:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202630#M164636</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2017-04-18T08:50:47Z</dc:date>
    </item>
    <item>
      <title>Re: Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202631#M164637</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10331/mrizvi.html" nodeid="10331"&gt;@Mushtaq Rizvi&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;you can define tab delimited table. This will give you already 7 columns, without any regex for them.&lt;/P&gt;&lt;P&gt;then, in order to extract particular parts from each product column use "(\d+):([^-\t]+)-(\d+)" as regex (except for the first one, which would be just "(\d+):([^-\t]+)"...)&lt;/P&gt;&lt;P&gt;or you can also use split instead of regex.&lt;/P&gt;&lt;PRE&gt;insert overwrite table recommendation SELECT  
regexp_extract(col_p, '(\d+):', 1) 		productId,
regexp_extract(col_p, ':(.+)', 1) 		productName,  
regexp_extract(col_p1, '(\d+):', 1) 	productId1,  
regexp_extract(col_p1, ':([^-]+)', 1) 	productName1,  
regexp_extract(col_p1, '-(.+)', 1) 		productCount1,  
regexp_extract(col_p2, '(\d+):', 1) 	 productId2,  
regexp_extract(col_p2, ':([^-]+)', 1)  productName2,  
regexp_extract(col_p2, '-(.+)', 1) 		 productCount2,  
regexp_extract(col_p3, '(\d+):', 1) 	 productId3,  
regexp_extract(col_p3, ':([^-]+)', 1)  productName3,  
regexp_extract(col_p3, '-(.+)', 1) 		 productCount3,  
regexp_extract(col_p4, '(\d+):', 1) 	 productId4,  
regexp_extract(col_p4, ':([^-]+)', 1)  productName4,  
regexp_extract(col_p4, '-(.+)', 1) 		 productCount4,  
regexp_extract(col_p5, '(\d+):', 1) 	 productId5,  
regexp_extract(col_p5, ':([^-]+)', 1)  productName5,  
regexp_extract(col_p5, '-(.+)', 1) 		 productCount5,  
regexp_extract(col_p6, '(\d+):', 1) 	 productId6,  
regexp_extract(col_p6, ':([^-]+)', 1)  productName6,  
regexp_extract(col_p6, '-(.+)', 1) 		 productCount6
from temp_recommendation;
&lt;/PRE&gt;</description>
      <pubDate>Thu, 20 Apr 2017 19:59:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202631#M164637</guid>
      <dc:creator>eberezitsky</dc:creator>
      <dc:date>2017-04-20T19:59:51Z</dc:date>
    </item>
    <item>
      <title>Re: Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202632#M164638</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10331/mrizvi.html" nodeid="10331"&gt;@Mushtaq Rizvi&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;if that worked for you, please accept the answer so your question will be marked as resolved. Thanks.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Apr 2017 03:15:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202632#M164638</guid>
      <dc:creator>eberezitsky</dc:creator>
      <dc:date>2017-04-22T03:15:51Z</dc:date>
    </item>
    <item>
      <title>Re: Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202633#M164639</link>
      <description>&lt;P&gt;Thanks for the response. I am still getting the same exception while doing regexp_extract.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Apr 2017 02:17:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202633#M164639</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2017-04-23T02:17:25Z</dc:date>
    </item>
    <item>
      <title>Re: Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202634#M164640</link>
      <description>&lt;P&gt;Just saw that any query in Hive which involves map reduce job is giving the same exception.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Apr 2017 02:19:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202634#M164640</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2017-04-23T02:19:35Z</dc:date>
    </item>
    <item>
      <title>Re: Getting exception while inserting the file having multi delimiters in a Hive table</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202635#M164641</link>
      <description>&lt;P&gt;Got it right. Actually whenever I was starting my hive shell, I was getting this warning:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;So I installed Tez(version 0.8.5) and changed the execution engine of Hive to Tez. Now all Hive queries that involve MapReduce job are running. My hive version is 2.1.1, that I guess do not work with MapReduce&lt;/P&gt;&lt;P&gt;As per the regex, thanks alot &lt;A rel="user" href="https://community.cloudera.com/users/14209/eberezitsky.html" nodeid="14209"&gt;@Ed Berezitsky, 
&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/14209/eberezitsky.html" nodeid="14209"&gt;Those regex worked.
&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;/STRONG&gt;
&lt;STRONG&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 23 Apr 2017 05:25:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Getting-exception-while-inserting-the-file-having-multi/m-p/202635#M164641</guid>
      <dc:creator>mrizvi</dc:creator>
      <dc:date>2017-04-23T05:25:55Z</dc:date>
    </item>
  </channel>
</rss>

