<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: STORE Pig OUTPUT into MULTIPLE HBase TABLES in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141258#M39987</link>
    <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/393/aervits.html"&gt;@Artem Ervits&lt;/A&gt; thanks for your valuable explanation.&lt;/P&gt;&lt;P&gt;Building on that, I tried it another way.&lt;/P&gt;&lt;P&gt;That is, instead of storing the output to a text file and loading it back with PigStorage, I filtered on the word directly and tried to store the result in HBase.&lt;/P&gt;&lt;P&gt;Above I described only the scenario I need, but here are the actual script and data that I used.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Output &amp;amp; Script:&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;A = foreach (group epoch BY epochtime) { data = foreach epoch generate created_at,id,user_id,text; generate group as pattern, data; }

With this I got the following output:

(word1_1473344765_265217609700,{(Wed Apr 20 07:23:20 +0000 2016,252479809098223616,450990391,rt @joey7barton: ..give a word1 about whether the americans wins a ryder cup. i mean surely he has slightly more important matters. #fami ...),(Wed Apr 22 07:23:20 +0000 2016,252455630361747457,118179886,@dawnriseth word1 and then we will have to prove it again by reelecting obama in 2016, 2020... this race-baiting never ends.)}) 
(word2_1473344765_265217609700,{(Wed Apr 21 07:23:20 +0000 2016,252370526411051008,845912316,@maarionymcmb word2 mere ta dit tu va resté chez toi dnc tu restes !),(Wed Apr 23 07:23:20 +0000 2016,252213169567711232,14596856,rt @chernynkaya: "have you noticed lately that word2 is getting credit for the president being in the lead except pres. obama?"  ...)})

Then, without dumping it or storing it to a file, I tried this:

B = FILTER A BY pattern == 'word1_1473325383_265214120940';
describe B;

B: {pattern: chararray,data: {(json::created_at: chararray,json::id: chararray,json::user_id: chararray,json::text: chararray)}}

STORE B into 'hbase://word1_1473325383_265214120940' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:data');
&lt;/PRE&gt;&lt;P&gt;The job reports success, but no data is stored in the table. When I checked the logs, I found the following &lt;STRONG&gt;warning:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2016-09-08 19:45:46,223 [Readahead Thread #2] WARN  org.apache.hadoop.io.ReadaheadPool - Failed readahead on ifile
EBADF: Bad file descriptor
&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Please let me know what I am missing here.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
    <pubDate>Thu, 08 Sep 2016 21:37:42 GMT</pubDate>
    <dc:creator>mohan221213</dc:creator>
    <dc:date>2016-09-08T21:37:42Z</dc:date>
    <item>
      <title>STORE Pig OUTPUT into MULTIPLE HBase TABLES</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141254#M39983</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;How can we store the output of Pig into multiple HBase tables? The HBase tables are already created; I need to store each specific value in its corresponding table.&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;I have this output:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;(word1){data}
(word2){data}
(word3){data}
(word4){data}
&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I need to store this output in the already created tables. The table names are:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;word1
word2
word3
word4
&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The output should then be stored in the already created tables as follows:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;word1 ----&amp;gt; (word1){data}
word2 ----&amp;gt; (word2){data}
word3 ----&amp;gt; (word3){data}
&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Any suggestions?&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Sep 2016 20:34:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141254#M39983</guid>
      <dc:creator>mohan221213</dc:creator>
      <dc:date>2016-09-07T20:34:15Z</dc:date>
    </item>
    <item>
      <title>Re: STORE Pig OUTPUT into MULTIPLE HBase TABLES</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141255#M39984</link>
      <description>&lt;P&gt;You would need to assign an alias to each row and specify a separate STORE command per row.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Sep 2016 20:44:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141255#M39984</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-09-07T20:44:38Z</dc:date>
    </item>
    <item>
      <title>Re: STORE Pig OUTPUT into MULTIPLE HBase TABLES</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141256#M39985</link>
      <description>&lt;P&gt;Thanks for your reply, &lt;A href="https://community.hortonworks.com/users/393/aervits.html"&gt;Artem Ervits&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Could you please give me an example of that? It would be very helpful.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Sep 2016 20:49:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141256#M39985</guid>
      <dc:creator>mohan221213</dc:creator>
      <dc:date>2016-09-07T20:49:45Z</dc:date>
    </item>
    <item>
      <title>Re: STORE Pig OUTPUT into MULTIPLE HBase TABLES</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141257#M39986</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10889/mohan221213.html" nodeid="10889"&gt;@Mohan V&lt;/A&gt; this is not efficient, but it does what you're asking:&lt;/P&gt;&lt;PRE&gt;grunt&amp;gt; fs -cat text
1 a
2 b
3 c
grunt&amp;gt; data = load 'text' using PigStorage(' ') AS (id:long, letter:chararray);
grunt&amp;gt; A = FILTER data by letter == 'a';
grunt&amp;gt; B = FILTER data by letter == 'b';
grunt&amp;gt; C = FILTER data by letter == 'c';
grunt&amp;gt; STORE A into 'hbase://a' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');
2016-09-07 16:04:29,421 [main] INFO  org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 698875904 to monitor. collectionUsageThreshold = 489213120, usageThreshold = 489213120
...
grunt&amp;gt; STORE B into 'hbase://b' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');
...
grunt&amp;gt; STORE C into 'hbase://c' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');&lt;/PRE&gt;
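&lt;P&gt;As a hedged alternative sketch (not taken from the session above), Pig's SPLIT operator can express the three FILTER statements as a single statement when the steps are run as one script. It assumes the same 'text' input and the same pre-created tables 'a', 'b' and 'c' with column family 'cf'; whether it is actually faster depends on how Pig plans the job.&lt;/P&gt;
&lt;PRE&gt;-- Load the same space-delimited file used above.
data = LOAD 'text' USING PigStorage(' ') AS (id:long, letter:chararray);

-- One SPLIT replaces the three separate FILTER statements.
SPLIT data INTO A IF letter == 'a', B IF letter == 'b', C IF letter == 'c';

-- One STORE per target table; the first field (id) becomes the HBase row key.
STORE A INTO 'hbase://a' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');
STORE B INTO 'hbase://b' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');
STORE C INTO 'hbase://c' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:letter');&lt;/PRE&gt;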
&lt;PRE&gt;Now in the HBase shell, assuming the tables were created beforehand:
&lt;/PRE&gt;
&lt;PRE&gt;create 'a', 'cf'
create 'b', 'cf'
create 'c', 'cf'&lt;/PRE&gt;&lt;PRE&gt;hbase(main):001:0&amp;gt; scan 'a'
ROW                            COLUMN+CELL
 1                             column=cf:letter, timestamp=1473264279802, value=a
1 row(s) in 0.2610 seconds


hbase(main):002:0&amp;gt; scan 'b'
ROW                            COLUMN+CELL
 2                             column=cf:letter, timestamp=1473264324881, value=b
1 row(s) in 0.0160 seconds


hbase(main):003:0&amp;gt; scan 'c'
ROW                            COLUMN+CELL
 3                             column=cf:letter, timestamp=1473264429688, value=c
1 row(s) in 0.0140 seconds

&lt;/PRE&gt;</description>
      <pubDate>Wed, 07 Sep 2016 23:08:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141257#M39986</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-09-07T23:08:44Z</dc:date>
    </item>
    <item>
      <title>Re: STORE Pig OUTPUT into MULTIPLE HBase TABLES</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141258#M39987</link>
      <description>&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/393/aervits.html"&gt;@Artem Ervits&lt;/A&gt; thanks for your valuable explanation.&lt;/P&gt;&lt;P&gt;Building on that, I tried it another way.&lt;/P&gt;&lt;P&gt;That is, instead of storing the output to a text file and loading it back with PigStorage, I filtered on the word directly and tried to store the result in HBase.&lt;/P&gt;&lt;P&gt;Above I described only the scenario I need, but here are the actual script and data that I used.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Output &amp;amp; Script:&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;A = foreach (group epoch BY epochtime) { data = foreach epoch generate created_at,id,user_id,text; generate group as pattern, data; }

With this I got the following output:

(word1_1473344765_265217609700,{(Wed Apr 20 07:23:20 +0000 2016,252479809098223616,450990391,rt @joey7barton: ..give a word1 about whether the americans wins a ryder cup. i mean surely he has slightly more important matters. #fami ...),(Wed Apr 22 07:23:20 +0000 2016,252455630361747457,118179886,@dawnriseth word1 and then we will have to prove it again by reelecting obama in 2016, 2020... this race-baiting never ends.)}) 
(word2_1473344765_265217609700,{(Wed Apr 21 07:23:20 +0000 2016,252370526411051008,845912316,@maarionymcmb word2 mere ta dit tu va resté chez toi dnc tu restes !),(Wed Apr 23 07:23:20 +0000 2016,252213169567711232,14596856,rt @chernynkaya: "have you noticed lately that word2 is getting credit for the president being in the lead except pres. obama?"  ...)})

Then, without dumping it or storing it to a file, I tried this:

B = FILTER A BY pattern == 'word1_1473325383_265214120940';
describe B;

B: {pattern: chararray,data: {(json::created_at: chararray,json::id: chararray,json::user_id: chararray,json::text: chararray)}}

STORE B into 'hbase://word1_1473325383_265214120940' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:data');
&lt;/PRE&gt;&lt;P&gt;The job reports success, but no data is stored in the table. When I checked the logs, I found the following &lt;STRONG&gt;warning:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2016-09-08 19:45:46,223 [Readahead Thread #2] WARN  org.apache.hadoop.io.ReadaheadPool - Failed readahead on ifile
EBADF: Bad file descriptor
&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Please let me know what I am missing here.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Sep 2016 21:37:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/STORE-Pig-OUTPUT-into-MULTIPLE-HBase-TABLES/m-p/141258#M39987</guid>
      <dc:creator>mohan221213</dc:creator>
      <dc:date>2016-09-08T21:37:42Z</dc:date>
    </item>
  </channel>
</rss>

