<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question How to load a bag from a file in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164799#M21408</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am trying to load a file in Pig that contains data like this:&lt;/P&gt;&lt;P&gt;{(3),(mary),(19)}&lt;/P&gt;&lt;P&gt;{(1),(john),(18)}&lt;/P&gt;&lt;P&gt;{(2),(joe),(18)}&lt;/P&gt;&lt;P&gt;The following command is failing:&lt;/P&gt;&lt;P&gt;A = LOAD 'data3' AS (B: bag {T: tuple(t1:int), F:tuple(f1:chararray), G:tuple(g1:int)});&lt;/P&gt;&lt;P&gt;What is the correct way to do this?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Soumya&lt;/P&gt;</description>
    <pubDate>Tue, 01 Mar 2016 16:53:00 GMT</pubDate>
    <dc:creator>soumyabrata_kol</dc:creator>
    <dc:date>2016-03-01T16:53:00Z</dc:date>
    <item>
      <title>How to load a bag from a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164799#M21408</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am trying to load a file in Pig that contains data like this:&lt;/P&gt;&lt;P&gt;{(3),(mary),(19)}&lt;/P&gt;&lt;P&gt;{(1),(john),(18)}&lt;/P&gt;&lt;P&gt;{(2),(joe),(18)}&lt;/P&gt;&lt;P&gt;The following command is failing:&lt;/P&gt;&lt;P&gt;A = LOAD 'data3' AS (B: bag {T: tuple(t1:int), F:tuple(f1:chararray), G:tuple(g1:int)});&lt;/P&gt;&lt;P&gt;What is the correct way to do this?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Soumya&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2016 16:53:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164799#M21408</guid>
      <dc:creator>soumyabrata_kol</dc:creator>
      <dc:date>2016-03-01T16:53:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to load a bag from a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164800#M21409</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/2799/soumyabratakole.html" nodeid="2799"&gt;@soumyabrata kole&lt;/A&gt;
&lt;/P&gt;&lt;P&gt;See the DataFu guide to bag operations:&lt;/P&gt;&lt;P&gt;&lt;A href="http://datafu.incubator.apache.org/docs/datafu/guide/bag-operations.html" target="_blank"&gt;http://datafu.incubator.apache.org/docs/datafu/guide/bag-operations.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2016 17:12:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164800#M21409</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-03-01T17:12:01Z</dc:date>
    </item>
    <item>
      <title>Re: How to load a bag from a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164801#M21410</link>
      <description>&lt;P&gt;I don't think there is a Pig storage handler that does that, which is a bit odd, I suppose. How did you generate that file? Is it just test data you created manually?&lt;/P&gt;&lt;P&gt;PigStorage essentially reads and writes delimited files. Fields within a tuple can be maps or bags, but I don't think the top-level record can be.&lt;/P&gt;&lt;P&gt;JsonStorage uses JSON, which has a different syntax. Then there is BinStorage, which I suppose is some kind of sequence file.&lt;/P&gt;&lt;P&gt;I might just be missing it, but I think there is no native way in Pig, without some transformations, to read data in the format Pig prints for debugging. Please, someone correct me if I am wrong.&lt;/P&gt;&lt;P&gt;&lt;A href="http://pig.apache.org/docs/r0.14.0/func.html#load-store-functions" target="_blank"&gt;http://pig.apache.org/docs/r0.14.0/func.html#load-store-functions&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Mar 2016 17:42:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164801#M21410</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-03-01T17:42:38Z</dc:date>
    </item>
    <item>
      <title>Re: How to load a bag from a file</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164802#M21411</link>
      <description>&lt;P&gt;Load the data using PigStorage and then run the TOBAG function: &lt;A href="http://pig.apache.org/docs/r0.15.0/func.html#tobag" target="_blank"&gt;http://pig.apache.org/docs/r0.15.0/func.html#tobag&lt;/A&gt;. Is it a comma-separated file?&lt;/P&gt;&lt;PRE&gt;a = LOAD 'student' AS (f1:chararray, f2:int, f3:float);
DUMP a;

(John,18,4.0)
(Mary,19,3.8)
(Bill,20,3.9)
(Joe,18,3.8)

b = FOREACH a GENERATE TOBAG(f1,f3);
DUMP b;

({(John),(4.0)})
({(Mary),(3.8)})
({(Bill),(3.9)})
({(Joe),(3.8)})&lt;/PRE&gt;</description>
      <pubDate>Tue, 01 Mar 2016 19:25:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-load-a-bag-from-a-file/m-p/164802#M21411</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-03-01T19:25:19Z</dc:date>
    </item>
  </channel>
</rss>

