<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Load a csv file in PIG as tuple in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135975#M27557</link>
    <description>&lt;P&gt;Thanks a lot for the help. However, I am now testing with your dataset and code.&lt;/P&gt;&lt;P&gt;dataset --&lt;/P&gt;&lt;P&gt;(3,8,9) (4,5,6) &lt;/P&gt;&lt;P&gt;(1,4,7) (3,7,5)&lt;/P&gt;&lt;P&gt;
(2,5,8) (9,5,8)&lt;/P&gt;&lt;P&gt;code --&lt;/P&gt;&lt;P&gt;A = LOAD '/temp/test.csv' USING PigStorage('\t')
AS (t1:tuple(t1a:int, t1b:int,t1c:int), t2:tuple(t2a:int,t2b:int,t2c:int));&lt;/P&gt;&lt;P&gt;
X = FOREACH A GENERATE t1.t1a, t2.t2a; &lt;/P&gt;&lt;P&gt;DUMP X;&lt;/P&gt;&lt;P&gt;result --&lt;/P&gt;&lt;P&gt;(3,) &lt;/P&gt;&lt;P&gt;(1,)&lt;/P&gt;&lt;P&gt;
(2,)&lt;/P&gt;&lt;P&gt;I don't understand why it is not reading the second tuple. Can you help?&lt;/P&gt;</description>
    <pubDate>Sat, 07 May 2016 14:10:22 GMT</pubDate>
    <dc:creator>subhasis_roy</dc:creator>
    <dc:date>2016-05-07T14:10:22Z</dc:date>
    <item>
      <title>Load a csv file in PIG as tuple</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135973#M27555</link>
      <description>&lt;P&gt;I have a file like:&lt;/P&gt;&lt;P&gt;id,name,deg,salary,dept &lt;/P&gt;&lt;P&gt;1201,gopal, manager, 50000, TP &lt;/P&gt;&lt;P&gt;1202,manisha, proof reader, 50000, TP&lt;/P&gt;&lt;P&gt;I am trying to load this in Pig using a tuple, as below:&lt;/P&gt;&lt;P&gt;A = LOAD '/mydir/emp.txt' USING PigStorage(',')
AS (t:tuple(a:chararray, b:chararray, c:chararray, d:chararray, e:chararray)); &lt;/P&gt;&lt;P&gt;X = FOREACH A GENERATE t.$0, t.$1, t.$2, t.$3, t.$4;&lt;/P&gt;&lt;P&gt;
DUMP X;&lt;/P&gt;&lt;P&gt;I get this result:&lt;/P&gt;&lt;P&gt;(,,,,)&lt;/P&gt;&lt;P&gt;
(,,,,)&lt;/P&gt;&lt;P&gt;Can somebody help me understand the reason for this?&lt;/P&gt;</description>
      <pubDate>Sat, 07 May 2016 05:46:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135973#M27555</guid>
      <dc:creator>subhasis_roy</dc:creator>
      <dc:date>2016-05-07T05:46:17Z</dc:date>
    </item>
    <item>
      <title>Re: Load a csv file in PIG as tuple</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135974#M27556</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/3694/subhasisroy.html" nodeid="3694"&gt;@Subhasis Roy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Tuples represent complex data. In the input file, a tuple must be enclosed in parentheses, as in this example:&lt;/P&gt;&lt;PRE&gt;cat data
(3,8,9) (4,5,6)
(1,4,7) (3,7,5)
(2,5,8) (9,5,8)

A = LOAD 'data' AS (t1:tuple(t1a:int, t1b:int,t1c:int),t2:tuple(t2a:int,t2b:int,t2c:int));
X = FOREACH A GENERATE t1.t1a,t2.$0;

DUMP X;
(3,4)
(1,3)
(2,9)&lt;/PRE&gt;&lt;P&gt;In your case, the data is flat and not enclosed in parentheses, so you don't need a tuple in your schema. Just run this:&lt;/P&gt;&lt;PRE&gt;A = LOAD '/tmp/test.csv' USING PigStorage(',') AS (a:chararray, b:chararray, c:chararray, d:chararray, e:chararray);

DUMP A;
(1201,gopal, manager, 50000, TP)
(1202,manisha, proof reader, 50000, TP)
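
-- Hedged aside, not part of the original answer: PigStorage(',') splits only on the comma,
-- so in the output above the fields c, d and e keep their leading space (' manager', ' 50000', ' TP').
-- A minimal sketch that strips it with the built-in TRIM function; the alias B is hypothetical.
B = FOREACH A GENERATE a, b, TRIM(c) AS c, TRIM(d) AS d, TRIM(e) AS e;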
&lt;/PRE&gt;&lt;P&gt;If you want to access only some fields of your data, use this (here I show only the first four fields):&lt;/P&gt;&lt;PRE&gt;X = FOREACH A GENERATE $0, $1, $2, $3;
DUMP X;
(1201,gopal, manager, 50000)
(1202,manisha, proof reader, 50000)&lt;/PRE&gt;&lt;P&gt;Does this answer your question?&lt;/P&gt;</description>
      <pubDate>Sat, 07 May 2016 07:47:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135974#M27556</guid>
      <dc:creator>ahadjidj</dc:creator>
      <dc:date>2016-05-07T07:47:36Z</dc:date>
    </item>
    <item>
      <title>Re: Load a csv file in PIG as tuple</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135975#M27557</link>
      <description>&lt;P&gt;Thanks a lot for the help. However, I am now testing with your dataset and code.&lt;/P&gt;&lt;P&gt;dataset --&lt;/P&gt;&lt;P&gt;(3,8,9) (4,5,6) &lt;/P&gt;&lt;P&gt;(1,4,7) (3,7,5)&lt;/P&gt;&lt;P&gt;
(2,5,8) (9,5,8)&lt;/P&gt;&lt;P&gt;code --&lt;/P&gt;&lt;P&gt;A = LOAD '/temp/test.csv' USING PigStorage('\t')
AS (t1:tuple(t1a:int, t1b:int,t1c:int), t2:tuple(t2a:int,t2b:int,t2c:int));&lt;/P&gt;&lt;P&gt;
X = FOREACH A GENERATE t1.t1a, t2.t2a; &lt;/P&gt;&lt;P&gt;DUMP X;&lt;/P&gt;&lt;P&gt;result --&lt;/P&gt;&lt;P&gt;(3,) &lt;/P&gt;&lt;P&gt;(1,)&lt;/P&gt;&lt;P&gt;
(2,)&lt;/P&gt;&lt;P&gt;I don't understand why it is not reading the second tuple. Can you help?&lt;/P&gt;</description>
      <pubDate>Sat, 07 May 2016 14:10:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135975#M27557</guid>
      <dc:creator>subhasis_roy</dc:creator>
      <dc:date>2016-05-07T14:10:22Z</dc:date>
    </item>
    <item>
      <title>Re: Load a csv file in PIG as tuple</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135976#M27558</link>
      <description>&lt;P&gt;I think the issue was with the formatting of the data file. The problem is resolved now. Thanks a lot for the help.&lt;/P&gt;</description>
      <pubDate>Sat, 07 May 2016 15:21:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135976#M27558</guid>
      <dc:creator>subhasis_roy</dc:creator>
      <dc:date>2016-05-07T15:21:21Z</dc:date>
    </item>
    <item>
      <title>Re: Load a csv file in PIG as tuple</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135977#M27559</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am facing the same issue. Can you please help me resolve it?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Sam&lt;/P&gt;</description>
      <pubDate>Mon, 27 Mar 2017 01:07:47 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Load-a-csv-file-in-PIG-as-tuple/m-p/135977#M27559</guid>
      <dc:creator>SamPatil</dc:creator>
      <dc:date>2017-03-27T01:07:47Z</dc:date>
    </item>
  </channel>
</rss>

