<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Pig - Load data from two different path in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188734#M61896</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1373/satish-sarapuri.html" nodeid="1373"&gt;@Satish Sarapuri&lt;/A&gt; Yes, you can GLOB the filename pattern.  This will work work:&lt;/P&gt;&lt;PRE&gt;Source = LOAD '/data/input{1,2}.csv' USING PigStorage(,)...&lt;/PRE&gt;&lt;P&gt;You can use other GLOB patterns.  See &lt;A href="https://books.google.com/books?id=Nff49D7vnJcC&amp;amp;pg=PA60&amp;amp;lpg=PA60&amp;amp;dq=hdfs+glob&amp;amp;source=bl&amp;amp;ots=IjkvXt9zUn&amp;amp;sig=AKjzNQ77C9BaRgZyqvkJ4YFI7gU&amp;amp;hl=en&amp;amp;sa=X&amp;amp;ved=0ahUKEwirt5_O_I_UAhUE1CYKHTtCDqIQ6AEITzAH#v=onepage&amp;amp;q=hdfs%20glob&amp;amp;f=false" target="_blank"&gt;https://books.google.com/books?id=Nff49D7vnJcC&amp;amp;pg=PA60&amp;amp;lpg=PA60&amp;amp;dq=hdfs+glob&amp;amp;source=bl&amp;amp;ots=IjkvXt9zUn&amp;amp;sig=AKjzNQ77C9BaRgZyqvkJ4YFI7gU&amp;amp;hl=en&amp;amp;sa=X&amp;amp;ved=0ahUKEwirt5_O_I_UAhUE1CYKHTtCDqIQ6AEITzAH#v=onepage&amp;amp;q=hdfs%20glob&amp;amp;f=false&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 27 May 2017 18:31:54 GMT</pubDate>
    <dc:creator>gkeys</dc:creator>
    <dc:date>2017-05-27T18:31:54Z</dc:date>
    <item>
      <title>Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188731#M61893</link>
      <description>&lt;P&gt;Hi Friends,&lt;/P&gt;&lt;P&gt;I have a question about a Pig script. I have to load data from two different HDFS paths into a single Pig relation.&lt;/P&gt;&lt;P&gt;Ex: one file is /data/input1.csv and another file is in /inputdata/input1 or input2.csv.&lt;/P&gt;&lt;P&gt;Is it possible to load these two tables into a single Pig relation?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Satish.&lt;/P&gt;</description>
      <pubDate>Sat, 27 May 2017 09:23:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188731#M61893</guid>
      <dc:creator>SatishS</dc:creator>
      <dc:date>2017-05-27T09:23:29Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188732#M61894</link>
      <description>&lt;P&gt;If these two sources have the same schema, it is a simple matter of using the UNION operator in these three steps:&lt;/P&gt;&lt;PRE&gt;Source_1 = LOAD '/data/input1.csv' USING PigStorage(',') ...
Source_2 = LOAD '/data/input2.csv' USING PigStorage(',') ...
Source = UNION Source_1, Source_2;&lt;/PRE&gt;&lt;P&gt;See these references for elaboration:&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;&lt;A href="https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#UNION" target="_blank"&gt;https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#UNION&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://www.tutorialspoint.com/apache_pig/apache_pig_union_operator.htm" target="_blank"&gt;https://www.tutorialspoint.com/apache_pig/apache_pig_union_operator.htm&lt;/A&gt;&lt;/LI&gt;&lt;LI&gt;&lt;A href="https://stackoverflow.com/questions/10954883/storing-results-of-union-in-pig-in-a-single-file" target="_blank"&gt;https://stackoverflow.com/questions/10954883/storing-results-of-union-in-pig-in-a-single-file&lt;/A&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Sat, 27 May 2017 11:02:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188732#M61894</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2017-05-27T11:02:48Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188733#M61895</link>
      <description>&lt;P&gt;Thanks Greg, but is there any way to load both files from different paths into a single relation using LOAD?&lt;/P&gt;</description>
      <pubDate>Sat, 27 May 2017 16:09:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188733#M61895</guid>
      <dc:creator>SatishS</dc:creator>
      <dc:date>2017-05-27T16:09:11Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188734#M61896</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1373/satish-sarapuri.html" nodeid="1373"&gt;@Satish Sarapuri&lt;/A&gt; Yes, you can GLOB the filename pattern.  This will work work:&lt;/P&gt;&lt;PRE&gt;Source = LOAD '/data/input{1,2}.csv' USING PigStorage(,)...&lt;/PRE&gt;&lt;P&gt;You can use other GLOB patterns.  See &lt;A href="https://books.google.com/books?id=Nff49D7vnJcC&amp;amp;pg=PA60&amp;amp;lpg=PA60&amp;amp;dq=hdfs+glob&amp;amp;source=bl&amp;amp;ots=IjkvXt9zUn&amp;amp;sig=AKjzNQ77C9BaRgZyqvkJ4YFI7gU&amp;amp;hl=en&amp;amp;sa=X&amp;amp;ved=0ahUKEwirt5_O_I_UAhUE1CYKHTtCDqIQ6AEITzAH#v=onepage&amp;amp;q=hdfs%20glob&amp;amp;f=false" target="_blank"&gt;https://books.google.com/books?id=Nff49D7vnJcC&amp;amp;pg=PA60&amp;amp;lpg=PA60&amp;amp;dq=hdfs+glob&amp;amp;source=bl&amp;amp;ots=IjkvXt9zUn&amp;amp;sig=AKjzNQ77C9BaRgZyqvkJ4YFI7gU&amp;amp;hl=en&amp;amp;sa=X&amp;amp;ved=0ahUKEwirt5_O_I_UAhUE1CYKHTtCDqIQ6AEITzAH#v=onepage&amp;amp;q=hdfs%20glob&amp;amp;f=false&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 27 May 2017 18:31:54 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188734#M61896</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2017-05-27T18:31:54Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188735#M61897</link>
      <description>&lt;P&gt;@Greg Keys, the two source files are in different paths.&lt;/P&gt;</description>
      <pubDate>Sat, 27 May 2017 21:50:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188735#M61897</guid>
      <dc:creator>SatishS</dc:creator>
      <dc:date>2017-05-27T21:50:31Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188736#M61898</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1373/satish-sarapuri.html" nodeid="1373"&gt;@Satish Sarapuri&lt;/A&gt; You can use globs anywhere in the path (not just the filename).  There are quite many operators for globs (similar to linux) as shown in the above link, so if there is enough in common with the paths you should be able to leverage globs for the differing parts.  If none of that works, you could still use the globs with full paths:&lt;/P&gt;&lt;PRE&gt;Source = LOAD '/{path1,path2}' USING PigStorage(,)...
where path1 and path2 can be any file path.&lt;/PRE&gt;</description>
      <pubDate>Tue, 30 May 2017 19:54:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188736#M61898</guid>
      <dc:creator>gkeys</dc:creator>
      <dc:date>2017-05-30T19:54:40Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188737#M61899</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11288/gkeys.html" nodeid="11288"&gt;@Greg Keys&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Both source files are in different paths.&lt;/P&gt;</description>
      <pubDate>Mon, 05 Jun 2017 05:15:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188737#M61899</guid>
      <dc:creator>SatishS</dc:creator>
      <dc:date>2017-06-05T05:15:05Z</dc:date>
    </item>
    <item>
      <title>Re: Pig - Load data from two different path</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188738#M61900</link>
      <description>&lt;P&gt;Hi, I am new to Big Data. Is this code like UNION? I mean, &lt;A rel="user" href="https://community.cloudera.com/users/11288/gkeys.html" nodeid="11288"&gt;@Greg Keys&lt;/A&gt;, you wrote two pieces of code. Do they work the same way? Thank you for answering...&lt;/P&gt;</description>
      <pubDate>Thu, 11 Apr 2019 01:24:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Pig-Load-data-from-two-different-path/m-p/188738#M61900</guid>
      <dc:creator>broute</dc:creator>
      <dc:date>2019-04-11T01:24:42Z</dc:date>
    </item>
  </channel>
</rss>

