<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How To Process Data with Apache Pig tutorial SLOW in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129826#M18372</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/122/lmartin.html" nodeid="122"&gt;@Lester Martin&lt;/A&gt; Thanks for testing and confirming. I think you should publish an article based on your comments.&lt;/P&gt;</description>
    <pubDate>Wed, 10 Feb 2016 06:46:09 GMT</pubDate>
    <dc:creator>nsabharwal</dc:creator>
    <dc:date>2016-02-10T06:46:09Z</dc:date>
    <item>
      <title>How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129818#M18364</link>
      <description>&lt;P&gt;Hello all - &lt;/P&gt;&lt;P&gt;Just a quick LOW PRIORITY question for anyone who has run the tutorial "&lt;A href="http://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/"&gt;How To Process Data with Apache Pig&lt;/A&gt;". &lt;/P&gt;&lt;P&gt;I created the script and am running the job as I write this. It has been running for 2 hours. Does this seem SLOW to anyone else? &lt;/P&gt;&lt;P&gt;I am running on a machine with an i7 processor and 16 GB of RAM, of which the Ambari Sandbox is utilizing 8 GB. Are there other configuration options that should be set? This already seems like a massive amount of resources in use. &lt;/P&gt;</description>
      <pubDate>Mon, 08 Feb 2016 04:49:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129818#M18364</guid>
      <dc:creator>mvogt</dc:creator>
      <dc:date>2016-02-08T04:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129819#M18365</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1491/mvogt.html" nodeid="1491"&gt;@Mike Vogt&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Have you configured YARN queues?&lt;/P&gt;&lt;P&gt;There is a high probability that some other job is consuming all the resources.&lt;/P&gt;&lt;P&gt;Check the RM UI from Ambari.&lt;/P&gt;</description>
      <pubDate>Mon, 08 Feb 2016 04:51:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129819#M18365</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-08T04:51:32Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129820#M18366</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1491/mvogt.html" nodeid="1491"&gt;@Mike Vogt&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Make sure the core components are up:&lt;/P&gt;&lt;P&gt;HDFS&lt;/P&gt;&lt;P&gt;YARN&lt;/P&gt;&lt;P&gt;MapReduce&lt;/P&gt;</description>
      <pubDate>Mon, 08 Feb 2016 04:54:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129820#M18366</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-08T04:54:53Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129821#M18367</link>
      <description>&lt;P&gt;Your genius level skills shine through once again! Thanks very much! &lt;/P&gt;</description>
      <pubDate>Tue, 09 Feb 2016 09:29:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129821#M18367</guid>
      <dc:creator>mvogt</dc:creator>
      <dc:date>2016-02-09T09:29:38Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129822#M18368</link>
      <description>&lt;P&gt;I just ran this tutorial on my 16GB i7 MBPro (gave the VM 8GB, just as you did) and could get it to run in 100 secs with MR and about 65 secs using Tez. I then ran the same script from the CLI and got those times down to about 60 and 25 secs on MR and Tez, respectively. I'm using the 2.3.2 Sandbox, and the only thing I had to do was start the History Server, which was showing up red in Ambari.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:26:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129822#M18368</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2016-02-10T06:26:17Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129823#M18369</link>
      <description>&lt;P&gt;Tez benefits from warm containers, so consecutive executions of the same script should be faster. I didn't know MR performed better from the CLI; I can't explain that &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; &lt;A rel="user" href="https://community.cloudera.com/users/122/lmartin.html" nodeid="122"&gt;@Lester Martin&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:28:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129823#M18369</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-02-10T06:28:13Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129824#M18370</link>
      <description>&lt;P&gt;It ~seems~ that the Ambari Views were adding about 30 seconds to the run times. Here are some of my notes on timings; notice that the actual log-reported job times are pretty consistent between CLI and View runs.&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;Ran From&lt;/TD&gt;&lt;TD&gt;Execution Engine&lt;/TD&gt;&lt;TD&gt;Job Time&lt;/TD&gt;&lt;TD&gt;Clock Time&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Ambari View&lt;/TD&gt;&lt;TD&gt;MR&lt;/TD&gt;&lt;TD&gt;64 sec&lt;/TD&gt;&lt;TD&gt;103 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;Ambari View&lt;/TD&gt;&lt;TD&gt;Tez&lt;/TD&gt;&lt;TD&gt;25 sec&lt;/TD&gt;&lt;TD&gt;63 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;CLI&lt;/TD&gt;&lt;TD&gt;MR&lt;/TD&gt;&lt;TD&gt;59 sec&lt;/TD&gt;&lt;TD&gt;61 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;CLI&lt;/TD&gt;&lt;TD&gt;Tez&lt;/TD&gt;&lt;TD&gt;25 sec&lt;/TD&gt;&lt;TD&gt;27 sec&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Actual job times were consistent for each execution engine (Tez twice as fast), but the Ambari View ~seemed~ to add 30+ secs overall. I'm sure the extremely constrained HDP stack on a tiny little pseudo-cluster (aka the Sandbox) is a big factor in this (understandable).&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:39:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129824#M18370</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2016-02-10T06:39:40Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129825#M18371</link>
      <description>&lt;P&gt;Yep, my History Server was down and had to be manually started.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:41:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129825#M18371</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2016-02-10T06:41:44Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129826#M18372</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/122/lmartin.html" nodeid="122"&gt;@Lester Martin&lt;/A&gt; Thanks for testing and confirming. I think you should publish an article based on your comments.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:46:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129826#M18372</guid>
      <dc:creator>nsabharwal</dc:creator>
      <dc:date>2016-02-10T06:46:09Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129827#M18373</link>
      <description>&lt;P&gt;I'm working with &lt;A rel="user" href="https://community.cloudera.com/users/69/rcoss.html" nodeid="69"&gt;@Rafael Coss&lt;/A&gt; to make sure the instructions are extremely crisp, as I think there are a few things that could easily trip up a novice, which is who we are targeting with these tutorials.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Feb 2016 06:48:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129827#M18373</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2016-02-10T06:48:39Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129828#M18374</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/1491/mvogt.html" nodeid="1491"&gt;@Mike Vogt&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/122/lmartin.html" nodeid="122"&gt;@Lester Martin&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/69/rcoss.html" nodeid="69"&gt;@Rafael Coss&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Hello, I'm facing the same issue, but while following the tutorial at:&lt;/P&gt;&lt;P&gt;&lt;A href="https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/section/4/" target="_blank"&gt;https://hortonworks.com/tutorial/hadoop-tutorial-getting-started-with-hdp/section/4/&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Once I execute my Pig script, it is stuck in Running status, as shown in &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/43848-status.png"&gt;status.png&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;In the RM UI, my application is also stuck in Running status, as shown in &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/43849-rm-application.png"&gt;rm-application.png&lt;/A&gt;, and I attached the launched MapReduce job in &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/43850-mr-job.png"&gt;mr-job.png&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;From the Pig View log, I got &lt;A href="https://community.cloudera.com/legacyfs/online/attachments/43851-hive-log.png"&gt;hive-log.png&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;How can I resolve my issue? I'll be really grateful if you could help me.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Dec 2017 19:56:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129828#M18374</guid>
      <dc:creator>kabadou_rawia</dc:creator>
      <dc:date>2017-12-06T19:56:08Z</dc:date>
    </item>
    <item>
      <title>Re: How To Process Data with Apache Pig tutorial SLOW</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129829#M18375</link>
      <description>&lt;P&gt;From looking at your RM UI, it sure looks like both of these jobs are basically fighting each other to get running. That is, the AppMaster containers are running, but they can't get any more containers from YARN. My recommendation would be to give the VM 10GB of memory (that's how I run it on my 16GB laptop) when you restart it. I'd also try to run it from the command line just to take the Ambari View out of the picture, but if you want to run it in Ambari, then kill any lingering application via the RM UI should it hang again. Good luck and happy Hadooping!&lt;/P&gt;</description>
      <pubDate>Sun, 10 Dec 2017 04:45:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-To-Process-Data-with-Apache-Pig-tutorial-SLOW/m-p/129829#M18375</guid>
      <dc:creator>LesterMartin</dc:creator>
      <dc:date>2017-12-10T04:45:24Z</dc:date>
    </item>
  </channel>
</rss>

