I am trying to run the Pig script from your "Hello World" series of tutorials, but I keep getting the error "Job Failed to Start".
When I open the stack trace, I see the error below:
"java.net.SocketTimeoutException: Read timed out"
Can someone please help me? What could be the reason for this? I followed the tutorial exactly.
Can you please check whether your ResourceManager is up and running?
Can you please check whether you have connectivity from the Pig client to the ResourceManager?
Is the firewall off?
Is HDFS up and running? Maybe try running a sample HDFS command to read a file on HDFS?
Since we don't have a stack trace to check, I'm asking you to verify these basics.
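A quick way to run the connectivity checks above from the Pig client machine is a small TCP probe. The ports below are assumptions based on HDP sandbox defaults (8088 for the ResourceManager web UI, 8050 for the ResourceManager address, 8020 for the NameNode RPC), so adjust them for your cluster:

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # HDP sandbox default ports (assumptions; change for your cluster)
    for name, port in [("ResourceManager web UI", 8088),
                       ("ResourceManager address", 8050),
                       ("NameNode RPC", 8020)]:
        status = "reachable" if can_connect("127.0.0.1", port) else "NOT reachable"
        print(f"{name} ({port}): {status}")
```

If one of these is not reachable, that service (or a firewall in between) is the first thing to look at.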
My apologies for not attaching the full stack trace from the beginning.
I have attached it now. I have also attached screenshots of my HDFS and YARN service dashboards.
I am following the "Hello World" tutorial series and am currently busy with lab 3 of "Hadoop Tutorial – Getting Started with HDP". I'm using a brand-new HDP 2.4 Sandbox VM, freshly imported into VirtualBox 5.0. I have not made any alterations to it whatsoever; I only followed the steps in the "Hadoop Tutorial – Getting Started with HDP" tutorial.
I basically start the VM, connect to 127.0.0.1:8888 for the Sandbox welcome screen, and open 127.0.0.1:8080 for Ambari in a Google Chrome browser on my host machine running Windows 7.
I log into Ambari using the "maria_dev" username and password.
Please let me know if you need any further information.
Your help is much appreciated.
I was also facing exactly the same problem. Even after making sure that the Hive server and ResourceManager were up and running, the problem persisted.
After clearing the history, the problem somehow resolved itself.
I was able to start the script after restarting the server. I found that during the failures, Ranger was showing red in the Ambari dashboard. After the restart it is green and I am able to run the Pig script.
Hi there, folks.
As I was going through the tutorial text, I was under the impression that the script would run right after setting the variable 'a', as the tutorial says:
a = LOAD 'geolocation' USING org.apache.hive.hcatalog.pig.HCatLoader();
That is wrong, though.
The script will only run after you add -useHCatalog as a Pig argument.
I was using the Pig View in Ambari when I executed this, as the tutorial also instructs.
If you want to run the first line of your script via the command line, SSH into the sandbox, save a script containing only that first line, and run it with:
pig -useHCatalog -f <pathToYourScript>/riskfactor.pig
I solved this by increasing the timeouts in /etc/ambari-server/conf/ambari.properties, especially the read timeouts.
You need to restart the Ambari server after that.
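For reference, the view-request timeout properties I raised are the ones below. The property names are what Ambari 2.x uses for view requests, but the values are only examples (the defaults are 10000 ms), so tune them to your environment:

```
# /etc/ambari-server/conf/ambari.properties
# read/connect timeouts for Ambari views (e.g. the Pig View), in milliseconds
views.request.read.timeout.millis=120000
views.request.connect.timeout.millis=30000
views.ambari.request.read.timeout.millis=120000
views.ambari.request.connect.timeout.millis=30000
```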