Recently I tried to run a JOIN query using 'replicated' in Pig (HDP Sandbox environment). It ran for a long time, so I killed it manually from Hue. Since then, every job I submit gets stuck at 0% and remains in the ACCEPTED state forever. Even the Hive CLI does not launch, and the same happens with Tez jobs. All services show as running fine in Ambari. Please suggest what went wrong and how I can get the Sandbox environment working as before. Thanks in advance! Here is the jps output:
3166 DataNode
2712 EmbeddedServer
4237 RunJar
6320 RunJar
4204 NodeManager
4592 RunJar
4263 ResourceManager
3013 AmbariServer
2737 Main
2701 ZeppelinServer
2975 UnixAuthenticationService
3810
3627 Bootstrap
12863 Jps
3721 Portmap
4026 RunJar
4216 ApplicationHistoryServer
3168 NameNode
3408 QuorumPeerMain
As a first investigation step, please check the system's free memory.
Does your OS have enough free memory and CPU available to allocate to your Pig/Hive job?
Please check the output of the following command to see whether enough memory is free. (If very little free memory is available, try killing some processes that consume a lot of memory; for testing, you could kill AmbariServer, Zeppelin, etc.)
# free -m
Also check whether the CPU load is acceptable.
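If it helps, the memory and CPU checks above can be combined into one small script. This is only a rough sketch, assuming a Linux host that exposes /proc/meminfo (with a MemAvailable field, which older kernels may lack) and /proc/loadavg. The function names avail_mb and load_1min are made up for this example.

```shell
#!/bin/sh
# Sketch only: avail_mb and load_1min are hypothetical helper names.

# Print available memory in MB from a meminfo-style file.
# Assumes the file has a "MemAvailable:" line reported in kB.
avail_mb() {
    awk '/^MemAvailable:/ {printf "%d\n", $2 / 1024}' "${1:-/proc/meminfo}"
}

# Print the 1-minute load average from a loadavg-style file.
load_1min() {
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}

echo "Available memory (MB): $(avail_mb)"
echo "1-minute load average: $(load_1min)"
# Compare the load average against the number of CPU cores.
echo "CPU cores: $(getconf _NPROCESSORS_ONLN)"
```

As a rough rule of thumb, a load average that stays well above the core count suggests the CPU is saturated, and very low available memory may explain containers never being allocated.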
The issue is resolved. An FQDN mapping was missing from the /etc/hosts file, so job scheduling was not directing jobs to the correct node URI. In addition, permissions on the /tmp/ directory were granted according to the user and group.
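For anyone hitting the same symptom, here is a rough way to check both points. It is a sketch, not the exact steps used here: host_mapped is a made-up helper name, and it assumes that `hostname -f` returns the FQDN the cluster expects. It only reports what it finds and does not change anything.

```shell
#!/bin/sh
# Sketch only: host_mapped is a hypothetical helper name.

# Succeed if hostname $1 appears as a name (not as the IP) on any
# non-comment line of the hosts file $2 (default: /etc/hosts).
host_mapped() {
    awk -v h="$1" '
        { sub(/#.*/, "")                      # drop comments
          for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Assumption: hostname -f gives the FQDN; fall back to the short name.
fqdn=$(hostname -f 2>/dev/null || hostname)

if host_mapped "$fqdn"; then
    echo "$fqdn is mapped in /etc/hosts"
else
    echo "$fqdn is NOT mapped in /etc/hosts"
fi

# Show owner, group, and mode of the local /tmp directory.
ls -ld /tmp
# If the HDFS /tmp directory is the one in question, the equivalent
# check would be: hdfs dfs -ls -d /tmp
```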