I have a strange issue going on with Spark jobs, even Hive-on-Spark jobs.
The job seems to run successfully. While the job is running I can go through the Resource Manager to the application master, which leads me to the Spark execution web UI.
But after the job finishes, even though the job is moved to the Job History Server, clicking on the history server web UI doesn't take me to the Spark history web UI.
Instead the job logs remain under /tmp/logs/user/logs/applicationid
eg. drwxrwx--- - bigdata hadoop 0 2017-03-13 15:26 /tmp/logs/bigdata/logs/application_1489248168306_0076
drwxrwxrwt+ - mapred hadoop 0 2017-03-09 17:39 /tmp/logs
Permissions for /tmp are 1777.
/user/bigdata is 755.
drwxrwx---+ - mapred hadoop 0 2017-01-03 13:09 /user/history/done
drwxrwxrwt+ - mapred hadoop 0 2017-03-09 17:39 /user/history/done_intermediate
uid=489(mapred) gid=486(mapred) groups=486(mapred),493(hadoop)
uid=517(bigdata) gid=522(bigdata) groups=522(bigdata),528(hdpdev)
$ sudo -u hdfs hadoop fs -mkdir /user/spark
$ sudo -u hdfs hadoop fs -mkdir /user/spark/applicationHistory
$ sudo -u hdfs hadoop fs -chown -R spark:spark /user/spark
$ sudo -u hdfs hadoop fs -chmod 1777 /user/spark/applicationHistory
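For reference, the Spark-side settings that point the event log and the History Server at this directory would look something like the following in spark-defaults.conf (these are standard Spark property names; the hdfs:/// path is the directory created above):

```
spark.eventLog.enabled          true
spark.eventLog.dir              hdfs:///user/spark/applicationHistory
spark.history.fs.logDirectory   hdfs:///user/spark/applicationHistory
```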
Not sure what's going on here. Everything seems to be in order.
Surprisingly, for some of the jobs I was able to be redirected from the Job History Server to the Spark History Server.
Figured out the issue.
The issue was that we were passing a custom spark.conf file while submitting the Spark job, expecting the config changes to be merged with the default parameters from the default spark.conf.
It turns out a custom conf file overrides the default Spark config file entirely. Even if you pass a blank Spark conf, the default spark.conf is not considered for the job.
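To illustrate the override behavior: when a properties file is passed via spark-submit's --properties-file flag, Spark reads only that file and ignores conf/spark-defaults.conf, so every property the job needs must be repeated in the custom file (the class and jar names here are placeholders):

```
# Only the properties in custom-spark.conf are applied;
# conf/spark-defaults.conf is NOT merged in.
spark-submit \
  --class com.example.MyJob \
  --master yarn \
  --properties-file custom-spark.conf \
  myjob.jar
```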
We had to add the below 3 lines to the custom Spark conf file to enable log aggregation at the Spark History Server and make the URL at the Resource Manager point to the Spark History Server.
This has to be done for every Spark job. If a job is submitted without the below 3 params, it will not be available in the Spark History Server, even if you restart anything.
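The post does not reproduce the three lines themselves; based on the behavior described, they are most likely the standard event-log and history-server properties below (the host and port are assumptions; substitute your own History Server address and event-log directory):

```
spark.eventLog.enabled              true
spark.eventLog.dir                  hdfs:///user/spark/applicationHistory
spark.yarn.historyServer.address    http://historyserver-host:18088
```

spark.yarn.historyServer.address is what makes the Resource Manager's tracking URL redirect to the Spark History Server after the application finishes.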