Unable to see completed application in Spark 2 history web UI
Labels: Apache Spark
Created on 08-11-2018 06:50 AM - edited 09-16-2022 06:35 AM
Hello Community,
Created 08-14-2018 01:51 AM
You may need to make sure that the process owner of the Spark2 history server (by default, the spark user) belongs to the group "spark", so that the Spark2 history server process can read all the Spark2 event log files.
You can check the process owner with "ps -ef | grep java | grep SPARK2" on the node where the Spark2 history server runs.
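Once the process owner is known, its group membership can be checked with the standard `id` utility. The following is a minimal sketch, not an official Cloudera procedure; the helper names `has_group` and `check_owner` are made up for illustration.

```shell
# has_group GROUP_LIST GROUP: succeed if GROUP appears as a whole word in the
# space-separated GROUP_LIST (the format printed by `id -nG USER`).
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# check_owner USER: report whether USER is a member of the "spark" group.
# An unknown user yields an empty group list and is reported as NOT in the group.
check_owner() {
    if has_group "$(id -nG "$1" 2>/dev/null)" spark; then
        echo "$1 is in group spark"
    else
        echo "$1 is NOT in group spark"
    fi
}
```

For example, `check_owner spark` checks the default owner; substitute whatever user the ps output actually shows. Note that ps and top may truncate long user names, so it is worth confirming the full name before checking.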
Created on 08-14-2018 02:10 AM - edited 08-14-2018 02:14 AM
Thanks for your response.
The process directory and ps -ef both show that the user and group are cloudera-scm:
/var/run/cloudera-scm-agent/process
[root@server process]# ll | grep Spark
[root@server process]# ll | grep SPARK
drwxr-x--x 7 cloudera-scm cloudera-scm 280 May 27 03:05 19175-spark_on_yarn-SPARK_YARN_HISTORY_SERVER
drwxr-x--x 8 cloudera-scm cloudera-scm 300 May 27 03:17 19240-spark2_on_yarn-SPARK2_YARN_HISTORY_SERVER
1829 cloudera 20 0 6682m 451m 33m S 0.3 0.4 379:36.38 /var/jdk8/bin/java -cp /var/run/cloudera-scm-agent/process/19240-spark2_on_yar
It's also interesting to me that this works for Spark 1.6 but not for Spark 2.
It may also be worth mentioning that my cluster runs as the single user "cloudera-scm", since I'm using the Express edition of Cloudera Manager.
Created 08-14-2018 11:11 PM
Okay, since the process owner is cloudera-scm, one way to fix the issue is to add the cloudera-scm user to the 'spark' group on all nodes.
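On Linux, adding a user to a supplementary group is usually done with `usermod -a -G`. The sketch below only prints the commands as a dry run so they can be reviewed first; the helper names, the use of ssh as root, and the host names passed in are all assumptions for illustration, not details from this thread.

```shell
# add_to_group_cmd USER GROUP: print (not run) the command that appends USER
# to the supplementary group GROUP without dropping its existing groups.
add_to_group_cmd() {
    printf 'usermod -a -G %s %s\n' "$2" "$1"
}

# print_plan HOST...: dry run that prints one ssh command per host.
# Host names are placeholders supplied by the caller; nothing is executed.
print_plan() {
    for host in "$@"; do
        printf 'ssh root@%s "%s"\n' "$host" "$(add_to_group_cmd cloudera-scm spark)"
    done
}
```

Calling `print_plan node1 node2` prints one line per host. After the commands have actually been run, `id cloudera-scm` on each node should list spark among the groups. HDFS typically resolves group membership on the NameNode host, so it is worth confirming the change is visible there as well.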
Created 08-14-2018 11:37 PM
@Yuexin Zhang Thanks for your response.
Since I'm accessing it from the Spark History UI, I'm not sure whether the UI is running as the cloudera-scm user.
There are a few things I'm trying to figure out that may help me find a solution to this issue:
1- Why does it work in Spark 1.6 but not in Spark 2? In Spark 1.6, the jobs under hdfs://name-node/user/spark/applicationhistory are written with user cloudera-scm and group spark, with permissions 770.
2- How can I find out which user the UI reads the data as?
3- Can I change the permissions of the files under the HDFS Spark history directory by adding a specific config?
For example, something like spark.eventLog.permissions=755
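One way to compare the two history directories is to list them with `hdfs dfs -ls` and filter for entries that members of a given group could not read. This is a rough sketch under stated assumptions: the helper name `unreadable_by_group` is hypothetical, and it looks only at the group column and the group-read bit, ignoring the owner bits, the "other" bits, and any HDFS ACLs.

```shell
# unreadable_by_group GROUP: read `hdfs dfs -ls` output on stdin and print
# "mode owner group path" for each entry that members of GROUP could not read,
# i.e. the entry's group differs or the group-read bit (5th character of the
# mode string) is not "r". The "Found N items" header line is skipped.
unreadable_by_group() {
    awk -v grp="$1" \
        'NF >= 8 && ($4 != grp || substr($1, 5, 1) != "r") { print $1, $3, $4, $NF }'
}
```

For example, `hdfs dfs -ls hdfs://name-node/user/spark/applicationhistory | unreadable_by_group spark` should print nothing for the Spark 1.6 directory described above. Running the same filter on the Spark 2 event log directory (its path is not given in this thread) would show whether its files carry a different group or mode.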
Created 09-04-2019 10:31 AM
Hi,
Check the total number of applications in the application history path. If the number of files is large, try increasing the heap size and see whether that works. Alternatively, check the Spark history server logs for any errors.
Thanks
AKR
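The number of event log files can be read from `hdfs dfs -count`, which prints the directory count, file count, content size, and path for a directory. A minimal sketch for pulling out the file count follows; the helper name `file_count` is made up for illustration, and what counts as "too many" files for a given heap size is not something this thread specifies.

```shell
# file_count: read `hdfs dfs -count DIR` output on stdin
# (columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME) and print FILE_COUNT
# from the first well-formed line.
file_count() {
    awk 'NF >= 4 { print $2; exit }'
}
```

For example, `hdfs dfs -count hdfs://name-node/user/spark/applicationhistory | file_count` prints the number of files under the history path mentioned earlier in the thread.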