My Spark History Server shows "No completed applications found", but when I check /user/spark/applicationHistory, the event log files for both completed and in-progress applications are there:
[root@master01 ~]# hdfs dfs -ls /user/spark/applicationHistory
Found 853 items
-rwxrwx--- 3 x supergroup 54080 2020-05-14 14:01 /user/spark/applicationHistory/application_1589423756955_0001
-rwxrwx--- 3 x supergroup 88870 2020-05-14 14:22 /user/spark/applicationHistory/application_1589436102099_0001
-rwxrwx--- 3 x supergroup 87948 2020-05-14 14:32 /user/spark/applicationHistory/application_1589436102099_0002
-rwxrwx--- 3 x supergroup 95460 2020-05-14 18:05 /user/spark/applicationHistory/application_1589443387207_0001
-rwxrwx--- 3 x supergroup 92839 2020-05-14 18:24 /user/spark/applicationHistory/application_1589443387207_0002
-rwxrwx--- 3 x supergroup 159586 2020-05-15 17:16 /user/spark/applicationHistory/application_1589508854117_0043
-rwxrwx--- 3 x 218044 2020-05-15 21:10 /user/spark/applicationHistory/application_1589508854117_0051
-rw-r--r-- 3 x supergroup 0 2020-05-18 11:32 /user/spark/applicationHistory/application_1589508854117_0055.inprogress
But when I check the Spark History Server, this is what I see:
Hello @Mondi ,
Thank you for posting your query.
The Spark History Server replays the event logs as soon as they appear in the configured HDFS path [/user/spark/applicationHistory]. The replay operation simply reads the event logs from that path and loads them into memory so that they can be rendered in the UI.
In your case, you have already confirmed that the files are present in the HDFS event log directory. As a next step, could you please review the Spark History Server logs and check whether the replay operation is happening?
Also, if the file or directory permissions on the event logs are incorrect, the replay operation can fail silently. In that scenario, you might need to enable DEBUG-level logging to see what is going wrong with the replay.
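To check the permissions point quickly, here is a rough sketch that reads `hdfs dfs -ls` output and prints the paths a given user probably cannot read. The function name `unreadable_by` is hypothetical, and it assumes plain owner/group/other permission bits only (no HDFS ACLs, and the user is not an HDFS superuser). The user's group membership is passed in by hand instead of being looked up.

```shell
# Hypothetical helper: reads `hdfs dfs -ls` style lines on stdin and prints
# the paths that the given user is likely unable to read.
# $1 = user name, $2 = space-separated list of groups the user belongs to.
# Assumption: only the basic permission bits matter (no ACLs, no superuser).
unreadable_by() {
  awk -v user="$1" -v groups="$2" '
    BEGIN {
      n = split(groups, g, " ")
      for (i = 1; i <= n; i++) ingroup[g[i]] = 1
    }
    # Only consider complete listing lines: perms repl owner group size date time path
    NF >= 8 && $1 ~ /^[-d]/ {
      perm = $1; owner = $3; group = $4; path = $NF
      if (owner == user) {
        ok = (substr(perm, 2, 1) == "r")   # owner read bit
      } else if (group in ingroup) {
        ok = (substr(perm, 5, 1) == "r")   # group read bit
      } else {
        ok = (substr(perm, 8, 1) == "r")   # other read bit
      }
      if (!ok) print path
    }'
}
```

It could be used as `hdfs dfs -ls /user/spark/applicationHistory | unreadable_by spark "spark"`. With the listing in the question, files with mode `-rwxrwx---` owned by `x:supergroup` would be reported unless the `spark` user is a member of `supergroup`.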
Hope this helps.
Hi @satz, I've checked the Spark History Server logs, and they show a read permission denied error for the user named "spark". I recursively changed the permissions and ownership of /user/spark to spark, but each new file is created with its own ownership and permissions, so spark can't read it either.
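One possible approach, offered as a sketch and not a confirmed fix for this cluster: in HDFS, a new file takes its group from the parent directory, so setting the event log directory's group to one that the `spark` user belongs to should let the History Server read new files through the group bits. The group name `spark` and the helper name `fix_history_dir` are assumptions. The helper only prints the commands unless `DRY_RUN=0` is set.

```shell
# Hypothetical helper: prints (or runs) the commands that set the group and
# mode on the event log directory. Assumptions: the History Server user is a
# member of the group passed as $2, and 1777 (world-writable with sticky bit)
# is acceptable for this directory.
fix_history_dir() {
  dir="$1"; grp="$2"
  # Dry run by default: echo the command instead of executing it.
  run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "$*"; else "$@"; fi; }
  # Existing files: switch their group so the group read bit applies.
  run hdfs dfs -chgrp -R "$grp" "$dir"
  # Directory: any user can write event logs; sticky bit protects others' files.
  run hdfs dfs -chmod 1777 "$dir"
}
```

For example, `fix_history_dir /user/spark/applicationHistory spark` prints the two commands for review, and `DRY_RUN=0 fix_history_dir /user/spark/applicationHistory spark` runs them as a user with sufficient HDFS privileges.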