How to delete all application logs from Spark history (not by rotation)

From

hdfs dfs -du -h /

we see that Spark history takes up a lot of space in HDFS.

In the Ambari GUI, I choose Spark, then Quick Links, and I get the History Server page with all applications.


I want to delete all applications from the page.

How can I do it? I do not see a delete button.

Second, is it possible to delete the applications that use HDFS space through an API or the CLI?

Michael-Bronson

Accepted Solutions

@Michael Bronson,

If you want to delete applications in spark2

hdfs dfs -rm -R /spark2-history/{app-id}

If you want to delete applications in spark1

hdfs dfs -rm -R /spark-history/{app-id}

Restart the history servers after running the commands.

Thanks,

Aditya
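
One caveat worth checking before relying on the commands above to free space: when HDFS trash is enabled, hdfs dfs -rm moves files into the user's .Trash directory, and the space is only released once the trash is emptied. The -skipTrash flag deletes immediately. A minimal sketch under that assumption is shown below; HISTORY_DIR, DRY_RUN and delete_app_log are illustrative names, not something from this thread, and by default the function only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Illustrative settings (assumptions, adjust for your cluster).
HISTORY_DIR="/spark2-history"   # use /spark-history for Spark1
DRY_RUN=1                       # 1 = only print the command, 0 = run it

# Delete the event log of one application, bypassing the HDFS trash.
delete_app_log() {
    app_id="$1"
    # Refuse an empty ID so the whole history directory is never targeted by accident.
    if [ -z "$app_id" ]; then
        echo "usage: delete_app_log <app-id>" >&2
        return 1
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "hdfs dfs -rm -R -skipTrash $HISTORY_DIR/$app_id"
    else
        hdfs dfs -rm -R -skipTrash "$HISTORY_DIR/$app_id"
    fi
}
```

With DRY_RUN=1, calling delete_app_log {app-id} prints the command that would be run. Setting DRY_RUN=0 executes it, after which the history server still needs a restart as noted above.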


14 REPLIES


Hi @Michael Bronson

Spark has a number of parameters for handling job history rotation.
In particular:

spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.maxAge  12h
spark.history.fs.cleaner.interval 1h

Source : https://spark.apache.org/docs/latest/monitoring.html

In the example above:
- Rotation is active
- All job logs older than 12 hours will be deleted
- The cleaner checks for logs to delete at 1-hour intervals

Note that these parameters need to be set at the environment level (not at the job level).
They are usually placed in the spark-defaults.conf file.

Matthieu

But I want to delete all applications now, not wait for the retention period.
Michael-Bronson

@Aditya thank you.

But how do I delete all applications that use HDFS? On the page I see a lot of applications, around 1000, so I can't delete them one by one.

Michael-Bronson

hdfs dfs -rm -R /spark2-history/* will remove all applications


OK, so if I want to remove them from the Ambari GUI, how do I do it? (I ask because I do not see any delete option on the page.)

Michael-Bronson

@Michael Bronson

You can use the Files View if you want to delete from the GUI. I'm not sure whether there is a delete option in the Spark History Server.

Is it possible to print the list of all applications from the CLI, so that I can grep the application IDs and then remove them with hdfs dfs -rm -R /spark2-history/{app-id}?

Michael-Bronson

@Michael Bronson,

If you want to delete all applications, you can simply run

hdfs dfs -rm -R /spark2-history/*

This folder contains only Spark2 application logs and no other files. Hope this helps.

(Or)

You can run the commands below, which should print all application IDs:

curl http://{spark2history server url}:18080/api/v1/applications | grep "\"id\"" > a.txt
cut -d':' -f2 a.txt  | cut -d "\"" -f 2
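
The grep/cut pipeline above relies on the JSON response being pretty-printed with one field per line. A slightly more tolerant sketch that still uses only grep and sed is shown below. extract_app_ids is a hypothetical helper name; it reads the response on stdin, so it can be tried on a saved copy of the output before being pointed at the real server.

```shell
#!/bin/sh
# Read the JSON returned by /api/v1/applications on stdin
# and print one application ID per line.
extract_app_ids() {
    # Keep only the "id" : "..." pairs ("attemptId" does not match, the key is case-sensitive),
    # then strip everything except the quoted value.
    grep -o '"id"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}
```

It could be used as curl -s http://{spark2history server url}:18080/api/v1/applications | extract_app_ids. Before feeding the IDs to hdfs dfs -rm, it is worth comparing them with hdfs dfs -ls /spark2-history, since the log file names there may not match the IDs exactly (for example, an attempt suffix or .inprogress).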