Support Questions


Hi, can anyone help me understand what the /user/username/.staging directory is? It has reached 4 TB. Can I delete it?

Expert Contributor

I have noticed that in the DFS file system the /user/username/.staging directory has reached 4 TB, and the directory contains old files:

15160634 2016-02-09 09:30 /user/userprod/.staging/job_1443521267046_99999/job.jar
/user/userprod/.staging/job_1443521267046_99999/job.split
/user/userprod/.staging/job_1443521267046_99999/job.splitme
/user/userprod/.staging/job_1443521267046_99999/job.xml
/user/userprod/.staging/job_1443521267046_99999/libjars
/user/userprod/.staging/job_1443521267046_99999/tez-conf.pb
/user/userprod/.staging/job_1443521267046_99999/tez-dag.pb
/user/userprod/.staging/job_1443521267046_99999/tez.session

Can I remove this data?

1 ACCEPTED SOLUTION

Super Guru

@rama

Is the following value set to true?

keep.failed.task.files (MRv1) or mapreduce.task.files.preserve.failedtasks (MRv2).

If yes, that could be why the staging files are not being deleted. Set it to false and delete the files manually. Do not delete files for a currently running job.
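One quick way to check whether the property is set (a sketch; /etc/hadoop/conf is the typical client config path on HDP, so adjust if your layout differs):

# Print the property and the line after it, if it appears in mapred-site.xml
# (/etc/hadoop/conf is assumed to be the HDP client config directory)
grep -A1 'mapreduce.task.files.preserve.failedtasks' /etc/hadoop/conf/mapred-site.xml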

In rare instances, a job failure can leave staging files behind, and you might find their remnants here. These are just temporary MapReduce files. If no job is currently running, you can safely delete them and reclaim the space. Make sure the deleted files don't end up in the trash folder (use the -skipTrash option, or delete them from the trash afterward).
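For the cleanup itself, the standard hdfs dfs commands work (a sketch reusing the job directory from the question; verify nothing is running first):

# Show the total size of the staging directory
hdfs dfs -du -s -h /user/userprod/.staging

# List the leftover job directories to confirm they belong to old, finished jobs
hdfs dfs -ls /user/userprod/.staging

# Remove one old job's staging directory, bypassing the trash
hdfs dfs -rm -r -skipTrash /user/userprod/.staging/job_1443521267046_99999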


2 REPLIES


Expert Contributor

Thank you so much @mqureshi

I could not find mapreduce.task.files.preserve.failedtasks. I am using MRv2 on HDP 2.1.3, and currently I don't have any running jobs.
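If the property is not set explicitly, Hadoop falls back to the mapred-default.xml default, which is false, so the leftovers most likely come from failed jobs and the manual cleanup described above is the fix. A quick way to confirm it isn't set anywhere (again assuming HDP's /etc/hadoop/conf layout):

# Recursively search the client config files for the property
grep -r 'preserve.failedtasks' /etc/hadoop/conf/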