HDFS /tmp filesystem is filling up rapidly and expected to cause outage
Labels: Apache Hive, Cloudera Manager, HDFS
Created on 08-01-2017 07:43 AM - edited 09-16-2022 05:01 AM
In our Hadoop cluster (Cloudera distribution), we recently found that a Hive job started by a user had created about 160 TB of files under '/tmp'. It consumed almost all of the remaining HDFS space and was about to cause an outage. We troubleshot the issue and killed the job, since we were unable to reach the user who started it.
So my question is: how can we set up an alert on '/tmp' when someone creates huge files there, or can we restrict how much HDFS '/tmp' space users consume?
Please share any other suggestions you have.
Created 08-01-2017 09:56 AM
There are a few options:
1. If you have Linux monitoring tools such as Nagios, New Relic, or Ganglia, you can set up an alert for the file system (/tmp will be mounted on a file system) and trigger an email if any file system is running out of space.
2. You can create a shell script that triggers an email based on space availability and schedule it via cron.
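A minimal sketch of option 2, assuming a local mount as this reply describes. The names MOUNT_POINT and THRESHOLD_PCT, the script path, and the address in the cron comment are illustrative, not from the thread. The script only prints a warning; it relies on cron mailing any job output to MAILTO, which in turn assumes a working mail setup on the host.

```shell
#!/bin/sh
# Warn when a local file system is fuller than a threshold.
# Example crontab (hypothetical path and address):
#   MAILTO=ops@example.com
#   */10 * * * * /usr/local/bin/check_tmp.sh
MOUNT_POINT="${MOUNT_POINT:-/tmp}"       # assumed default
THRESHOLD_PCT="${THRESHOLD_PCT:-80}"     # assumed default

# Read `df -P` output on stdin and echo the Capacity column without "%".
parse_use_percent() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage ($1) is at or above the threshold ($2).
over_threshold() {
    [ "$1" -ge "$2" ]
}

used=$(df -P "$MOUNT_POINT" | parse_use_percent)
if over_threshold "$used" "$THRESHOLD_PCT"; then
    echo "WARNING: $MOUNT_POINT is ${used}% full (threshold ${THRESHOLD_PCT}%)"
fi
```

Because the script is silent when usage is below the threshold, cron sends mail only when there is something to report.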
Created 08-02-2017 07:08 AM
@saranvisa, thanks for your reply.
I am talking about the HDFS /tmp file system, not the /tmp file system on the host machine.
Please advise.
Created 08-01-2017 10:41 AM
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html
In Cloudera Manager (CM) you can configure alerts to notify you when local disks or HDFS are nearing capacity.
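As a rough sketch of what the quota guide linked above describes, the script below would cap the space that HDFS /tmp can consume. The 20t value and the APPLY switch are assumptions made for illustration. By default it only prints the command. Quota commands need HDFS superuser privileges, and a space quota counts every replica, so with a replication factor of 3 a 20t quota allows roughly 6.6 TB of data.

```shell
#!/bin/sh
# Sketch: set a space quota on an HDFS directory.
# APPLY is a hypothetical switch: 0 = print the command only, 1 = run it.
APPLY="${APPLY:-0}"

# Print the quota command for ($1 = quota, $2 = directory); run it if APPLY=1.
set_space_quota() {
    quota="$1"
    dir="$2"
    echo "hdfs dfsadmin -setSpaceQuota $quota $dir"
    if [ "$APPLY" = "1" ]; then
        hdfs dfsadmin -setSpaceQuota "$quota" "$dir"
    fi
}

# 20t is an arbitrary example value, not a recommendation.
set_space_quota 20t /tmp
```

To remove the quota again, the matching command is `hdfs dfsadmin -clrSpaceQuota /tmp`.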
Created 08-02-2017 07:10 AM
@mbigelow, thanks for your reply too.
"hdfs dfsadmin -setSpaceQuota" won't work on the HDFS /tmp location. We need to find some other alternative.
Created 08-02-2017 08:11 AM
Why won't it work? Have you tried it on /tmp and /tmp/hive/<user.name>?
If quotas can't be applied to /tmp or its subdirectories, the alternative is to set alerts for HDFS capacity or for disk space on the disks hosting the DFS directories.
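If neither quotas nor Cloudera Manager alerts are an option, a cron-driven check on the HDFS directory itself is one possible fallback. The sketch below assumes the first column of `hdfs dfs -du -s` is the size in bytes before replication. HDFS_PATH and LIMIT_BYTES are illustrative names, and the 10 TiB limit is arbitrary.

```shell
#!/bin/sh
# Warn when an HDFS directory holds more data than a chosen limit.
HDFS_PATH="${HDFS_PATH:-/tmp}"                 # assumed default
LIMIT_BYTES="${LIMIT_BYTES:-10995116277760}"   # 10 TiB, an arbitrary example

# Read `hdfs dfs -du -s` output on stdin and echo the first column (bytes).
parse_du_bytes() {
    awk 'NR == 1 { print $1 }'
}

# Succeed when bytes used ($1) exceed the limit ($2).
exceeds_limit() {
    [ "$1" -gt "$2" ]
}

if command -v hdfs >/dev/null 2>&1; then
    used=$(hdfs dfs -du -s "$HDFS_PATH" | parse_du_bytes)
    if exceeds_limit "$used" "$LIMIT_BYTES"; then
        echo "WARNING: HDFS $HDFS_PATH holds $used bytes (limit $LIMIT_BYTES)"
    fi
else
    echo "hdfs CLI not found on PATH" >&2
fi
```

Run from cron on a gateway host, the warning line can be mailed or fed to whatever alerting channel is already in place.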
Created 12-31-2018 05:10 AM
Setting a quota will work. Queries that exceed it will fail with quota errors.
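To see how close a directory is to its quota before queries start failing, `hdfs dfs -count -q` reports the configured and remaining space quota. The sketch below assumes the usual eight-column output (QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME). QUOTA_DIR is an illustrative name.

```shell
#!/bin/sh
# Report the space quota and the remaining space quota for an HDFS directory.
QUOTA_DIR="${QUOTA_DIR:-/tmp}"   # assumed default

# Read `hdfs dfs -count -q` output on stdin; print path, quota, and remainder.
# Columns 3 and 4 show "none" and "inf" when no space quota is set.
report_space_quota() {
    awk '{ printf "%s space_quota=%s remaining=%s\n", $8, $3, $4 }'
}

if command -v hdfs >/dev/null 2>&1; then
    hdfs dfs -count -q "$QUOTA_DIR" | report_space_quota
else
    echo "hdfs CLI not found on PATH" >&2
fi
```

When a write would push a directory past its space quota, HDFS rejects it with a DSQuotaExceededException, which is the kind of quota error the failing queries would surface.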