ambari-metrics-collector: resource_management.core.exceptions.ExecutionFailed, no space left on device

Contributor

I'm trying to start the metrics collector, but I'm getting a strange error instead:

resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf start' returned 1. Tue Oct 23 18:05:37 EDT 2018 Starting HBase.
starting master, logging to /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
/usr/lib/ams-hbase/bin/hbase-daemon.sh: line 189: /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out: No space left on device
head: cannot open ‘/disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out’ for reading: No such file or directory
/usr/sbin/ambari-metrics-collector: line 81: /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid: No space left on device
ERROR:  Cannot write pid /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid.

It is complaining that there is no space left on the device.

/disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.out shows the same thing:

Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file /disks/disk1/log/ambari-metrics-collector/collector-gc.log-201810231817 due to No space left on device

log4j:ERROR setFile(null,true) call failed.

java.io.FileNotFoundException: /disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.log (No space left on device)

......

but

# df -h /disks/disk1/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       4.9G  760M  4.2G  16% /disks/disk1

There is clearly some space there. How much space is really needed to write the output file?

Thanks,

Alex

1 ACCEPTED SOLUTION

Master Mentor
@Alex Goron

Do you see any discrepancy between the output of the following commands? Can you please share the output?

# sudo du -sh /disks/disk1/
# sudo df -h
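To put the two numbers side by side for just the affected mount, something like the following may help. It is a minimal sketch: `compare_usage` is a hypothetical helper name (not part of Ambari), and the default path is simply the mount from the question.

```shell
#!/usr/bin/env bash
# compare_usage is a hypothetical helper, not an Ambari or coreutils command.
# It prints what the filesystem reports as used (df) next to what is
# reachable through the directory tree (du) for one mount point.
compare_usage() {
    local mount="${1:-/disks/disk1}"
    # df -P guarantees one line per filesystem; -k reports KiB
    df -Pk "$mount" |
        awk 'NR == 2 { print "df used: " $3 " KiB, avail: " $4 " KiB" }'
    # du -s: summary only, -k: KiB, -x: do not cross into other filesystems
    du -skx "$mount" 2>/dev/null |
        awk '{ print "du used: " $1 " KiB" }'
}
```

Run it as root, for example `compare_usage /disks/disk1`, so that du can read every directory. If df reports far more used space than du, the difference is usually held by files that are no longer visible in the tree.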


Are you able to create a file there manually? (This is just to isolate the issue.)

# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.log
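Note that the echo commands above overwrite the named files if they already exist. A gentler variant is to create and then remove a scratch file in the same directory. This is only a sketch; `can_create_file` and the `write-probe` file name are made up for illustration.

```shell
#!/usr/bin/env bash
# can_create_file is a hypothetical helper name.
# It tries to create a uniquely named scratch file in the given directory
# and removes it again, so existing logs are left untouched.
can_create_file() {
    local dir="$1" probe
    # On failure, probe holds mktemp's error message instead of a path
    if probe=$(mktemp "$dir/write-probe.XXXXXX" 2>&1); then
        echo "OK: created $probe"
        rm -f -- "$probe"
    else
        echo "FAILED: $probe"
        return 1
    fi
}
```

For example, `can_create_file /disks/disk1/log/ambari-metrics-collector` should print the same "No space left on device" message if the filesystem cannot accept new files.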


It is also possible that some processes are still holding deleted files open, so the OS has not released them yet. You can find out by running the following command.

# sudo lsof / | grep deleted


If you find that some processes are still holding deleted files open, restart those processes.

# sudo systemctl restart $SERVICE_NAME
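If lsof is not installed, roughly the same information can be read from /proc on Linux. The sketch below is an assumption-laden alternative rather than a replacement for lsof: `list_deleted_open` is a hypothetical helper name, it only works where /proc exists, and it needs root to see other users' processes.

```shell
#!/usr/bin/env bash
# list_deleted_open is a hypothetical helper; Linux-only (reads /proc).
# It prints "PID<TAB>path (deleted)" for every open file descriptor whose
# target has been unlinked and whose path starts with the given prefix.
list_deleted_open() {
    local prefix="${1:-/}" fd target pid
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors that vanished or that we may
        # not inspect; skip those quietly
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"*" (deleted)")
                pid=${fd#/proc/}   # strip the leading /proc/
                pid=${pid%%/*}     # keep only the PID component
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

For example, `list_deleted_open /disks/disk1` limits the output to the mount from the question; the PIDs it prints identify the processes to restart.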



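Filesystems also keep a set of metadata called “inodes”, which track information about files. Every file needs one, so if the filesystem runs out of inodes, no new files can be created even when df -h still shows free space. Check inode usage with: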

# sudo df -i /
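The command above looks at the root filesystem; for this thread the mount of interest is /disks/disk1. The sketch below checks that mount and then lists the directories holding the most entries, which is usually where the inodes went. It assumes GNU find (for `-printf`); `inode_hotspots` is a hypothetical helper name and the default of 10 rows is arbitrary.

```shell
#!/usr/bin/env bash
# inode_hotspots is a hypothetical helper name; assumes GNU find.
inode_hotspots() {
    local mount="${1:-/disks/disk1}" top="${2:-10}"
    # An IUse% of 100% means no new files can be created,
    # regardless of the free space that df -h reports
    df -i "$mount"
    # Print the parent directory of every entry on this filesystem only
    # (-xdev), then count how often each directory appears
    find "$mount" -xdev -printf '%h\n' 2>/dev/null |
        sort | uniq -c | sort -rn | head -n "$top"
}
```

For example, `inode_hotspots /disks/disk1 20` prints the inode summary followed by the 20 directories with the most direct children. Directories containing huge numbers of small files are the typical culprits.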


Also, please check whether there are any bad blocks:

# sudo fsck -vcck /disks/disk1/


If none of the above helps, please check whether you can reboot the host where AMS is installed.


Contributor

@Jay Kumar SenSharma, thanks a lot for helping me! Apparently I ran out of inodes. Not sure why it did not occur to me to check that in the first place... Anyway, reformatting the filesystem and a little bit of file shuffling did the trick 🙂