Created 10-23-2018 10:44 PM
I'm trying to start the metrics collector, but I'm getting a strange error instead:
resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf start' returned 1.
Tue Oct 23 18:05:37 EDT 2018 Starting HBase.
starting master, logging to /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
/usr/lib/ams-hbase/bin/hbase-daemon.sh: line 189: /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out: No space left on device
head: cannot open ‘/disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out’ for reading: No such file or directory
/usr/sbin/ambari-metrics-collector: line 81: /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid: No space left on device
ERROR: Cannot write pid /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid.
It is complaining that there is no space left on the device.
/disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.out shows the same thing:
Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file /disks/disk1/log/ambari-metrics-collector/collector-gc.log-201810231817 due to No space left on device
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.log (No space left on device)
......
But df reports free space:
# df -h /disks/disk1/
Filesystem Size Used Avail Use% Mounted on
/dev/sdb1 4.9G 760M 4.2G 16% /disks/disk1
There is clearly some space there. How much space is really needed to write the output file?
Thanks,
Alex
Created 10-23-2018 10:50 PM
Do you see any discrepancy between the output of the following commands? Can you please share the output?
# sudo du -sh /disks/disk1/
# sudo df -h
Are you able to create a file there manually? (This is just to isolate the issue.)
# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.log
It is possible that some processes are still holding deleted files open, so the space those files used has not been released. You can find them by running the following command:
# sudo lsof / | grep deleted
If you find that some processes are still holding "deleted" files open, restart those processes:
# sudo systemctl restart $SERVICE_NAME
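If lsof is not installed, the same check can be sketched by reading /proc directly. This is only a rough sketch, assuming a Linux host with /proc mounted; the function name deleted_open_files and its optional path-prefix argument are made up for illustration. Run it as root to see descriptors of processes owned by other users.

```shell
# List open-but-deleted files by reading /proc (Linux only).
# Optional $1: only report targets whose path starts with this prefix.
# Prints "PID FD TARGET" for each descriptor whose link target ends in " (deleted)".
deleted_open_files() {
    prefix=${1:-/}
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish or be unreadable; skip those quietly.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"*" (deleted)")
                pid=${fd#/proc/}
                pid=${pid%%/*}
                printf '%s %s %s\n' "$pid" "${fd##*/}" "$target"
                ;;
        esac
    done
}
```

For the disk in question, that would be called as: deleted_open_files /disks/disk1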
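Filesystems also keep a set of metadata called “inodes”, which track information about files. Every file needs one inode, so a filesystem can run out of inodes while df -h still shows free space, and new files then fail with "No space left on device". Check the inode usage: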
# sudo df -i /
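Since the failing paths are under /disks/disk1, it makes sense to run the inode check against that mount point as well. If IUse% turns out to be at or near 100%, the next question is which directory holds all the files. The following is a minimal sketch, not a definitive tool: the function names show_usage and top_inode_dirs are made up for illustration, and the counting breaks on file names that contain newlines.

```shell
# Show block usage and inode usage for one mount point.
# A low Use% together with an IUse% of 100% points to inode exhaustion.
show_usage() {
    df -h "$1"
    df -i "$1"
}

# List the directories under $1 that hold the most entries, largest first.
# $2 is the number of directories to print (default 10).
# -xdev keeps find on the same filesystem as $1.
top_inode_dirs() {
    find "$1" -xdev -mindepth 1 \
        | sed 's|/[^/]*$||' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head -n "${2:-10}"
}
```

For example: show_usage /disks/disk1 and then sudo sh -c with top_inode_dirs /disks/disk1 5 to see the five fullest directories. Each output line is a count followed by the directory path.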
Also, please check whether there are any bad blocks (run this only while the filesystem is unmounted):
# sudo fsck -vcck /disks/disk1/
If none of the above helps, please check whether you can reboot the host where AMS is installed.
Created 10-26-2018 02:49 PM
@Jay Kumar SenSharma, thanks a lot for helping me! Apparently I ran out of inodes. Not sure why it did not occur to me to check that in the first place... Anyway, reformatting the filesystem and a little bit of file shuffling did the trick 🙂