Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

ambari-metrics-collector: resource_management.core.exceptions.ExecutionFailed, no space left on device

New Member

I'm trying to start the metrics collector, but I'm getting a strange error instead:

resource_management.core.exceptions.ExecutionFailed: Execution of '/usr/sbin/ambari-metrics-collector --config /etc/ambari-metrics-collector/conf start' returned 1. Tue Oct 23 18:05:37 EDT 2018 Starting HBase.
starting master, logging to /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
/usr/lib/ams-hbase/bin/hbase-daemon.sh: line 189: /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out: No space left on device
head: cannot open ‘/disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out’ for reading: No such file or directory
/usr/sbin/ambari-metrics-collector: line 81: /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid: No space left on device
ERROR:  Cannot write pid /disks/disk1/run/ambari-metrics-collector/ambari-metrics-collector.pid.

It is complaining that there is no space left on the device.

/disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.out shows the same thing:

Java HotSpot(TM) 64-Bit Server VM warning: Cannot open file /disks/disk1/log/ambari-metrics-collector/collector-gc.log-201810231817 due to No space left on device

log4j:ERROR setFile(null,true) call failed.

java.io.FileNotFoundException: /disks/disk1/log/ambari-metrics-collector/ambari-metrics-collector.log (No space left on device)

......

But

# df -h /disks/disk1/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       4.9G  760M  4.2G  16% /disks/disk1

There is clearly some space there. How much space is really needed to write the output file?

Thanks,

Alex

1 ACCEPTED SOLUTION

Master Mentor
@Alex Goron

Do you see any discrepancy between the output of the following commands? Can you please share the output?

# sudo du -sh /disks/disk1/
# sudo df -h
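
To make that comparison easier to read, here is a rough sketch that prints both numbers side by side. It is not part of Ambari; the function name compare_usage is made up for illustration, and it assumes a POSIX shell with df, du and awk available. When df reports much more used space than du, that can point at space held by deleted-but-open files or at data hidden underneath a mount point.

```shell
#!/bin/sh
# Hypothetical helper (not an Ambari tool): show the used space that df and du
# report for the same mount point, both in 1K blocks.
compare_usage() {
    mount_point="${1:-/disks/disk1}"   # default is the path from this thread
    # df -P forces one line per filesystem; column 3 is the used block count.
    df_used_kb=$(df -Pk "$mount_point" | awk 'NR==2 {print $3}')
    # du -s prints one total, -x stays on this filesystem, -k reports 1K blocks.
    du_used_kb=$(du -sxk "$mount_point" 2>/dev/null | awk '{print $1}')
    echo "df_used_kb=$df_used_kb du_used_kb=$du_used_kb"
}
```

Run it as root, for example: compare_usage /disks/disk1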


Are you able to create a file there manually? (This is just to isolate the issue.)

# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.out
# echo "ABCDEFGH" > /disks/disk1/log/ambari-metrics-collector/hbase-ams-master-node4.hdp.com.log
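
If you would rather not overwrite an existing log while testing, a variation along these lines might work. It is only a sketch: can_create_file and the write-test.XXXXXX template are names I made up, and it assumes a mktemp that accepts a full path template (as GNU coreutils does). Creating a new file needs both a free inode and a free block, so a failure here reproduces the problem without touching the real logs.

```shell
#!/bin/sh
# Hypothetical helper: try to create a uniquely named scratch file in a
# directory and remove it again, so no existing file is overwritten.
can_create_file() {
    dir="${1:-/disks/disk1/log/ambari-metrics-collector}"   # path from this thread
    if probe=$(mktemp "$dir/write-test.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "ok: created and removed a file in $dir"
    else
        echo "failed: could not create a file in $dir"
        return 1
    fi
}
```

Example: can_create_file /disks/disk1/log/ambari-metrics-collector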


It is also possible that running processes are still holding deleted files open, which keeps their space allocated until those processes exit. You can find such files on the affected filesystem with the following command:

# sudo lsof /disks/disk1 | grep deleted


If you find that some processes are still holding deleted files open, restart those processes.

# sudo systemctl restart $SERVICE_NAME
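
In case lsof is not installed on the host, roughly the same information can be read from /proc. This is a sketch that assumes Linux (it relies on the kernel appending " (deleted)" to the link target of an unlinked file); list_deleted_open_files is a name I made up. It prints the PID followed by the deleted path, so you can see which process to restart.

```shell
#!/bin/sh
# Hypothetical helper, Linux only: list files that were deleted but are still
# held open, by reading the symlinks under /proc/<pid>/fd.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        # Skip entries we cannot read or that vanished in the meantime.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)")
                pid=${fd#/proc/}      # strip the leading /proc/
                pid=${pid%%/*}        # keep only the PID component
                echo "$pid $target"
                ;;
        esac
    done
}
```

Run it as root to see the file descriptors of all processes; as a normal user it only shows your own.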



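Filesystems keep per-file metadata in structures called “inodes.” On filesystems that allocate a fixed number of inodes (ext4, for example), writes can fail with "No space left on device" once the inodes run out, even though free blocks remain. Check the inode usage of the affected filesystem:

# sudo df -i /disks/disk1/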
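
If IUse% turns out to be at or near 100%, the next question is where all the files are. Here is a rough sketch for that, assuming a find that supports -xdev and -mindepth (GNU findutils does); count_inodes_by_dir is a made-up name. It counts the entries under each top-level directory of the given path and prints the ten largest. File names containing newlines would be miscounted, which is usually acceptable for a quick look.

```shell
#!/bin/sh
# Hypothetical helper: count filesystem entries (files, directories, symlinks)
# under each top-level child of a directory, largest first.
count_inodes_by_dir() {
    root="${1:-/disks/disk1}"          # default is the path from this thread
    # -xdev keeps find on one filesystem; -mindepth 1 skips the root itself.
    find "$root" -xdev -mindepth 1 2>/dev/null |
        awk -v root="${root%/}" '
            {
                rest = substr($0, length(root) + 2)   # path relative to root
                split(rest, parts, "/")
                count[parts[1]]++                     # group by first component
            }
            END { for (d in count) print count[d], d }' |
        sort -rn | head -n 10
}
```

Example: count_inodes_by_dir /disks/disk1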


Also, please check whether there are any bad blocks (the filesystem should be unmounted while fsck runs):

# sudo fsck -vcck /disks/disk1/


If none of the above helps, please check whether you can reboot the host where AMS is installed.

2 REPLIES 2

New Member

@Jay Kumar SenSharma, thanks a lot for helping me! Apparently I ran out of inodes. Not sure why it did not occur to me to check that in the first place... Anyway, reformatting the filesystem and a little bit of file shuffling did the trick 🙂