HDFS uptime and down time

How can I find the HDFS uptime and downtime for the last 6 months?

I found only the current uptime of HDFS (NameNode) in the Ambari UI. As soon as I restart the service, the uptime is calculated from that point onwards.

Can you please tell me how to find the uptime and downtime of HDFS for the last 6 months?

Thanks in advance


Re: HDFS uptime and down time

Hi @kanna k!
AFAIK, there isn't a built-in way to do this, but you have some options. Most of them involve capturing the NN state or PID on a schedule, using a scheduler such as cron or Control-M, or a more powerful tool such as Airflow or Luigi:

1 - Through the Ambari REST API:

curl -u admin -H "X-Requested-By: ambari" -X GET http://<AMBARI HOST>:8080/api/v1/clusters/<cluster name>/services/HDFS

This only returns the current state, so you'll need to run this curl on a schedule and store the results in order to catch every NN crash.
Docs: https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md

2 - Through HDFS Namenode JMX

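curl -X GET "http://NN_HOST:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus"

Again, you'll need to run this curl on a schedule so that you can collect the whole status history for the NN. (50070 is the default NameNode HTTP port in Hadoop 2; Hadoop 3 defaults to 9870.)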
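
One way to turn either curl into a history is to sample it on a schedule and append a timestamped line to a log, then add up the intervals later. The sketch below is only an illustration: the helper names extract_nn_state and summarize_uptime, the log path /var/tmp/nn_state.log and the idea of running it from cron are my own assumptions, and it assumes the JMX response carries a "State" field (for example "active" or "standby"), as the NameNodeStatus bean normally does.

```shell
#!/bin/bash
# Sketch only: helper names and the log path are hypothetical.

# Read a JMX JSON response on stdin and print the value of its "State" field.
# Prints "down" when no State field is found (e.g. the NameNode did not answer).
extract_nn_state() {
    local state
    state=$(grep -o '"State"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 | sed 's/.*"\([^"]*\)"$/\1/')
    echo "${state:-down}"
}

# Read "<epoch> <state>" samples on stdin and print total seconds up and down.
# The interval between two samples is attributed to the state of the earlier one;
# any state other than "down" (active, standby, ...) is counted as up.
summarize_uptime() {
    awk 'NR > 1 { if (prev_state == "down") down += $1 - prev_ts; else up += $1 - prev_ts }
         { prev_ts = $1; prev_state = $2 }
         END { printf "up=%d down=%d\n", up, down }'
}

# Collection step, meant to be run from a scheduler (assumed path and timeout):
# state=$(curl -s --max-time 10 "http://NN_HOST:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus" | extract_nn_state)
# echo "$(date +%s) $state" >> /var/tmp/nn_state.log
#
# Reporting step:
# summarize_uptime < /var/tmp/nn_state.log
```

The resolution of the result is limited by how often you sample: an outage shorter than the sampling interval can be missed entirely.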

3 - The old-fashioned way, by grabbing the PID after every NN restart

ps -ef | grep -i namenode | grep -v grep | awk '{print $2}'

4 - Calculating the time between each namenode.out file

After each restart, HDFS generates a new .out file, so I made a simple script that shows how many days have passed since each of these files was last modified:

#!/bin/bash
# Split the stat output on newlines only
IFS=$'\n'
# One "<file> <mtime in epoch seconds>" line per NameNode .out file
times=$(stat -c '%n %Y' /var/log/hadoop/hdfs/*namenode*.out*)
now=$(date +%s)
for fall in $times
do
	mtime=$(awk '{print $2}' <<< "$fall")
	diff_between_todayXfile=$(( now - mtime ))
	echo "File: $(awk '{print $1}' <<< "$fall" | xargs basename) days passed: $(( diff_between_todayXfile / 86400 ))"
done
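
The script above reports the age of each file. If you want the time between consecutive restarts instead, one possible approach is to sort the modification times and subtract neighbours. This is a rough sketch: the function name restart_gaps is made up for illustration, and it assumes that each .out file's modification time is a reasonable proxy for when that NameNode run started, which may not hold if the files are rotated or touched by other tools. It also assumes file names without spaces.

```shell
#!/bin/bash
# Sketch only: restart_gaps is a hypothetical helper name.
# Reads "<epoch> <filename>" lines on stdin, sorts them by epoch, and prints
# the number of seconds between each file and the one before it.
restart_gaps() {
    sort -n | awk 'NR > 1 { printf "%s -> %s: %d seconds\n", prev_name, $2, $1 - prev_ts }
                   { prev_ts = $1; prev_name = $2 }'
}

# Example usage against the same log directory as the script above:
# stat -c '%Y %n' /var/log/hadoop/hdfs/*namenode*.out* | restart_gaps
```

Note that this only covers the restarts for which an .out file is still on disk, so it will not reach back 6 months unless old files are retained.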


5 - Use the Ambari DB alerts

#First, check which DB you are using
cat /etc/ambari-server/conf/ambari.properties | grep -i jdbc

#If you are using PostgreSQL:
su - postgres
psql -U postgres -d ambari

#List all tables from ambari schema
ambari=# \dt ambari.*

#Query to show failed NN process
select 
	* 
from ambari.alert_history ah
inner join ambari.alert_definition ad on 
	ad.definition_id = ah.alert_definition_id
where 
	ah.service_name like 'HDFS' 
	and 
	ah.component_name like 'NAMENODE' 
	and 
	ad.label like 'NameNode High Availability Health' 
order by ah.alert_id desc
limit 10;

Hope this helps!