11-14-2013 08:17 AM
I have an idea: capture Hadoop logs, parse them, and store values such as CPU used per job, memory used per job, number of processes used, number of threads spawned, job run time, etc. in HDFS/HBase, and put a web interface on top. This would provide historical resource usage for the Hadoop cluster over time. Comparing the current run time of a job with past run times of the same job would give you a performance metric.
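To make the comparison step concrete, here is a rough Python sketch. It assumes the logs have already been parsed into simple records; the field names `job_name` and `runtime_seconds`, the helper names, and the sample values are all hypothetical and only for illustration. They are not taken from any real Hadoop log format or API.

```python
from collections import defaultdict
from statistics import mean


def build_history(records):
    """Group past run times by job name.

    `records` is assumed to be an iterable of dicts with the hypothetical
    keys "job_name" and "runtime_seconds" (output of some log parser).
    """
    history = defaultdict(list)
    for record in records:
        history[record["job_name"]].append(record["runtime_seconds"])
    return history


def runtime_ratio(history, job_name, current_seconds):
    """Return current run time divided by the mean past run time.

    A value above 1.0 means the current run was slower than the historical
    average. Returns None when there is no history for the job.
    """
    past = history.get(job_name, [])
    if not past:
        return None
    return current_seconds / mean(past)


# Illustrative, made-up records standing in for parsed log data.
records = [
    {"job_name": "daily_etl", "runtime_seconds": 600},
    {"job_name": "daily_etl", "runtime_seconds": 620},
    {"job_name": "daily_etl", "runtime_seconds": 580},
]

history = build_history(records)
ratio = runtime_ratio(history, "daily_etl", 900)
print(f"daily_etl ran at {ratio:.2f}x its historical average")
```

In a real setup the records would be read back from HDFS/HBase rather than held in memory, and the same grouping could be applied to CPU, memory, or thread counts.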
This is just an idea I am thinking about. Please let me know your thoughts on this. Thanks
11-14-2013 12:31 PM
11-15-2013 01:22 PM