A Hadoop cluster delivers maximum performance only when it is configured correctly, and finding the ideal configuration is not easy. A practical approach is to first run the Hadoop jobs with the default configuration to establish a baseline. The job history log files can then be analyzed to see whether any resource is a bottleneck or whether the jobs take longer to run than expected. Adjusting the configuration and repeating this process fine-tunes the cluster until it best fits the business requirements.
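The compare-against-baseline step can be sketched as follows. This is a minimal illustration, not a Hadoop API: the function name `find_regressions`, the 5% tolerance, and the job names and durations are all hypothetical, and in practice the durations would be read from the job history logs.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.05,
) -> List[str]:
    """Return the jobs that ran slower than the baseline by more than `tolerance`.

    Both dictionaries map a job name to its run time in seconds.
    Jobs missing from the candidate run are skipped.
    """
    slower = []
    for job, base_secs in baseline.items():
        cand_secs = candidate.get(job)
        if cand_secs is None:
            continue
        if cand_secs > base_secs * (1 + tolerance):
            slower.append(job)
    return sorted(slower)


# Hypothetical run times in seconds (illustrative values, not measurements).
baseline_run = {"wordcount": 120.0, "sort": 300.0, "join": 450.0}
tuned_run = {"wordcount": 100.0, "sort": 330.0, "join": 455.0}

print(find_regressions(baseline_run, tuned_run))
```

A job flagged here suggests that the configuration change should be revisited before the next tuning iteration.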
In HDFS, all blocks of a file are the same size except the last block, which can be smaller.
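The arithmetic behind this block layout can be sketched as below. The helper `split_into_blocks` is a hypothetical name used only for illustration, and the 128 MB block size is an assumption (it is the default `dfs.blocksize` in recent Hadoop releases, but a cluster may be configured differently).

```python
from typing import List

MB = 1024 * 1024


def split_into_blocks(file_size: int, block_size: int = 128 * MB) -> List[int]:
    """Return the size in bytes of each block a file of `file_size` bytes occupies.

    Every block is `block_size` bytes except possibly the last one,
    which holds whatever remains.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        sizes.append(remainder)
    return sizes


# A 300 MB file with 128 MB blocks: two full blocks plus a 44 MB last block.
print([size // MB for size in split_into_blocks(300 * MB)])
```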