What are the requirements for passwordless SSH during the installation of Hadoop?
Passwordless SSH is not a requirement for setting up a Hadoop cluster.
Even when you use Ambari to manage your cluster, passwordless SSH is only needed if you want the Ambari server to install and set up the Ambari Agents on the remote hosts for you.
An Ambari-based Hadoop cluster can also be set up without passwordless SSH by registering the Agents manually, as described in the following doc: https://docs.hortonworks.com/HDPDocuments/Ambari-126.96.36.199/administering-ambari/content/amb_installing...
However, if you want to know how to set up passwordless SSH between two or more hosts, please refer to the following doc: https://docs.hortonworks.com/HDPDocuments/Ambari-188.8.131.52/bk_ambari-installation-ppc/content/set_up_p...
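As a rough illustration of what such a setup usually involves, here is a minimal Python sketch that only builds (and prints) the standard OpenSSH commands: `ssh-keygen` to create a key pair with an empty passphrase, then `ssh-copy-id` to install the public key on each host. The host names, the user and the function name `passwordless_ssh_commands` are made up for the example and are not taken from the linked doc.

```python
import shlex
from pathlib import Path


def passwordless_ssh_commands(hosts, user, key_path="~/.ssh/id_rsa"):
    """Return the commands that would set up key-based SSH login to each host.

    Nothing is executed here; the commands are returned as argument lists
    so they can be reviewed before being run by hand.
    """
    key = str(Path(key_path).expanduser())
    commands = [
        # Create an RSA key pair at `key` with an empty passphrase (-N "").
        ["ssh-keygen", "-t", "rsa", "-b", "4096", "-N", "", "-f", key],
    ]
    for host in hosts:
        # ssh-copy-id appends the public key to the remote authorized_keys file.
        commands.append(["ssh-copy-id", "-i", key + ".pub", f"{user}@{host}"])
    return commands


if __name__ == "__main__":
    # Hypothetical hosts and user, for illustration only.
    example_hosts = ["node1.example.com", "node2.example.com"]
    for cmd in passwordless_ssh_commands(example_hosts, "hadoop"):
        print(shlex.join(cmd))
```

Running `ssh-copy-id` still asks for the remote password once per host; after that, logins with this key should no longer prompt.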
Without passwordless SSH, a password has to be entered every time one node opens an SSH session to another. In Hadoop this happens when the cluster helper scripts (for example start-dfs.sh and start-yarn.sh) run on the master: they log in over SSH to each host listed in the workers (formerly slaves) file and start the daemons there. Hadoop is a fully distributed technology that spreads data across many commodity machines, so typing a password for every node on every start or stop quickly becomes impractical.
Hadoop follows a master-slave architecture. When a client needs to store or read data in HDFS, it sends the request to the master node (the NameNode), which tells it which slave nodes (DataNodes) hold or should receive the blocks. This traffic uses Hadoop's own RPC and data-transfer protocols, not SSH, so the master does not log in to the slaves for each client request, and passwordless SSH has no effect on data-processing speed.
That is why passwordless SSH is usually set up for Hadoop: it lets the master start and stop the daemons on all slaves without being prompted for credentials. It is a convenience for cluster administration, not something the master needs in order to fetch or store data.
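To check whether the master can already reach every slave without a password prompt, something like the following Python sketch could be used. It assumes a plain-text workers file with one host per line and `#` comments; the function names are hypothetical. It relies on the OpenSSH option `BatchMode=yes`, which makes `ssh` fail instead of asking for a password.

```python
import subprocess


def read_workers(text):
    """Return host names from workers-file content, skipping blanks and # comments."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line)
    return hosts


def ssh_check_command(host, timeout=5):
    """Build an ssh command that runs `true` remotely and never prompts."""
    return [
        "ssh",
        "-o", "BatchMode=yes",              # fail instead of asking for a password
        "-o", f"ConnectTimeout={timeout}",  # give up quickly on unreachable hosts
        host,
        "true",
    ]


def can_login_without_password(host):
    """Return True if the ssh check command exits with status 0."""
    result = subprocess.run(ssh_check_command(host), capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical workers-file content, for illustration only.
    sample = "node1.example.com\n# spare node\nnode2.example.com\n"
    for host in read_workers(sample):
        status = "ok" if can_login_without_password(host) else "needs password or unreachable"
        print(f"{host}: {status}")
```

A host reported as not ok would either prompt for a password during a scripted start or is not reachable at all.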