
Inter process communication between Hadoop nodes

Expert Contributor

Hi,

I would like to understand how inter-process communication between Hadoop nodes happens. I know that it uses remote procedure calls (RPC), but does it need passwordless SSH? If passwordless SSH is not required, how does RPC work, especially when data is written to one DataNode and that DataNode in turn writes the same data (replication) to another DataNode? And how exactly does this work when passwordless SSH is set up?

Thanks for your answers.

1 ACCEPTED SOLUTION

Super Guru

@PJ Hadoop relies heavily on being able to perform forward and reverse lookups of each node's hostname. For inter-node communication it uses TCP/IP, with the RPC layer sitting on top of it. More here: https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html#The+Communication+Protocols

Therefore, passwordless SSH is not required between nodes.
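
If you want to sanity-check the forward and reverse lookups mentioned above, something like the following Python sketch may help. It is not part of Hadoop; the function name `check_hostname_resolution` and the example hostname are made up for illustration, and it assumes each node should resolve to one IP whose reverse lookup returns the same name.

```python
import socket


def check_hostname_resolution(hostname,
                              forward=socket.gethostbyname,
                              reverse=socket.gethostbyaddr):
    """Return (ip, reverse_name, consistent) for a hostname.

    The forward/reverse parameters default to the standard resolver calls
    and can be swapped out, for example to test without DNS access.
    """
    ip = forward(hostname)            # forward lookup: name -> IP address
    reverse_name = reverse(ip)[0]     # reverse lookup: IP address -> primary name
    # Compare case-insensitively and ignore a trailing dot on either name.
    consistent = (reverse_name.lower().rstrip(".")
                  == hostname.lower().rstrip("."))
    return ip, reverse_name, consistent
```

You could run `check_hostname_resolution("dn1.example.com")` on each node for every other node's name (the hostname here is a placeholder). A lookup that raises `OSError`, or a result whose last element is `False`, would point to a DNS or /etc/hosts problem rather than anything related to SSH.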
