History Server doesn't start after cluster installation

Dear All,

I have set up a new HDP cluster on version 2.6.2.0, and a few services fail to start with the errors below. This is a fresh installation.

History Server Error

"

raise WebHDFSCallException(err_msg, result_dict)
resource_management.libraries.providers.hdfs_resource.WebHDFSCallException: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --data-binary @/usr/hdp/2.6.2.0-205/hadoop/mapreduce.tar.gz -H 'Content-Type: application/octet-stream' 'http://ip-172-29-1-250.ap-southeast-1.compute.internal:50070/webhdfs/v1/hdp/apps/2.6.2.0-205/mapreduce/mapreduce.tar.gz?op=CREATE&user.name=hdfs&overwrite=True&permission=444'' returned status_code=403.
{
"RemoteException": {
"exception": "IOException",
"javaClassName": "java.io.IOException",
"message": "Failed to find datanode, suggest to check cluster health. excludeDatanodes=null"
}
}

"

NOTE: The DataNode services are started and running fine, the /etc/hosts files are correct, and hostname -f resolves to the correct name.

I also ran the HDFS service check, and it ended with the same error.

"

resource_management.libraries.providers.hdfs_resource.WebHDFSCallException: Execution of 'curl -sS -L -w '%{http_code}' -X PUT --data-binary @/etc/passwd -H 'Content-Type: application/octet-stream' 'http://ip-172-29-1-250.ap-southeast-1.compute.internal:50070/webhdfs/v1/tmp/id1dacfa01_date571418?op=CREATE&user.name=hdfs&overwrite=True'' returned status_code=403.
{
"RemoteException": {
"exception": "IOException",
"javaClassName": "java.io.IOException",
"message": "Failed to find datanode, suggest to check cluster health. excludeDatanodes=null"
}
}

"

The Ambari Metrics Collector and the ResourceManager start, but then go down at random after a few minutes.

Appreciate your help.


1 ACCEPTED SOLUTION

Hi All,

I was able to fix the issue. The AWS security group has to allow ports 0-65535 between the cluster nodes so that they can communicate with each other. This solved my problem. Thanks.
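
For anyone hitting the same error, a quick way to see whether a security group is blocking traffic between nodes is to test the DataNode ports from the NameNode host. The following is only a rough sketch, not part of the original fix: the port numbers are the usual HDP 2.x defaults (an assumption; check dfs.datanode.address, dfs.datanode.ipc.address and dfs.datanode.http.address in your hdfs-site.xml), and the helper names port_is_open and check_ports are made up for illustration.

```python
import socket
import sys

# Assumed HDP 2.x default DataNode ports; verify against hdfs-site.xml.
DATANODE_PORTS = {
    50010: "data transfer (dfs.datanode.address)",
    50020: "IPC (dfs.datanode.ipc.address)",
    50075: "HTTP / WebHDFS (dfs.datanode.http.address)",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Return a dict mapping each port to True (reachable) or False."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # Usage: python check_ports.py <datanode-host> [<datanode-host> ...]
    for host in sys.argv[1:]:
        for port, ok in check_ports(host, DATANODE_PORTS).items():
            state = "open" if ok else "BLOCKED or closed"
            print(f"{host}:{port} {DATANODE_PORTS[port]}: {state}")
```

Run it on the NameNode host with the DataNode hostnames as arguments. If a port shows as blocked while the DataNode process is running, the security group (or a host firewall) is a likely cause. A timeout usually points to dropped packets, which is how security groups behave, while an immediate refusal usually means nothing is listening on that port.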
