
Datanodes start and then stop ... only one datanode process working at a time

Contributor

hi

The environment details:

I installed a Hadoop 2.7.2 (not Hortonworks, but pure Apache Hadoop) multi-node cluster on AWS (1 NameNode / 1 Secondary NameNode / 3 DataNodes - Ubuntu 14.04).

The cluster was based on the following tutorial (http://mfaizmzaki.com/2015/12/17/how-to-install-hadoop-2-7-1-multi-node-cluster-on-amazon-aws-ec2-instance-improved-part-1/), which means the first install (the master) is cloned to the other machines and then tuned.

The Issue:

Each of the 3 DataNodes works correctly on its own if I configure the cluster with only 1 DataNode (specifically excluding the other 2).

As soon as I add another DataNode, the DataNode that boots first logs a FATAL error (see the extract of the log file below and the snapshot of the VERSION file) and stops. The DataNode that boots second then works fine...

  1. Any ideas or recommendations?
  2. Am I doing something wrong by cloning the master's AMI onto the other machines?

Thanks Folks!

Log File

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Unsuccessfully sent block report 0x1858458671b, containing 1 storage report(s), of which we sent 0. The reports had 0 total blocks and used 0 RPC(s). This took 5 msec to generate and 35 msecs for RPC and NN processing. Got back no commands.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000 is shutting down org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Data node DatanodeRegistration(172.X.X.X:50010, datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0;nsid=278157295;c=0) is attempting to report storage ID 54bc8b80-b84f-4893-8b96-36568acc5d4b. Node 172.Z.Z.Z:50010 is expected to serve this storage.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1251070591-172.31.34.94-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b)

INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1251070591-172.31.34.94-1454167071207

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at HNDATA2/172.X.X.x
************************************************************/

1 ACCEPTED SOLUTION

Master Mentor
@luc tiber

1) Please use Ambari.

2) If you don't want to use Ambari, then:

Don't clone the machine. Launch a new machine, install the binaries, and make it part of the cluster. If you are cloning, did you delete all the metadata and data from the Hadoop directories?


13 REPLIES

Master Mentor
@luc tiber

1) Please use Ambari.

2) If you don't want to use Ambari, then:

Don't clone the machine. Launch a new machine, install the binaries, and make it part of the cluster. If you are cloning, did you delete all the metadata and data from the Hadoop directories?
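If you do keep the clones, a minimal cleanup sketch for the failing DataNode (the data directory path below is an assumption - check the dfs.datanode.data.dir property in your own hdfs-site.xml, since the tutorial's layout may differ):

# run on the cloned DataNode that fails to register
$HADOOP_HOME/sbin/hadoop-daemon.sh stop datanode
rm -rf /usr/local/hadoop/hdfs/datanode/*     # path is an assumption; this removes the cloned VERSION file (and any blocks)
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

On the next start the DataNode should generate a fresh datanodeUuid and register as a new node; the NameNode does not need to be reformatted for this.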

Master Mentor

@luc tiber See this

DatanodeRegistration(172.X.X.X:50010, datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0;nsid=278157295;c=0) is attempting to report storage ID 54bc8b80-b84f-4893-8b96-36568acc5d4b. Node 172.Z.Z.Z:50010 is expected to serve this storage.
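That is the key line: the NameNode already has a node registered under that storage ID, so it rejects the second one. The ID comes from the VERSION file in the DataNode's data directory, which was copied along with the AMI. A quick way to confirm (the path below is an assumption - use whatever dfs.datanode.data.dir points to in your hdfs-site.xml):

# run on each DataNode; the path is an assumption, adjust to your dfs.datanode.data.dir
cat /usr/local/hadoop/hdfs/datanode/current/VERSION
# the values of interest match what the log prints, e.g.:
#   datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b
#   clusterID=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0
#   layoutVersion=-56
# if two machines print the same datanodeUuid, they are clones of the same install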

Contributor
@Neeraj Sabharwal

Hi

At this stage I want to go the hard way... no Ambari nor Hortonworks (although that should be my next venture).

I want to learn bottom up 😉

A few clarifications please:

  1. I will launch new instances for my DataNodes, but:
    1. Why is the Secondary NameNode working fine?
    2. What is the issue that makes the DataNode crash in flames? I see in your addendum that the datanodeUuid has something to do with it, but when is this ID created?
  2. I did not delete any of the metadata or data from the Hadoop directories. Should I try? If yes, I guess it is on the NameNode and then I should format it again. Right?

Thanks!

Master Mentor

@luc tiber Bottom-up approach :D... I agree, as I learned that way too.

If you want to add a node then you have to do it clean. Hadoop will replicate the data once you add the node.

See this http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-4303e343-9ae...
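A rough sketch of the clean route (the hostname hndata4, the /usr/local/hadoop path, and the file layout are placeholders in the style of the tutorial - adjust to your own setup):

# on the new instance: install the same Hadoop 2.7.2 build, then pull the config from the master
scp master:/usr/local/hadoop/etc/hadoop/* /usr/local/hadoop/etc/hadoop/

# on the master: add the new hostname (placeholder) to the slaves file so the start scripts know about it
echo "hndata4" >> /usr/local/hadoop/etc/hadoop/slaves

# on the new node: start the DataNode; it registers with the NameNode under a fresh datanodeUuid
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode

# on the master: check that the new node appears and that blocks start replicating
hdfs dfsadmin -report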


@luc tiber Is there a pressing reason you are not using Hortonworks HDP? We're 100% open source and we provide a number of good installation guides: http://docs.hortonworks.com/HDPDocuments/Ambari/Ambari-2.2.0.0/index.html. The last time I installed directly from Apache was in 2012 and it was painful.

Master Mentor

Amen to that, @Scott Shaw. Try Hadoop and then HBase manually installed on top, and then try the same on Isilon without Ambari - you're in for a lot of pain. I would know 🙂

Contributor

No reason except the one mentioned earlier. Probably some kind of masochism... I want to hit the wall at every step and climb over it. Once I get this experience I will play with a commercial distribution...

Contributor

I learn from my mistakes ;-))))


@luc tiber I will say you'll learn a lot! Also, at the end of it you'll better appreciate the ease of use provided by HDP. 🙂