
Datanodes start and then stop ... only one datanode process working at a time

avatar
Contributor

hi

The environment details:

I installed a Hadoop 2.7.2 multi-node cluster (plain Apache Hadoop, not Hortonworks) on AWS: 1 NameNode, 1 Secondary NameNode, and 3 DataNodes, all on Ubuntu 14.04.

The cluster was set up following this tutorial (http://mfaizmzaki.com/2015/12/17/how-to-install-hadoop-2-7-1-multi-node-cluster-on-amazon-aws-ec2-instance-improved-part-1/), which means the first install (the master) is cloned onto the other machines and then tuned.

The Issue:

Each of the 3 DataNodes works correctly on its own when I configure the cluster with a single DataNode (explicitly excluding the other two).

As soon as I add another DataNode, the DataNode that boots first logs a FATAL error (see the extract of the log file below and the snapshot of the VERSION file) and stops. The DataNode that boots second then works fine...

  1. Any ideas or recommendations?
  2. Am I doing something wrong by cloning the master's AMI onto the other machines?

Thanks, folks!

Log File

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Unsuccessfully sent block report 0x1858458671b, containing 1 storage report(s), of which we sent 0. The reports had 0 total blocks and used 0 RPC(s). This took 5 msec to generate and 35 msecs for RPC and NN processing. Got back no commands.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000 is shutting down org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.UnregisteredNodeException): Data node DatanodeRegistration(172.X.X.X:50010, datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0;nsid=278157295;c=0) is attempting to report storage ID 54bc8b80-b84f-4893-8b96-36568acc5d4b. Node 172.Z.Z.Z:50010 is expected to serve this storage.

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool BP-1251070591-172.31.34.94-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b) service to master/172.Y.Y.Y:9000

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool BP-1251070591-172.Y.Y.Y-1454167071207 (Datanode Uuid 54bc8b80-b84f-4893-8b96-36568acc5d4b)

INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Removing block pool BP-1251070591-172.31.34.94-1454167071207

WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode

INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0

INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at HNDATA2/172.X.X.x
************************************************************/

1 ACCEPTED SOLUTION

avatar
Master Mentor
@luc tiber

1) Please use Ambari.

2) If you don't want to use Ambari, then:

Don't clone the machine. Launch a new machine, install the binaries, and make it part of the cluster. If you are cloning, did you delete all the metadata and data from the Hadoop directories?


13 REPLIES

avatar
Master Mentor
@luc tiber

1) Please use Ambari.

2) If you don't want to use Ambari, then:

Don't clone the machine. Launch a new machine, install the binaries, and make it part of the cluster. If you are cloning, did you delete all the metadata and data from the Hadoop directories?
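For anyone following along, here is a minimal sketch of what "cleaning" a cloned DataNode could look like. The directory path is an assumption (use whatever dfs.datanode.data.dir points to in your hdfs-site.xml), and it should be run on the cloned node while its DataNode process is stopped. Only the DataNode's local storage is cleared; the NameNode does not need to be reformatted for this.

#!/usr/bin/env python3
# Hypothetical helper: empty the local DataNode storage directories on a
# cloned node so the next start generates a fresh datanodeUuid instead of
# reusing the one copied from the master image.
import os
import shutil

# Assumed value of dfs.datanode.data.dir on this node -- adjust to your config.
DATA_DIRS = ["/usr/local/hadoop/hadoop_data/hdfs/datanode"]

for data_dir in DATA_DIRS:
    if not os.path.isdir(data_dir):
        print("skipping missing dir: %s" % data_dir)
        continue
    for entry in os.listdir(data_dir):
        path = os.path.join(data_dir, entry)
        # Removes the 'current' subtree, which holds the VERSION file
        # (datanodeUuid) and any block data copied over from the clone.
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    print("cleared %s" % data_dir)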

avatar
Master Mentor

@luc tiber See this

DatanodeRegistration(172.X.X.X:50010, datanodeUuid=54bc8b80-b84f-4893-8b96-36568acc5d4b, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-56;cid=CID-8e09ff25-80fb-4834-878b-f23b3deb62d0;nsid=278157295;c=0) is attempting to report storage ID 54bc8b80-b84f-4893-8b96-36568acc5d4b. Node 172.Z.Z.Z:50010 is expected to serve this storage.
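In other words, the NameNode is seeing the same datanodeUuid reported from two different machines, which is what happens when the DataNode's storage directory is cloned along with the rest of the image. As a hedged sketch (the path is an assumption and should match dfs.datanode.data.dir in hdfs-site.xml), you could print the UUID each node has recorded in its VERSION file and compare the values across the three DataNodes:

#!/usr/bin/env python3
# Hypothetical check: print the datanodeUuid a DataNode has recorded locally.
# Run it on each DataNode; if two nodes print the same value, they are clones
# of the same storage and the NameNode will only accept one of them at a time.
import sys

# Assumed location: <dfs.datanode.data.dir>/current/VERSION -- adjust the
# first part to your own hdfs-site.xml setting.
DEFAULT_VERSION = "/usr/local/hadoop/hadoop_data/hdfs/datanode/current/VERSION"

def read_properties(path):
    props = {}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, value = line.split("=", 1)
                props[key] = value
    return props

if __name__ == "__main__":
    path = sys.argv[1] if len(sys.argv) > 1 else DEFAULT_VERSION
    props = read_properties(path)
    print("datanodeUuid = %s" % props.get("datanodeUuid"))
    print("clusterID    = %s" % props.get("clusterID"))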

avatar
Contributor
@Neeraj Sabharwal

Hi

At this stage I want to do it the hard way... no Ambari, no Hortonworks (although that should be my next venture).

I want to learn bottom up 😉

A few clarifications, please:

  1. I will launch new instances for my DataNodes, but:
    1. Why is the Secondary NameNode working fine?
    2. What is the issue that makes the DataNode crash in flames? I see in your addendum that the datanodeUuid has something to do with it, but when is this ID created?
  2. I did not delete any of the metadata or data from the Hadoop directories. Should I try? If yes, I guess it is on the NameNode, and then I should format it again. Right?

Thanks!

avatar
Master Mentor

@luc tiber Bottom-up approach :D... I agree, as I learned that way too.

If you want to add a node, you have to start it clean. Hadoop will replicate the data once you add the node.

See this http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_Sys_Admin_Guides/content/ref-4303e343-9ae...
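A small follow-up sketch for checking that a freshly added node actually registered (a hypothetical helper, not from the thread): it just shells out to the standard hdfs dfsadmin -report command, so it assumes the hdfs binary from the 2.7.x install is on the PATH of the machine it runs on (e.g. the master).

#!/usr/bin/env python3
# Hypothetical sanity check after adding a DataNode: ask the NameNode for its
# datanode report and show the "Live datanodes" summary line.
import subprocess

report = subprocess.check_output(["hdfs", "dfsadmin", "-report"]).decode()
for line in report.splitlines():
    if line.startswith("Live datanodes"):
        print(line)  # e.g. "Live datanodes (3):"
        break
else:
    print("No 'Live datanodes' line found -- check that HDFS is running.")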

avatar

@luc tiber Is there a pressing reason you are not using Hortonworks HDP? We're 100% open source and we provide a number of good installation guides: http://docs.hortonworks.com/HDPDocuments/Ambari/Ambari-2.2.0.0/index.html. The last time I installed directly from Apache was in 2012 and it was painful.

avatar
Master Mentor

Amen to that, @Scott Shaw. Try Hadoop and then HBase installed manually on top, and then try the same on Isilon without Ambari, and you're in a lot of pain. I would know 🙂

avatar
Contributor

No reason except the one mentioned earlier. Probably some kind of masochism... I want to hit the wall at every step and climb over it. Once I have that experience, I will play with a commercial distribution...

avatar
Contributor

I learn from my mistakes ;-))))

avatar

@luc tiber I will say you'll learn a lot! Also, at the end of it you'll better appreciate the ease of use provided by HDP. 🙂