Member since
02-10-2017
02-12-2017
03:40 PM
Jay SenSharma also said it could be a problem with the NameNode, so I posted the namenode.log, which says: "RetriableException: NameNode still not started"
https://community.hortonworks.com/comments/83058/view.html
02-12-2017
03:38 PM
Ambari says the NameNode is running and healthy. So I restarted the NameNode and watched the log. There is a "RetriableException: NameNode still not started":
2017-02-12 16:24:58,249 INFO namenode.NameNode (LogAdapter.java:info(47)) - STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: user = hdfs
STARTUP_MSG: host = node0/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 2.7.3.2.5.3.0-37
STARTUP_MSG: classpath = ...
STARTUP_MSG: build = git@github.com:hortonworks/hadoop.git -r 9828acfdec41a121f0121f556b09e2d112259e92; compiled by 'jenkins' on 2016-11-29T18:37Z
STARTUP_MSG: java = 1.8.0_77
************************************************************/
2017-02-12 16:24:58,268 INFO namenode.NameNode (LogAdapter.java:info(47)) - registered UNIX signal handlers for [TERM, HUP, INT]
2017-02-12 16:24:58,271 INFO namenode.NameNode (NameNode.java:createNameNode(1600)) - createNameNode []
2017-02-12 16:24:58,709 INFO impl.MetricsConfig (MetricsConfig.java:loadFirst(112)) - loaded properties from hadoop-metrics2.properties
...
2017-02-12 16:25:01,416 INFO ipc.Server (Server.java:run(1045)) - IPC Server Responder: starting
2017-02-12 16:25:01,417 INFO ipc.Server (Server.java:run(881)) - IPC Server listener on 8020: starting
2017-02-12 16:25:01,429 INFO namenode.NameNode (NameNode.java:startCommonServices(876)) - NameNode RPC up at: node0/127.0.1.1:8020
2017-02-12 16:25:01,430 INFO namenode.FSNamesystem (FSNamesystem.java:startActiveServices(1130)) - Starting services required for active state
2017-02-12 16:25:01,436 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(161)) - Starting CacheReplicationMonitor with interval 30000 milliseconds
2017-02-12 16:25:03,040 INFO ipc.Server (Server.java:logException(2401)) - IPC Server handler 0 on 8020, call org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.sendHeartbeat from 127.0.0.1:42972 Call#687 Retry#0
org.apache.hadoop.ipc.RetriableException: NameNode still not started
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.checkNNStartup(NameNodeRpcServer.java:2057)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendHeartbeat(NameNodeRpcServer.java:1414)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.sendHeartbeat(DatanodeProtocolServerSideTranslatorPB.java:118)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:29064)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)
2017-02-12 16:25:03,080 INFO fs.TrashPolicyDefault (TrashPolicyDefault.java:<init>(224)) - The configured checkpoint interval is 0 minutes. Using an interval of 360 minutes that is used for deletion instead
...
I cannot post the complete log here, so you can find it at this link: complete log
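One detail in the log excerpt above that may be worth checking: the startup banner reports `host = node0/127.0.1.1`, and the RPC server comes up at `node0/127.0.1.1:8020`. That suggests node0 resolves its own hostname to a loopback address. If the NameNode services are bound to that address, connections from other machines would be refused even though ping works. This is only a possible explanation and is not confirmed anywhere in this thread. The sketch below, with hypothetical helper names `startup_host` and `is_loopback`, pulls the host line out of a NameNode log and flags a loopback address:

```python
import ipaddress
import re

# Matches the banner line "STARTUP_MSG: host = <hostname>/<ip>".
HOST_LINE = re.compile(r"STARTUP_MSG:\s+host\s*=\s*(\S+)/(\S+)")


def startup_host(log_text):
    """Return (hostname, ip) from the STARTUP_MSG host line, or None if absent."""
    match = HOST_LINE.search(log_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_loopback(ip):
    """Return True if ip lies in a loopback range (for IPv4, 127.0.0.0/8)."""
    return ipaddress.ip_address(ip).is_loopback
```

For the banner in this post, `startup_host` returns `("node0", "127.0.1.1")` and `is_loopback("127.0.1.1")` is true. If that matches what the machine reports, the hostname mapping on node0 (for example in /etc/hosts) would be the first place to look.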
02-10-2017
11:12 AM
I installed Hadoop (HDP 2.5.3) on 4 VMs with Ambari (1 Ambari Server and 3 Ambari Clients, with the DNS entries server, node0, node1, node2) with HDFS, YARN, MapReduce and ZooKeeper. However, YARN won't start. When starting the ResourceManager on node1 I get the following error:
resource_management.core.exceptions.ExecutionFailed: Execution of 'curl -sS -L -w '%{http_code}' -X GET 'http://node0:50070/webhdfs/v1/ats/done/?op=GETFILESTATUS&user.name=hdfs' 1>/tmp/tmpgsiRLj 2>/tmp/tmpMENUFa' returned 7. curl: (7) Failed to connect to node0 port 50070: connection refused 000
The App Timeline Server and History Server on node1 won't start either. ZooKeeper, NameNode, DataNode and NodeManager on node0 are up. The nodes can reach each other (tried with ping, tested via IP and via DNS names), so that shouldn't be the problem. Hopefully someone can help me. I'm really new to this topic and not really familiar with the system.
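A successful ping only shows that the hosts can reach each other at the IP level. It does not show that anything is accepting TCP connections on port 50070, which is what the curl error is about. As a rough sketch, the helper below (the name `port_is_open` is made up for illustration) tries a plain TCP connection to a given host and port:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

Running `port_is_open("node0", 50070)` on node1 and again on node0 itself would show whether the port is refused only from remote machines or everywhere.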