This NodeManager is not connected to its ResourceManager.

Explorer

None of the NodeManagers in my Hadoop cluster is connected to its ResourceManager.

These are the errors I can see in the YARN logs:

Thread Thread[Timer-2,5,main] threw an Exception.
java.lang.IllegalArgumentException: Wrong FS: hdfs://nameservice1:8020/user/history/done_intermediate/hive/job_1557996286771_33621_conf.xml, expected: hdfs://nameservice1
	at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:662)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:222)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:114)
	at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1266)
	at org.apache.hadoop.hdfs.DistributedFileSystem$20.doCall(DistributedFileSystem.java:1262)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1262)
	at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1418)
	at org.apache.hadoop.fs.FileUtil.checkDest(FileUtil.java:499)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:351)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:341)
	at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
	at org.apache.hadoop.mapreduce.v2.hs.KilledHistoryService$FlagFileHandler.copy(KilledHistoryService.java:210)
	at org.apache.hadoop.mapreduce.v2.hs.KilledHistoryService$FlagFileHandler.access$300(KilledHistoryService.java:85)
	at org.apache.hadoop.mapreduce.v2.hs.KilledHistoryService$FlagFileHandler$1.run(KilledHistoryService.java:138)
	at org.apache.hadoop.mapreduce.v2.hs.KilledHistoryService$FlagFileHandler$1.run(KilledHistoryService.java:125)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
	at org.apache.hadoop.mapreduce.v2.hs.KilledHistoryService$FlagFileHandler.run(KilledHistoryService.java:125)
	at java.util.TimerThread.mainLoop(Timer.java:555)
	at java.util.TimerThread.run(Timer.java:505)
host4	ERROR	October 15, 2019 11:40 PM	NodeManager	
RECEIVED SIGNAL 15: SIGTERM
master3	ERROR	October 15, 2019 11:40 PM	JobHistoryServer	
RECEIVED SIGNAL 15: SIGTERM
master1	ERROR	October 15, 2019 11:40 PM	ResourceManager	
RECEIVED SIGNAL 15: SIGTERM
host3	ERROR	October 15, 2019 11:40 PM	NodeManager	
RECEIVED SIGNAL 15: SIGTERM
master1	ERROR	October 15, 2019 11:40 PM	AbstractDelegationTokenSecretManager	
ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
master1	ERROR	October 15, 2019 11:40 PM	AbstractDelegationTokenSecretManager	
ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
master1	ERROR	October 15, 2019 11:40 PM	AbstractDelegationTokenSecretManager	
ExpiredTokenRemover received java.lang.InterruptedException: sleep interrupted
host2	ERROR	October 15, 2019 11:40 PM	NodeManager	
RECEIVED SIGNAL 15: SIGTERM

 

Any help, please?

1 REPLY

Cloudera Employee

Hi,

From the YARN logs we can see a lot of SIGTERM errors. Did you check the memory available to the YARN job? Also, how did you find out that the NodeManager is not connected to the ResourceManager? Could you share more information?

Please also share the NodeManager and ResourceManager logs so that we can dig further into this issue.
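
While you gather those logs, one note on the first exception in your post: the "Wrong FS" message means that the path being checked carries a different authority (hdfs://nameservice1:8020) than the filesystem expects (hdfs://nameservice1). The snippet below is only a simplified Python sketch of that kind of scheme and authority comparison, not Hadoop's actual FileSystem.checkPath code. The function name authority_matches is hypothetical, and the two URIs are copied from your stack trace.

```python
from urllib.parse import urlsplit


def authority_matches(path_uri: str, fs_uri: str) -> bool:
    """Return True when both URIs share the same scheme and authority.

    Simplified illustration only: the real Hadoop check has extra rules
    (for example around default ports) that are not modeled here.
    """
    path_parts = urlsplit(path_uri)
    fs_parts = urlsplit(fs_uri)
    return (
        path_parts.scheme.lower() == fs_parts.scheme.lower()
        and path_parts.netloc.lower() == fs_parts.netloc.lower()
    )


# URIs taken from the "Wrong FS" message in the stack trace.
path_from_trace = (
    "hdfs://nameservice1:8020/user/history/done_intermediate/hive/"
    "job_1557996286771_33621_conf.xml"
)
expected_fs = "hdfs://nameservice1"

# The path carries an explicit port, the expected filesystem URI does not.
print(urlsplit(path_from_trace).netloc, "vs", urlsplit(expected_fs).netloc)
print("authorities match:", authority_matches(path_from_trace, expected_fs))
```

If this is indeed what is happening, it may be worth checking whether a configured history path (possibly mapreduce.jobhistory.intermediate-done-dir, though that is an assumption on my side) or fs.defaultFS includes the :8020 port while the cluster uses the logical nameservice name. It may well be unrelated to the NodeManager connection problem, so the logs are still needed.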

 

Thanks

AKR