Member since 05-06-2015
5 Posts, 0 Kudos Received, 0 Solutions
02-12-2018
11:04 PM
It is CDH 5.11.2. I have nearly 2 GB of rolled-up logs and not a single FATAL message in them. Is there a way to force these messages to be logged? The only way I can tell that the NameNode has crashed is that service hadoop-hdfs-namenode status reports FAILED and I have to restart the NameNode manually, after which it works as if nothing was wrong.
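Since there are no FATAL lines to search for, one way to narrow things down is to look at the last timestamped lines written before each restart. The sketch below counts log levels and collects the lines that precede each STARTUP_MSG banner, which the NameNode prints when it starts. It assumes the log4j layout shown in this thread (timestamp, level, class, message); the function name and the context size are illustrative, and the log file paths are whatever you pass on the command line.

```python
import re
import sys
from collections import Counter

# Assumed layout, as in the log lines quoted in this thread:
# "2018-02-13 03:16:50,843 INFO some.Class: message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) ")


def summarize(lines, context=3):
    """Count log levels and keep the last `context` timestamped lines
    seen before each STARTUP_MSG banner (one entry per restart)."""
    levels = Counter()
    tails = []   # one list of lines per restart found
    recent = []  # most recent timestamped lines since the last banner
    for line in lines:
        line = line.rstrip("\n")
        m = LINE_RE.match(line)
        if m:
            levels[m.group(2)] += 1
        if "STARTUP_MSG" in line:
            # A banner spans several lines; only the first one finds a
            # non-empty window, the rest see an empty one.
            if recent:
                tails.append(recent)
            recent = []
        elif m:
            recent = (recent + [line])[-context:]
    return levels, tails


if __name__ == "__main__":
    # Usage (paths are hypothetical): python last_lines.py <namenode log files>
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            levels, tails = summarize(f)
        print(path, dict(levels))
        for tail in tails:
            print("--- last lines before a restart ---")
            print("\n".join(tail))
```

If the lines before every restart look like ordinary INFO traffic, that would at least suggest the process stopped without getting a chance to log the reason; it does not by itself say why.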
02-12-2018
09:53 PM
We have a cluster NOT managed by Cloudera Manager (I wish I could change that, but that is a different problem). HDFS has one nameservice in HA. Both NameNodes crash periodically after a long run (80K+) of messages like these:
2018-02-13 03:16:44,732 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of postponedMisreplicatedBlocks completed in 47 msecs. 8749611 blocks are left. 0 blocks are removed.
2018-02-13 03:16:47,787 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of postponedMisreplicatedBlocks completed in 54 msecs. 8749611 blocks are left. 0 blocks are removed.
2018-02-13 03:16:50,843 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Rescan of postponedMisreplicatedBlocks completed in 56 msecs. 8749610 blocks are left. 1 blocks are removed.
Then it logs a thread dump:
2018-02-13 03:16:51,512 INFO org.apache.hadoop.http.HttpServer2: Process Thread Dump: jsp requested
232 active threads
Thread 1143 (802274748@qtp-1636050357-2):
State: RUNNABLE
Blocked count: 10
Waited count: 10
Stack:
sun.management.ThreadImpl.getThreadInfo1(Native Method)
sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:178)
sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:139)
org.apache.hadoop.util.ReflectionUtils.printThreadInfo(ReflectionUtils.java:165)
org.apache.hadoop.util.ReflectionUtils.logThreadInfo(ReflectionUtils.java:219)
org.apache.hadoop.http.HttpServer2$StackServlet.doGet(HttpServer2.java:1164)
javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
org.apache.hadoop.security.AuthenticationWithProxyUserFilter.doFilter(AuthenticationWithProxyUserFilter.java:96)
org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:574)
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1296)
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
Thread 1126 (RMI TCP Connection(56)-127.0.0.1):
State: RUNNABLE
Blocked count: 0
Waited count: 1
Stack:
java.net.SocketInputStream.socketRead0(Native Method)
java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
java.net.SocketInputStream.read(SocketInputStream.java:170)
java.net.SocketInputStream.read(SocketInputStream.java:141)
java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
java.io.BufferedInputStream.read(BufferedInputStream.java:265)
java.io.FilterInputStream.read(FilterInputStream.java:83)
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:550)
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$10/626277472.run(Unknown Source)
java.security.AccessController.doPrivileged(Native Method)
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
...
The same sequence of events happens on both NameNodes in our HA setup, and the nameservice goes down. Can anyone help me understand what is going on?
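To see how the postponedMisreplicatedBlocks backlog behaves in the run-up to a crash, the rescan messages can be pulled out of the log and compared. This is a minimal sketch that assumes the exact message wording quoted above; the function name and the tuple layout are illustrative, and the three sample lines are the ones from this post.

```python
import re

# Matches the BlockManager rescan message as it appears in this post.
RESCAN_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO \S+BlockManager: "
    r"Rescan of postponedMisreplicatedBlocks completed in (?P<msecs>\d+) msecs\. "
    r"(?P<left>\d+) blocks are left\. (?P<removed>\d+) blocks are removed\."
)

PREFIX = "INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: "
SAMPLE = [
    "2018-02-13 03:16:44,732 " + PREFIX
    + "Rescan of postponedMisreplicatedBlocks completed in 47 msecs. "
    + "8749611 blocks are left. 0 blocks are removed.",
    "2018-02-13 03:16:47,787 " + PREFIX
    + "Rescan of postponedMisreplicatedBlocks completed in 54 msecs. "
    + "8749611 blocks are left. 0 blocks are removed.",
    "2018-02-13 03:16:50,843 " + PREFIX
    + "Rescan of postponedMisreplicatedBlocks completed in 56 msecs. "
    + "8749610 blocks are left. 1 blocks are removed.",
]


def parse_rescans(lines):
    """Return (timestamp, msecs, blocks_left, blocks_removed) per rescan line."""
    rows = []
    for line in lines:
        m = RESCAN_RE.match(line)
        if m:
            rows.append((m.group("ts"), int(m.group("msecs")),
                         int(m.group("left")), int(m.group("removed"))))
    return rows


if __name__ == "__main__":
    rows = parse_rescans(SAMPLE)
    drained = rows[0][2] - rows[-1][2]
    print(f"{len(rows)} rescans, backlog went from {rows[0][2]} to {rows[-1][2]} "
          f"({drained} fewer)")
```

Run over a whole log file instead of SAMPLE, the same rows would show whether the backlog of roughly 8.7 million postponed blocks is shrinking, flat, or growing before each crash.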
Labels: HDFS, Manual Installation
03-15-2017
11:58 AM
I am in the same boat. We have a restrictive firewall in place, and I am trying to confine the ports to the range 49900-50000 with the following in mapred-site.xml:
<property>
  <name>yarn.app.mapreduce.am.job.client.port-range</name>
  <value>49900-50000</value>
</property>
I am not able to restrict the ports at all. When I run my job I see the following:
Got exception: java.net.NoRouteToHostException: No Route to Host from rm.domain.com/X.X.X.XXXX to workernodeX.domain.com:38470
AFAIK, "No route to host" means the destination firewall is rejecting the connection. I am on Cloudera CDH 5.10.0. Does it include the fix for https://issues.apache.org/jira/browse/MAPREDUCE-6338? If not, which version does? I need to run this with an extensive firewall in place, hence the question. Thanks in advance for your help!
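As a quick check of the symptom, the port in the exception can be compared against the configured range. The sketch below is only arithmetic on the two strings quoted above; the helper names are made up for illustration, and it says nothing about which component opened the port. It also accepts a comma-separated list of ranges, on the assumption that the property may be written that way.

```python
import re


def parse_port_ranges(value):
    """Parse '49900-50000' (or 'a-b,c-d') into a list of inclusive (low, high) pairs."""
    ranges = []
    for part in value.split(","):
        low, _, high = part.strip().partition("-")
        ranges.append((int(low), int(high or low)))
    return ranges


def port_from_exception(message):
    """Return the destination port from '... to host:port', or None if not found."""
    m = re.search(r" to \S+?:(\d+)", message)
    return int(m.group(1)) if m else None


def in_ranges(port, ranges):
    """True if port falls inside any of the inclusive ranges."""
    return any(low <= port <= high for low, high in ranges)


if __name__ == "__main__":
    configured = parse_port_ranges("49900-50000")
    message = ("java.net.NoRouteToHostException: No Route to Host from "
               "rm.domain.com/X.X.X.XXXX to workernodeX.domain.com:38470")
    port = port_from_exception(message)
    print(port, "inside configured range:", in_ranges(port, configured))
```

For the message above, the port is 38470 and lies outside 49900-50000, which is consistent with the setting not being applied to that listener.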