
Errors in Host health history


Hello,

I have been seeing these errors in the host health history for the past 2 days. Restarting the cluster clears them, but this is a problem because our jobs get killed until the restart.

Can I get some pointers on this, please? The memory and CPU configurations look fine.

 

May 1 5:16:29 AM
  • Agent Status Good
May 1 5:16:04 AM
  • Frame Errors Good
  • 1 Still Bad
May 1 5:15:39 AM
  • 6 Became Good
  • 1 Became Disabled
  • 1 Still Bad
May 1 4:51 AM
  • Frame Errors Unknown
  • 1 Still Bad
May 1 4:49 AM
  • Swapping Unknown
  • 1 Still Bad
May 1 4:37:51 AM
  • Network Interface Speed Unknown
  • 1 Still Bad
May 1 4:37:46 AM
  • 1 Became Bad
  • 1 Became Unknown

 

 

The only error that I see in the agent logs is:

 

Unable to retrieve remote parcel repository manifest
java.util.concurrent.ExecutionException: java.net.ConnectException: Connection refused to http://fqdn:14641/manifest.json
	at com.ning.http.client.providers.netty.NettyResponseFuture.abort(NettyResponseFuture.java:297)
	at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:104)
	at org.jboss.netty.channel.DefaultChannelFuture.notifyListener(DefaultChannelFuture.java:399)
	at org.jboss.netty.channel.DefaultChannelFuture.notifyListeners(DefaultChannelFuture.java:390)
	at org.jboss.netty.channel.DefaultChannelFuture.setFailure(DefaultChannelFuture.java:352)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:409)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:366)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:282)
	at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
	at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused to http://FQDN:14641/manifest.json
	at com.ning.http.client.providers.netty.NettyConnectListener.operationComplete(NettyConnectListener.java:100)
	... 11 more
Caused by: java.net.ConnectException: Connection refused
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.connect(NioClientSocketPipelineSink.java:404)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.processSelectedKeys(NioClientSocketPipelineSink.java:366)
	at org.jboss.netty.channel.socket.nio.NioClientSocketPipelineSink$Boss.run(NioClientSocketPipelineSink.java:282)
	... 3 more
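
A "Connection refused" like the one in this trace usually means nothing was listening on that host and port at the time of the request. As a rough way to confirm that from the affected host, here is a minimal Python sketch that attempts a plain TCP connection. The function name port_is_open is made up for illustration, "fqdn" is the same placeholder used in the log, and 14641 is simply the port shown in the trace.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name-resolution failures.
        return False


# Example call (replace the "fqdn" placeholder with the real host name):
#   port_is_open("fqdn", 14641)
```

If this returns False while the errors are occurring and True after a restart, the problem is more likely on the listening side or on the network path than in the agent itself.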

 

1 REPLY

Hi,

 

Check with your network (telecom) team, or inspect the configuration of the switch port connected to this server's network interface, to see whether jumbo frames are enabled (MTU = 9000).

 

The same setting must also be verified on the server's own network interface.
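
On a Linux server, the configured MTU of an interface can be read from sysfs. The sketch below assumes the usual /sys/class/net/<interface>/mtu layout. The function names and the sysfs_root parameter are illustrative only, and the interface name is whatever your host uses (bond0 in the output further down).

```python
from pathlib import Path

JUMBO_MTU = 9000  # jumbo-frame MTU mentioned in this reply


def read_mtu(interface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Read the configured MTU of an interface from sysfs (Linux only)."""
    return int((Path(sysfs_root) / interface / "mtu").read_text().strip())


def is_jumbo(interface: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the interface MTU is at least the jumbo-frame size."""
    return read_mtu(interface, sysfs_root) >= JUMBO_MTU


# Example call on the server itself:
#   read_mtu("bond0")
```

This only reports the server side. The switch port still has to be checked separately with the network team.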

 

A good way to check whether the configuration is missing at one of these points is to look for dropped packets with the command 'ifconfig -a', run from the server's OS console:

 

[root@<hostname> ~]# ifconfig -a
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 9000
inet Ip.Ad.re.ss netmask net.m.as.k broadcast bro.d.ca.st
ether XX:XX:XX:XX:XX:XX txqueuelen 1000 (Ethernet)
RX packets 522849928 bytes 80049415915 (74.5 GiB)
RX errors 274721 dropped 276064 overruns 0 frame 274721
TX packets 520714273 bytes 72697966414 (67.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
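
To track these counters over time instead of reading them by eye, the RX lines can be parsed with a few lines of Python. This is only a sketch that assumes the net-tools output format shown above (other ifconfig versions print the counters differently). The function names are made up for illustration. With the numbers above, every RX error is a frame error, and errors are roughly 0.05% of received packets.

```python
import re

# RX/TX counter lines copied from the ifconfig output above.
SAMPLE = """\
RX packets 522849928 bytes 80049415915 (74.5 GiB)
RX errors 274721 dropped 276064 overruns 0 frame 274721
TX packets 520714273 bytes 72697966414 (67.7 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
"""


def parse_rx_counters(text: str) -> dict:
    """Extract the RX packet, error, dropped, overrun and frame counters."""
    packets = re.search(r"RX packets (\d+)", text)
    errors = re.search(
        r"RX errors (\d+)\s+dropped (\d+)\s+overruns (\d+)\s+frame (\d+)", text
    )
    if packets is None or errors is None:
        raise ValueError("RX counters not found in the expected format")
    return {
        "packets": int(packets.group(1)),
        "errors": int(errors.group(1)),
        "dropped": int(errors.group(2)),
        "overruns": int(errors.group(3)),
        "frame": int(errors.group(4)),
    }


def rx_error_ratio(counters: dict) -> float:
    """Fraction of received packets that were counted as RX errors."""
    if counters["packets"] == 0:
        return 0.0
    return counters["errors"] / counters["packets"]


counters = parse_rx_counters(SAMPLE)
print(counters["frame"] == counters["errors"], rx_error_ratio(counters))
```

Running it twice a few minutes apart and comparing the counters shows whether the frame errors are still increasing.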

 

In this case, jumbo frames are configured only on the server's network interface, not on the switch port.
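
One way to test whether jumbo frames actually pass end to end is to ping another host with fragmentation prohibited and a payload sized to fill a 9000-byte MTU. The sketch below only builds that command. It assumes the Linux iputils ping (where -M do forbids fragmentation) and IPv4 without header options. The function names and the "peer-host" name are hypothetical.

```python
from typing import List

IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def jumbo_ping_command(host: str, mtu: int = 9000, count: int = 3) -> List[str]:
    """Build a Linux iputils ping command that forbids fragmentation."""
    return [
        "ping",
        "-M", "do",                              # do not fragment
        "-s", str(icmp_payload_for_mtu(mtu)),    # payload size in bytes
        "-c", str(count),                        # number of echo requests
        host,
    ]


# Example (run with subprocess.run on the server; "peer-host" is a placeholder):
#   jumbo_ping_command("peer-host")
```

If the ping fails at this size but succeeds with a payload that fits a 1500-byte MTU, some device on the path is not passing jumbo frames.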

 

Regards,

Caseiro.
