About ramanhopes

ramanhopes · ‎12-14-2018

"connection-pending remote" means that, whatever the remote IP is, this system is not able to connect to it, possibly only on a specific port. This can happen if the remote system is busy, or if the port the connection is being attempted on is busy or not open. You need to check your network settings and/or the state of the service that should be listening on the unresponsive port.
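A quick way to confirm whether the port is reachable from the affected host is a plain TCP connect with a timeout. This is only a minimal sketch: the function name `can_connect` and the 3-second default timeout are my own choices, and you would substitute the remote IP and port from your own error message.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    `can_connect` is a hypothetical helper, not part of any Hadoop or Cloudera tool.
    """
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `can_connect("<remote-ip>", <port>)` from the node that logs the message should tell you whether the problem is at the network/port level or further up in the service.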

ramanhopes · ‎12-05-2018

You can go to the link again and click on "+ new paste" for a new text field to post the logs. Once done, scroll below and click on "create new paste". A link will be generated. Share that link with us.

ramanhopes · ‎12-03-2018

Editing my update: Could you please post the DN logs in a Pastebin link (https://pastebin.com/) so that we can have a look at them? The exceptions given in the description seem to be a consequence of an earlier problem, so looking at the DN logs before the mentioned exceptions should help us clarify the problem. Also, grep your DN logs for "xceiverCount" or "exceeds the limit of concurrent xcievers" and post the results here.
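If it is easier than grep, the same search can be sketched in Python. The two search strings are the ones mentioned above; the function name `find_xceiver_lines` is made up for illustration, and the log path is whatever your DN role writes to.

```python
from typing import Iterable, List

# The two strings to look for in the DataNode log.
PATTERNS = ("xceiverCount", "exceeds the limit of concurrent xcievers")


def find_xceiver_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains at least one of PATTERNS.

    `find_xceiver_lines` is a hypothetical helper for illustration only.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern in line for pattern in PATTERNS)
    ]
```

You would open your DN log file and pass the file object in, for example `with open(dn_log_path) as f: matches = find_xceiver_lines(f)`, where `dn_log_path` is your own log location.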

ramanhopes · ‎12-01-2018

I would still check with the developer as to why it fails the first time and not again. A certain parameter is being hit that we cannot determine from our end.

ramanhopes · ‎11-29-2018

The exception in the log snippet shown is related to the class "com.turn.platform.cheetah.storage.dmp.analytical_profile.merge.IncrementalProfileMergerMapper.close". Your DNs are aborting the operation and pointing to this class. This seems to be a custom third-party class. Kindly check with your vendor about this.

ramanhopes · ‎11-28-2018

Let's start by fixing them one by one.
1. Start the ntpd service on all nodes to fix the clock offset problem, if the service is not already started. If it is started, make sure that all the nodes refer to the same ntpd server.
2. Check the space utilization for the DNs that report the "Free Space" issue. I would assume that you're reaching a certain threshold, which is causing these alerts.
3. About the agent status, could you show what the actual message is for this one? Alternatively, restart the cloudera-scm-agent service on the nodes that are hitting this alert and see if the alerts go away.
4. Post the exact message for the Data Directory status.
5. Could you specify more about the frame errors, such as the exact message or a screenshot?
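For the space utilization check in step 2, a rough sketch along these lines can be run on each DN host against its data directories. The helper names and the 0.90 threshold are assumptions of mine, not the value your alert is configured with; check the actual threshold in your monitoring settings.

```python
import shutil


def used_fraction(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def over_threshold(path: str, threshold: float = 0.90) -> bool:
    """Return True if the filesystem holding `path` is at or above `threshold`.

    The 0.90 default is a placeholder, not a Cloudera Manager default.
    """
    return used_fraction(path) >= threshold
```

Pass each DataNode data directory as `path` to see which volumes are close to full.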

ramanhopes · ‎11-28-2018

Check the disk status for the DataNode that is mentioned in the exception. Do you see any warning on your CM dashboard? If yes, can you post it?

ramanhopes · ‎11-22-2018

We cannot be sure of the reasons for this message with the snippet that you have provided. If you notice, the connection is being successfully established, but there is no response from the DN.
~~~
java.nio.channels.SocketChannel[connected local=/172.31.15.196:50010 remote=/172.31.1.81:57017]
~~~
It can happen for various reasons: the pipeline is interrupted, there is network congestion at play, the DN disk is not performing well, the DN host OS is having issues such as kernel soft lockups, or the DN is simply too heavily loaded to respond. You'd have to dig further into the logs and look for more information. See the messages logged before the exception you're getting in the DN logs.
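To pull out the messages logged just before the exception, something like the following sketch can help. It keeps a sliding window of recent lines and emits them whenever a line containing your marker text appears. The function name `lines_before` and the default of 20 context lines are arbitrary choices of mine; the marker is whatever exception text you see in your own DN log.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, List


def lines_before(lines: Iterable[str], marker: str, context: int = 20) -> Iterator[List[str]]:
    """Yield, for each line containing `marker`, up to `context` preceding lines plus that line.

    `lines_before` is a hypothetical helper for illustration only.
    """
    window: Deque[str] = deque(maxlen=context)  # most recent lines seen so far
    for raw in lines:
        line = raw.rstrip("\n")
        if marker in line:
            yield list(window) + [line]
        window.append(line)
```

Each yielded list ends with the matching line, so you can read upward from the exception to see what the DN was doing right before it.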

ramanhopes · ‎05-10-2018

HDFS fsck only checks the files that are persisted on HDFS, not open files. Since you're seeing just one missing block in the UI warnings of CM and the NN, and no missing blocks in the fsck output, this would indicate that the missing block alert is being generated by a file that is still open and is most likely a false alarm. This should go away when the NN role or the cluster is restarted, probably during your next maintenance window.

Last Visited	‎11-29-2024 03:49 AM

Member Since	‎11-22-2017 08:05 PM
Posts	278
Kudos received	2

Cloudera Community

Re: connection timeout

Re: IOException All datanodes DatanodeInfoWithSto...

Re: IOException All datanodes DatanodeInfoWithSto...

Re: IOException All datanodes DatanodeInfoWithSto...

Re: IOException All datanodes DatanodeInfoWithSto...

Re: IOException All datanodes DatanodeInfoWithSto...

Re: IOException All datanodes DatanodeInfoWithSto...

Re: Datanode socket timeout setting

Re: HDFS - Missing Blocks Inconsistent