Member since
11-22-2017
273
Posts
4
Kudos Received
0
Solutions
12-14-2018
03:29 AM
"connection-pending remote" means that, whatever the remote IP is, this system is not able to connect to it, or perhaps only to a specific port on it. This can happen if the remote system is busy, or if the port the connection is being made to is busy or not open. You need to check your network settings and/or the state of the service running on the unresponsive port.
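If it helps, here is a minimal sketch of a quick reachability check you could run from the affected host. It only tells you whether a TCP connection to the remote port can be opened within a timeout, not why it fails. The function name `can_connect` and the 3-second default timeout are my own choices for illustration; substitute the remote IP and port from your own log line.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (hypothetical address, replace with the values from your log):
# can_connect("10.0.0.5", 50010)
```

A `False` result for a port that should be open points at the service or a firewall on the remote side; a `True` result suggests the problem is intermittent load rather than a closed port.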
12-05-2018
07:56 PM
You can go to the link again and click on "+ new paste" for a new text field to post the logs. Once done, scroll below and click on "create new paste". A link will be generated. Share that link with us.
12-03-2018
07:38 PM
Editing my update: Could you please post the DN logs in a pastebin link (https://pastebin.com/) so that we can have a look at them? The exceptions given in the description seem to be a consequence of an earlier problem, so looking at the DN logs from before the mentioned exceptions should help us clarify the problem. Also, grep your DN logs for "xceiverCount" or "exceeds the limit of concurrent xcievers" and post the results here.
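If grep is not handy, the same search can be sketched in Python. This is only an illustration: the two search strings are the ones quoted above, the function name `find_xceiver_lines` is made up, and the log path in the usage comment is a placeholder for wherever your DN logs actually live.

```python
import re
from typing import Iterable, List

# The two strings to look for, matched case-insensitively.
PATTERN = re.compile(
    r"xceiverCount|exceeds the limit of concurrent xcievers",
    re.IGNORECASE,
)


def find_xceiver_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention either xceiver pattern, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if PATTERN.search(line)]


# Usage (placeholder path, adjust to your DataNode log location):
# with open("/path/to/datanode.log", errors="replace") as fh:
#     for hit in find_xceiver_lines(fh):
#         print(hit)
```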
12-01-2018
05:19 AM
I would still check with the developer as to why it fails the first time and not on later attempts. A certain parameter is being hit that we cannot determine from our end.
11-29-2018
11:21 PM
The exception in the log snippet shown is related to the class "com.turn.platform.cheetah.storage.dmp.analytical_profile.merge.IncrementalProfileMergerMapper.close". Your DNs are aborting the operation and pointing to this class. This seems to be a custom third-party class. Kindly check with your vendor about this.
11-28-2018
06:21 PM
Let's start by fixing them one by one.
1. Start the ntpd service on all nodes to fix the clock offset problem, if the service is not already started. If it is started, make sure that all the nodes refer to the same ntpd server.
2. Check the space utilization for the DNs that report the "Free Space" issue. I would assume that you're reaching a certain threshold which is causing these alerts.
3. About the agent status, could you show what the actual message is for this one? Alternatively, restart the cloudera-scm-agent service on the nodes that are hitting this alert and see if the alerts go away.
4. Post the exact message for the Data Directory status.
5. Could you specify more about the frame errors, like the exact message or a screenshot?
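For the space utilization check in step 2, a rough sketch like the following can be run on each DN host against its data directories. The 90% cutoff and the function names are assumptions for illustration only; they are not the threshold that CM actually uses for its alert, so compare the numbers against whatever your alert configuration says.

```python
import shutil


def used_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def is_low_on_space(path: str, max_used: float = 0.9) -> bool:
    """Return True if usage is at or above `max_used` (0.9 is an assumed cutoff)."""
    return used_fraction(path) >= max_used


# Usage (placeholder path, replace with one of your DN data directories):
# print(f"{used_fraction('/path/to/dn/data'):.1%} used")
```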
11-28-2018
01:15 AM
Check the disk status for the DataNode that is mentioned in the exception. Do you see any warning on your CM dashboard? If yes, can you post it?
11-22-2018
08:51 PM
We cannot be sure of the reasons for this message from the snippet that you have provided. If you notice, the connection is being successfully established, but there is no response from the DN.
~~~
java.nio.channels.SocketChannel[connected local=/172.31.15.196:50010 remote=/172.31.1.81:57017]
~~~
It can happen for various reasons: the pipeline is interrupted, there is network congestion at play, the DN disk is not performing well, the DN host OS is having issues like kernel soft lockups, or the DN is simply too heavily loaded to respond. You'd have to dig further into the logs and look for more information. See the messages logged before the exception you're getting in the DN logs.
05-10-2018
11:24 PM
1 Kudo
HDFS fsck only checks the files that are persisted on HDFS, not open files. Since you're seeing just one missing block in the UI warnings of CM and the NN, and no missing blocks in the fsck output, this would indicate that the missing block alert is being generated by a file that is currently open, and it is most likely a false alarm. This should go away when the NN role or the cluster is restarted, probably during your next maintenance window.