<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: "Exception in doCheckpoint ; java.net.SocketTimeoutException: Read timed out". in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/quot-Exception-in-doCheckpoint-java-net/m-p/12978#M1848</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;As per HARSH.J's suggestion, I increased the heap size (5 GB to 8 GB) for the NameNode and the Secondary NameNode.&lt;/P&gt;&lt;P&gt;Issue resolved!&lt;/P&gt;&lt;P&gt;Thank you, HARSH.&lt;/P&gt;&lt;P&gt;Best Regards,&lt;BR /&gt;Bommuraj&lt;/P&gt;</description>
    <pubDate>Thu, 29 May 2014 17:17:42 GMT</pubDate>
    <dc:creator>Bommuraj Paramaraj</dc:creator>
    <dc:date>2014-05-29T17:17:42Z</dc:date>
    <item>
      <title>"Exception in doCheckpoint ; java.net.SocketTimeoutException: Read timed out".</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/quot-Exception-in-doCheckpoint-java-net/m-p/12846#M1847</link>
      <description>&lt;P&gt;Dear Folks, I am seeing a strange issue with my Secondary NameNode: checkpointing is not happening as expected.&lt;BR /&gt;Checkpoints normally occur every hour, but for the past few days they have been delayed by 8-12 hours.&lt;BR /&gt;When I checked the Secondary NameNode log, I found this: "Exception in doCheckpoint ; java.net.SocketTimeoutException: Read timed out".&lt;BR /&gt;&lt;BR /&gt;I checked resource utilization on the NameNode and did not see any issue; there are plenty of resources.&lt;BR /&gt;On the Secondary NameNode, however, I see CPU utilization at 100%, even though 80% of the CPU is idle overall (this is enterprise hardware with 8 CPU cores).&lt;BR /&gt;&lt;BR /&gt;I suspect this issue is due to heavy RPC load. Is there any utility to measure RPC activity on the NameNode?&lt;BR /&gt;Is there any way to find out what is causing this delay in checkpointing?&lt;BR /&gt;Also, the NameNode and Secondary NameNode very frequently go into BAD health in CM, with the message "Cloudera Manager agent is not able to communicate with this role's web server."&lt;BR /&gt;(We recently configured Flume, and I suspect that could be causing the issue; however, I am not seeing any abnormal behavior on the NameNode.)&lt;BR /&gt;&lt;BR /&gt;NOTE: These issues have been visible only for the past 5 days. We have been running this cluster for more than a year (CDH 4.1.1 and CM Cloudera Enterprise 4.6.3).&lt;BR /&gt;&lt;BR /&gt;Best Regards,&lt;BR /&gt;Bommuraj&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;1:31:51.953 PM INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader&lt;BR /&gt;replaying edit log: 582444772/110965 transactions completed. (524891%)&lt;BR /&gt;11:32:35.978 PM INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader&lt;BR /&gt;replaying edit log: 582449593/110965 transactions completed. 
(524895%)&lt;BR /&gt;11:32:36.325 PM INFO org.apache.hadoop.hdfs.server.namenode.FSImage&lt;BR /&gt;Edits file /mnt/sda/dfs/snn/current/edits_0000000000582340622-0000000000582451586 of size 14029328 edits # 110965 loaded in 1059 seconds.&lt;BR /&gt;11:33:20.847 PM INFO org.apache.hadoop.hdfs.server.namenode.FSImage&lt;BR /&gt;Saving image file /mnt/sda/dfs/snn/current/fsimage.ckpt_0000000000582451586 using no compression&lt;BR /&gt;5:20:28.285 AM INFO org.apache.hadoop.hdfs.server.namenode.FSImage&lt;BR /&gt;Image file of size 2034249037 saved in 20827 seconds.&lt;BR /&gt;5:20:28.328 AM INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager&lt;BR /&gt;Going to retain 2 images with txid &amp;gt;= 582122380&lt;BR /&gt;5:20:28.328 AM INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager&lt;BR /&gt;Purging old image FSImageFile(file=/mnt/sda/dfs/snn/current/fsimage_0000000000582003081, cpktTxId=0000000000582003081)&lt;BR /&gt;5:20:28.741 AM INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager&lt;BR /&gt;Purging old edit log EditLogFile(file=/mnt/sda/dfs/snn/current/edits_0000000000580991759-0000000000581050238,first=0000000000580991759,last=0000000000581050238,inProgress=false,hasCorruptHeader=false)&lt;BR /&gt;5:20:28.744 AM INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager&lt;BR /&gt;Purging old edit log EditLogFile(file=/mnt/sda/dfs/snn/current/edits_0000000000581050239-0000000000581116511,first=0000000000581050239,last=0000000000581116511,inProgress=false,hasCorruptHeader=false)&lt;BR /&gt;5:20:28.769 AM INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage&lt;BR /&gt;Opening connection to &lt;A target="_blank" 
href="http://stats-2409.intranet.bit:50070/getimage?putimage=1&amp;amp;txid=582451586&amp;amp;port=50090&amp;amp;storageInfo=-40:907664250:1351227679575:CID-3cce1556-c0f0-4517-bbba-c2b62b256d44"&gt;http://stats-2409.intranet.bit:50070/getimage?putimage=1&amp;amp;txid=582451586&amp;amp;port=50090&amp;amp;storageInfo=-40:907664250:1351227679575:CID-3cce1556-c0f0-4517-bbba-c2b62b256d44&lt;/A&gt;&lt;BR /&gt;5:21:28.802 AM ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;Exception in doCheckpoint&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;java.net.SocketTimeoutException: Read timed out&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.net.SocketInputStream.socketRead0(Native Method)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.net.SocketInputStream.read(SocketInputStream.java:129)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.io.BufferedInputStream.read(BufferedInputStream.java:317)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:687)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:632)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at 
java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.doGetUrl(TransferFsImage.java:244)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.getFileClient(TransferFsImage.java:222)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.TransferFsImage.uploadImageFromStorage(TransferFsImage.java:137)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:474)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:331)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:298)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:452)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:294)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;at java.lang.Thread.run(Thread.java:662)&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;5:23:33.050 AM INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode &lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;Image has not changed. 
Will not download image.&lt;/FONT&gt;&lt;BR /&gt;5:23:33.051 AM INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage&lt;BR /&gt;Opening connection to &lt;A target="_blank" href="http://stats-2409.intranet.bit:50070/getimage?getedit=1&amp;amp;startTxId=582451587&amp;amp;endTxId=582519424&amp;amp;storageInfo=-40:907664250:1351227679575:CID-3cce1556-c0f0-4517-bbba-c2b62b256d44"&gt;http://stats-2409.intranet.bit:50070/getimage?getedit=1&amp;amp;startTxId=582451587&amp;amp;endTxId=582519424&amp;amp;storageInfo=-40:907664250:1351227679575:CID-3cce1556-c0f0-4517-bbba-c2b62b256d44&lt;/A&gt;&lt;BR /&gt;5:23:33.348 AM INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage&lt;BR /&gt;Transfer took 0.30s at 26855.22 KB/s&lt;BR /&gt;5:23:33.348 AM INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage&lt;BR /&gt;Downloaded file edits_0000000000582451587-0000000000582519424 size 8167790 bytes.&lt;BR /&gt;5:23:33.349 AM INFO org.apache.hadoop.hdfs.server.namenode.Checkpointer&lt;BR /&gt;Checkpointer about to load edits from 1 stream(s).&lt;BR /&gt;5:23:33.349 AM INFO org.apache.hadoop.hdfs.server.namenode.FSImage&lt;BR /&gt;Reading /mnt/sda/dfs/snn/current/edits_0000000000582451587-0000000000582519424 expecting start txid #582451587&lt;BR /&gt;5:24:11.819 AM INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader&lt;BR /&gt;replaying edit log: 582454371/67838 transactions completed. (858596%)&lt;/P&gt;</description>
      <pubDate>Tue, 27 May 2014 17:58:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/quot-Exception-in-doCheckpoint-java-net/m-p/12846#M1847</guid>
      <dc:creator>Bommuraj Paramaraj</dc:creator>
      <dc:date>2014-05-27T17:58:21Z</dc:date>
    </item>
    <item>
      <title>Re: "Exception in doCheckpoint ; java.net.SocketTimeoutException: Read timed out".</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/quot-Exception-in-doCheckpoint-java-net/m-p/12978#M1848</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;As per HARSH.J's suggestion, I increased the heap size (5 GB to 8 GB) for the NameNode and the Secondary NameNode.&lt;/P&gt;&lt;P&gt;Issue resolved!&lt;/P&gt;&lt;P&gt;Thank you, HARSH.&lt;/P&gt;&lt;P&gt;Best Regards,&lt;BR /&gt;Bommuraj&lt;/P&gt;</description>
      <pubDate>Thu, 29 May 2014 17:17:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/quot-Exception-in-doCheckpoint-java-net/m-p/12978#M1848</guid>
      <dc:creator>Bommuraj Paramaraj</dc:creator>
      <dc:date>2014-05-29T17:17:42Z</dc:date>
    </item>
  </channel>
</rss>

