
SolrCloud shards going into recovery mode after indexing 1 billion documents (index size 110 GB)

  • Hi, I have two datanodes and one namenode (the servers have around 260 GB of RAM).
  • I created two shards.
  • I have indexed around 1 billion documents (index size 110 GB; see the arithmetic note right after this list).
  • Once the index grows past that point, I run into the issues below.
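
For scale: with two shards that is roughly 1,000,000,000 / 2 = 500 million documents and 110 GB / 2 = 55 GB of index per shard, which is still well below Lucene's hard limit of 2^31 - 1 (about 2.1 billion) documents per index.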

Cloudera Manager shows "Web Server Live Status" as critical and "API Liveness" as critical (red in Cloudera Manager).

Also, one shard goes into "Recovering" status according to the Solr Admin UI.

Hue also reports "no servers hosting shard:".

Note that I used Flume with a morphline for indexing into Solr (Solr Sink Using Morphlines), configured roughly as sketched below.

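For context, the Solr sink side of the Flume agent looks along these lines (a minimal sketch only; the agent name, channel name, and file paths are placeholders, not my exact values):

# Sketch of a MorphlineSolrSink definition -- names and paths are placeholders
agent1.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent1.sinks.solrSink.channel = memoryChannel
agent1.sinks.solrSink.morphlineFile = /etc/flume-ng/conf/morphlines.conf
agent1.sinks.solrSink.morphlineId = morphline1
agent1.sinks.solrSink.batchSize = 1000
agent1.sinks.solrSink.batchDurationMillis = 1000
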
Below are some error logs I captured:

2016-03-13 07:37:40,356 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:37:40,362 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:37:40,364 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:43:13,449 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:43:13,449 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:43:13,450 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:43:13,455 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe
2016-03-13 07:47:55,099 ERROR org.apache.solr.servlet.SolrDispatchFilter: null:ClientAbortException:  java.net.SocketException: Broken pipe

=============================================================================

16/03/13 07:56:56 INFO zookeeper.ClientCnxn: Client session timed out, have not heard from server in 8684ms for sessionid 0x15363930bc40fa2, closing socket connection and attempting reconnect
16/03/13 07:56:56 INFO cloud.ConnectionManager: Watcher org.apache.solr.common.cloud.ConnectionManager@56817060 name:ZooKeeperConnection Watcher:DATANODE2:2181/solr got event WatchedEvent state:Disconnected type:None path:null path:null type:None
16/03/13 07:56:56 INFO cloud.ConnectionManager: zkClient has disconnected
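
The session timeout above (no heartbeat from ZooKeeper for ~8.7 s) suggests the Solr JVMs are pausing longer than the ZooKeeper client timeout allows. One thing that might buy headroom, sketched here with an illustrative value only, is raising zkClientTimeout in solr.xml (in the older solr.xml format this is an attribute on the <cores> element instead):

<!-- sketch: raise the ZooKeeper session timeout; 30000 ms is only an example value -->
<solr>
  <solrcloud>
    <int name="zkClientTimeout">${zkClientTimeout:30000}</int>
  </solrcloud>
</solr>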

===============================================================================

2016-03-15 13:27:59,377 ERROR org.apache.solr.update.HdfsTransactionLog: Exception closing tlog.
2016-03-15 13:27:59,398 ERROR org.apache.solr.update.CommitTracker: auto commit error...:org.apache.solr.common.SolrException: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[xxx.xx.xx.206:50010,DS-60186d18-fec0-4fe0-8d37-97375b2fa2ba,DISK], DatanodeInfoWithStorage[xxx.xx.xx.207:50010,DS-1970a8bd-d7ef-4b6b-82d6-62039216147f,DISK]], original=[DatanodeInfoWithStorage[xxx.xx.xx.206:50010,DS-60186d18-fec0-4fe0-8d37-97375b2fa2ba,DISK], DatanodeInfoWithStorage[xxx.xx.xx.207:50010,DS-1970a8bd-d7ef-4b6b-82d6-62039216147f,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
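
The message itself points at the client-side datanode replacement policy: with only two datanodes there is no spare node to swap into a failed write pipeline. The property it names lives in hdfs-site.xml on the client (Solr) side; a sketch with illustrative values, not a recommendation:

<!-- sketch: the settings named in the error above; values shown are illustrative -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>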

================================================================================

null:ClientAbortException:  java.net.SocketException: Broken pipe
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:330)
at org.apache.catalina.connector.OutputBuffer.flush(OutputBuffer.java:296)
at org.apache.catalina.connector.CoyoteOutputStream.flush(CoyoteOutputStream.java:98)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:297)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)
at org.apache.solr.util.FastWriter.flush(FastWriter.java:137)
at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:807)
at org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:777)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:262)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:211)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.solr.servlet.SolrHadoopAuthenticationFilter$2.doFilter(SolrHadoopAuthenticationFilter.java:288)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)
at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:291)
at org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:555)
at org.apache.solr.servlet.SolrHadoopAuthenticationFilter.doFilter(SolrHadoopAuthenticationFilter.java:293)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.solr.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:620)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
at org.apache.coyote.http11.InternalOutputBuffer.realWriteBytes(InternalOutputBuffer.java:761)
at org.apache.tomcat.util.buf.ByteChunk.flushBuffer(ByteChunk.java:448)
at org.apache.coyote.http11.InternalOutputBuffer.flush(InternalOutputBuffer.java:318)
at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:987)
at org.apache.coyote.Response.action(Response.java:186)
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:325)
... 32 more

============================================================================

ERROR org.apache.solr.core.SolrCore: org.apache.solr.common.SolrException: no servers hosting shard:
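
When this appears, one way to see which replicas (if any) ZooKeeper thinks are serving the shard is the Collections API; a diagnostic sketch, where host, port, and any collection filter are placeholders for my setup (if the Solr version predates CLUSTERSTATUS, clusterstate.json in ZooKeeper shows the same information):

curl "http://DATANODE1:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json"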
This was working well until I reached 1 billion documents (110 GB of index size).