SOLR V6.4.1 Unable to Create collection when using HDFS and referencing HDFS using NameService name

Contributor

Following an upgrade to SOLR 6.4.1, access to HDFS via a NameService name (High Availability) no longer appears to work.

We have a solrconfig.xml which defines an HdfsDirectoryFactory as follows:

<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.HdfsDirectoryFactory}">
  <str name="solr.hdfs.home">hdfs://XXXXHDPDEV1/data/DEV/solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">32</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf/</str>
</directoryFactory>

In this definition the value of solr.hdfs.home is hdfs://XXXXHDPDEV1/data/DEV/solr, where XXXXHDPDEV1 is the nameService name for a Hadoop cluster.

To enable this form of reference to the Hadoop cluster, we also include solr.hdfs.confdir, which identifies a local directory containing the Hadoop config files such as hdfs-site.xml. These files map the nameservice name to multiple name nodes and should allow the HDFS client to discover the active name node. Using this nameservice name works fine with command-line hdfs commands run from the same SOLR server.
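For readers unfamiliar with HDFS HA, the mapping in hdfs-site.xml usually looks roughly like the sketch below. The property names are the standard HDFS HA client settings. The name node IDs nn1 and nn2 and the second host XXXXn2 are placeholders assumed for this example, not values from the cluster described here.

```xml
<!-- Illustrative hdfs-site.xml fragment; nn1, nn2 and XXXXn2 are assumed placeholders -->
<configuration>
  <!-- Logical name used in hdfs:// URIs instead of a host name -->
  <property>
    <name>dfs.nameservices</name>
    <value>XXXXHDPDEV1</value>
  </property>
  <!-- IDs of the name nodes that make up the nameservice -->
  <property>
    <name>dfs.ha.namenodes.XXXXHDPDEV1</name>
    <value>nn1,nn2</value>
  </property>
  <!-- RPC address of each name node -->
  <property>
    <name>dfs.namenode.rpc-address.XXXXHDPDEV1.nn1</name>
    <value>XXXXn1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.XXXXHDPDEV1.nn2</name>
    <value>XXXXn2:8020</value>
  </property>
  <!-- Tells the HDFS client how to locate the active name node -->
  <property>
    <name>dfs.client.failover.proxy.provider.XXXXHDPDEV1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
```

If the client never loads these properties, it has no way of knowing that XXXXHDPDEV1 is a logical nameservice and will most likely try to resolve it as an ordinary host name.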

Under V6.4.1, when we try to create a collection based on the config that contains this solrconfig.xml file, the HDFS objects are created successfully, but the CREATE COLLECTION fails because the Update Handler, solr.DirectUpdateHandler2, cannot be instantiated. We get the following stack trace:

2017-02-23 11:42:16.419 ERROR (qtp225493257-77) [c:aircargo s:shard1  x:aircargo_shard1_replica1] o.a.s.c.CoreContainer Error creating core [aircargo_shard1_replica1]: SolrCore 'aircargo_shard1_replica1' is not available due to init failure: Error Instantiating Update Handler, solr.DirectUpdateHandler2 failed to instantiate org.apache.solr.update.UpdateHandler
org.apache.solr.common.SolrException: SolrCore 'aircargo_shard1_replica1' is not available due to init failure: Error Instantiating Update Handler, solr.DirectUpdateHandler2 failed to instantiate org.apache.solr.update.UpdateHandler
        at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:1151)
        at org.apache.solr.cloud.ZkController.publish(ZkController.java:1198)
        at org.apache.solr.cloud.ZkController.preRegister(ZkController.java:1372)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:885)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:827)
        at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:88)
        at org.apache.solr.handler.admin.CoreAdminOperation$$Lambda$28/50699452.execute(Unknown Source)
        at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:377)
        at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:379)
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:165)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:166)
        at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:664)
        at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:445)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:296)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
        at org.eclipse.jetty.server.Server.handle(Server.java:534)
        at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
        at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
        at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
        at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
        at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
        at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: Error Instantiating Update Handler, solr.DirectUpdateHandler2 failed to instantiate org.apache.solr.update.UpdateHandler
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:959)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:823)
        at org.apache.solr.core.CoreContainer.create(CoreContainer.java:890)
        ... 36 more
Caused by: org.apache.solr.common.SolrException: Error Instantiating Update Handler, solr.DirectUpdateHandler2 failed to instantiate org.apache.solr.update.UpdateHandler
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:767)
        at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:815)
        at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1065)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:930)
        ... 38 more
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
        at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:753)
        ... 41 more
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: DIBPHDPDEV1
        at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
        at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
        at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2653)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
        at org.apache.solr.update.HdfsUpdateLog.init(HdfsUpdateLog.java:145)
        at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:137)
        at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
        at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:102)
        ... 46 more
Caused by: java.net.UnknownHostException: XXXXHDPDEV1
        ... 58 more


Right at the end of the above log you will see UnknownHostException: XXXXHDPDEV1. It appears that the instantiation of the update handler treats the Hadoop nameservice name as a host name (note that the trace goes through NameNodeProxies.createNonHAProxy). We can avoid this error by hard-coding the server address and port of the active name node (e.g. XXXXn1:8020):

<str name="solr.hdfs.home">hdfs://XXXXn1:8020/data/DEV/solr</str>

However, in the event of a name node switch, the collection becomes inaccessible.

Is this a bug in V6.4.1? (Note: this approach worked fine in V5.3.0.)

There is a very similar problem reported in "HDPSearch - failed to create collection - UnknownHostException". However, that report is from an earlier version and was solved by fixing a problem in uploading the config to ZooKeeper. (The fact that we can get our config to work by hard-coding the server name suggests that we have our ZooKeeper update process under control.)

6 REPLIES

Super Collaborator
@Tony Bolt

It seems like the solr.hdfs.confdir setting is not being used. It's unlikely to be the problem, but have you checked permissions?

Contributor

Thanks @james.jones. There is a series of symlinks involved, but all the permissions look OK. All the directories and files are, at least, readable by all.

Contributor

I have duplicated this problem, and filed an issue in the Solr community: https://issues.apache.org/jira/browse/SOLR-10215. I don't know what is causing it, but it seems limited to Solr 6.4. I tried the same setup with Solr 6.3.0 and it worked fine.

If you don't mind, I'd like to post a comment to that issue with a link to this forum thread to show that others have had the same problem.

Contributor

Thanks @Cassandra Targett

Very happy for you to include the link. I'm also happy to supply extra info and/or test a fix.

Contributor

Great, thanks @Tony Bolt. I was able to trace the cause of the problem to a seemingly unrelated commit that occurred for the 6.4.0 release, and the good news is the fix has already been committed for an upcoming 6.4.2 release. The release process for that has already started, and we'd expect it to be out within 1-2 weeks.

There is no patch to apply, but if you have the ability to build Solr from source, you could try to build locally with "branch_6_4", which is where 6.4.2 will come from, or "branch_6x", which also contains the same fix. If you can't do a local build for any reason, we do already have a 2nd confirmation that the problem is fixed with this upcoming release, so it's certainly not required or expected of you to test it at this point.

Contributor

Thanks @Cassandra Targett. I am happy to wait for the 6.4.2 release. The team here is impressed with how fast this issue was resolved. Thanks for following this up for us.