Created 07-02-2016 06:14 AM
Hello,
I am trying to set up and configure HDPSearch. I have 4 Solr boxes running 6 instances of Solr. I have set up HDFS with NN HA, and all 4 boxes can successfully reach HDFS using the NN HA name.
However, I am receiving the below error when trying to create a collection in solr. What is solr missing that it can't connect to HDFS?
126330 ERROR (qtp59559151-22) [c:collection s:shard23 r:core_node86 x:collection_shard23_replica3]
  o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException:
  Error CREATEing SolrCore 'collection_shard23_replica3': Unable to create core [collection_shard23_replica3]
Caused by: NN_HA_Name
  ... 31 more
Caused by: java.net.UnknownHostException: NN_HA_Name
  ... 45 more
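For context, an HA nameservice is a logical name defined in hdfs-site.xml rather than a DNS entry, so a plain host lookup on it is expected to fail; that failing lookup is exactly the condition Java surfaces as UnknownHostException. A quick sketch (NN_HA_Name is the placeholder used throughout this post):

```shell
# An HA nameservice is not a DNS name; this lookup failing is the same
# condition Java reports as UnknownHostException when the HA config is missing.
getent hosts NN_HA_Name || echo "NN_HA_Name is not a resolvable host (expected)"
```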
Here is the command to start solr cloud:
solr -c -p 8983 -z $zk_quorum:2181/solr -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.hdfs.home=hdfs://NN_HA_Name/apps/solr
Here is the command to create the collection:
solr create -c collection -d collection -n collection -s 48 -rf 3
Here are my solrconfig.xml DirectoryFactory Settings:
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://NN_HA_Name/apps/solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
I have installed the HDFS clients on the Solr nodes and can successfully run:
hdfs dfs -ls hdfs://NN_HA_Name/apps/solr
I also see core-site.xml and hdfs-site.xml (with the correct NN configurations) in the /etc/hadoop/conf directory.
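A guarded grep along these lines is one way to spot-check that the nameservice is actually defined there (the path matches solr.hdfs.confdir above; the guard is just so the sketch degrades gracefully on a box without the config):

```shell
# Spot-check that the HA nameservice is defined in the client config.
HDFS_SITE=/etc/hadoop/conf/hdfs-site.xml
if [ -f "$HDFS_SITE" ]; then
    grep -A1 'dfs.nameservices' "$HDFS_SITE"
else
    echo "hdfs-site.xml not found at $HDFS_SITE"
fi
```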
Thanks, Jon
Created 07-03-2016 06:10 AM
After more digging, I discovered the solrconfig.xml in ZK was not the correct version. I did a series of downconfig and upconfig operations to load the correct configs and verify everything was OK. After loading the correct solrconfig.xml and restarting each Solr node, the create collection command succeeded.
/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -cmd downconfig -d collection -z $zk_quorum:2181/solr -n collection
/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -d $path_to_configs -z $zk_quorum:2181/solr -n collection
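In case it helps anyone else, a quick sanity check between the downconfig and the upconfig is to grep the downloaded copy for the HDFS settings before pushing it back up (CONF_DIR here is hypothetical; it should match whatever directory -d pointed at):

```shell
# Verify the downloaded solrconfig.xml carries the expected HDFS home
# before pushing it back up with upconfig. CONF_DIR is a placeholder.
CONF_DIR=${CONF_DIR:-collection}
if [ -f "$CONF_DIR/solrconfig.xml" ]; then
    grep -n 'solr.hdfs.home' "$CONF_DIR/solrconfig.xml"
else
    echo "no solrconfig.xml under $CONF_DIR"
fi
```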
Created 07-03-2016 01:05 AM
@Jon Maestas Are you simply testing Solr? As a general practice, for production I would not use HDFS with Solr. I just use SAS/SSD DAS storage and point Solr at local directories, creating replicas (3x) across the Solr nodes. For your issue, do you mind attaching the log file?
Created 07-03-2016 06:17 AM
Hi @Sunile Manjee,
Thank you for your response. This is the documentation I followed to setup this environment: https://doc.lucidworks.com/lucidworks-hdpsearch/2.3/Guide-Install.html
I will be testing performance against HDFS indexing with an NRT setup. I have local SSD disks set up as a fallback if this isn't fast enough or is too unreliable.
Thanks,
Jon
Created 07-05-2016 01:37 AM
@Jon Maestas thanks for sharing. good stuff.
Created 02-21-2017 05:35 AM
@Jon Maestas I have hit this problem too, but simply re-upconfig'ing did not fix the issue: I get an UnknownHostException on the nameservice name specified in solr.hdfs.home. Did you get any further insight into what was going wrong and why re-executing the upconfig did the trick? (I have to say that the version of solrconfig.xml in ZooKeeper looks identical to my source version.)
Regards, Tony
Created 02-21-2017 03:39 PM
After you do the downconfig, do your configs look correct? If you're not upconfig'ing them to the correct location in ZK, solr won't see the correct version of your configs.
Also, check in the ZK CLI to make sure you're using the right znode. If your znode isn't /solr, then you'll need to adjust the above commands appropriately. And make sure Solr is looking in the right znode.
I believe my znode was /solr and my configs were in /solr/configs.
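One way to confirm where the configs actually landed is to list the znodes under the chroot; a guarded sketch using the zkcli.sh path from my earlier commands ($zk_quorum as before):

```shell
# List znodes under the /solr chroot to confirm where the configs landed.
ZKCLI=/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh
if [ -x "$ZKCLI" ]; then
    "$ZKCLI" -cmd list -z "$zk_quorum:2181/solr"
else
    echo "zkcli.sh not found at $ZKCLI"
fi
```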
Created 02-22-2017 11:11 PM
I'm still having the same problem. I've tried clearing the config and upconfig'ing multiple times. In every instance the solrconfig.xml looks fine from the Solr UI.
The HDFS side seems to be working OK, i.e. when I create the collection, the expected directories and files are created in HDFS. It is only after that, when Solr tries to instantiate the updateHandler, that we get the UnknownHostException referring to our HDFS nameservice name.
Unfortunately we changed multiple things going into this. Everything was working fine on Solr 5.3.1 with the embedded ZooKeeper. The problem arose when we went to Solr 6.4.1, but we simultaneously switched to using the Hadoop cluster's existing ZooKeeper quorum.
We have the /solr chroot set up in ZooKeeper and it is referenced consistently across all the Solr config files and commands. Our next step is to start backing out our changes (which is a pain because we want some of the security enhancements in 6.4.1).
In your examples above you use $zk_quorum. Is that set to the name of a single ZooKeeper node, or is it a list of all the nodes? I've tried both approaches but it doesn't make any difference.
Thanks
Tony