Created 07-12-2016 12:24 PM
Team,
I am new to Solr and want to install it on my cluster (5 nodes). Before I go ahead, I have a few questions; can someone please help me with them?
1. Do I need to install Solr on all nodes, including masters and workers?
2. Can we monitor it via Ambari?
3. How will we configure Ranger security on top of Solr?
Note: I want to install Solr in cloud mode (SolrCloud).
Created 07-12-2016 06:12 PM
Hi Saurabh, here is a partial response in case it's helpful: HDP Search (which includes Solr) should be deployed on all nodes that run HDFS. Ambari is not supported quite yet. The HDP Search Guide contains basic information and links to additional documentation.
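For what it's worth, a minimal installation sketch, assuming the Lucidworks HDP Search yum repository is already configured on each host (the package name and install path match the ones used later in this thread):

# Run as root on every node that runs HDFS
yum install -y lucidworks-hdpsearch
# Solr is then available under /opt/lucidworks-hdpsearch/solr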
Created 07-13-2016 12:04 PM
Thanks a lot @lgeorge. So do you mean that if I have a 5-node cluster (2 masters + 3 workers), I should install Solr on the 3 worker nodes only, or do I need to install it on 1 master and all 3 workers as server and client?
I would also be very thankful if you could help me with the following questions:
1. If I install HDP Search, which includes Banana, will Banana run as the root user or as a banana user?
2. Can we do LDAP integration for both the Solr and Banana UIs?
3. How many resources do we need for HDP Search (heap, RAM, CPU)?
Thanks in advance.
Created 07-15-2016 03:25 PM
Good questions. Not sure, but I'm checking. If I find answers I'll post them (or send the Solr expert this way 🙂).
Created 07-20-2016 07:08 AM
Thanks @lgeorge.
Created 07-17-2016 07:46 PM
1. Solr does not follow a master-slave model; rather, it uses a leader-follower model.
In SolrCloud, each Solr node is therefore used for both indexing and querying.
Considering that you have 5 nodes, the Solr collection can be created with 2 shards and a replication factor (RF) of 2. This will use 4 nodes for Solr (see the sketch after this list).
2. Each node that is supposed to run Solr needs the "lucidworks-hdpsearch" package installed.
3. Resource usage depends on the size of the index (current size and estimated growth). Refer to the HDP Search Guide mentioned earlier in this thread for further guidance on resource usage.
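As a rough sketch of point 1, assuming Solr 5.x from HDP Search and the ZooKeeper ensemble that appears later in this thread (the collection name is illustrative):

# Start Solr in SolrCloud mode on each node, pointing at the ZooKeeper ensemble
/opt/lucidworks-hdpsearch/solr/bin/solr start -cloud -z m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

# Create a collection with 2 shards and replication factor 2; the 4 resulting
# replicas (2 shards x RF 2) spread across 4 of the 5 nodes
/opt/lucidworks-hdpsearch/solr/bin/solr create -c mycollection -s 2 -rf 2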
Created 07-20-2016 07:08 AM
Thanks @Ravi
Created 07-20-2016 10:01 AM
@Ravi Can you please help me with how to set up buffer memory for my Solr cluster? I am getting the following error.
[solr@m1 solr]$ /opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2
Connecting to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181
Uploading /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf for config test to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181
Creating new collection 'test' using command:
{
"responseHeader":{
"status":0,
"QTime":4812},
"failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.41:8983/solr: Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core [test_shard1_replica1] Caused by: Direct buffer memory"},
"success":{"":{
"responseHeader":{
"status":0,
"QTime":4659},
"core":"test_shard2_replica1"}}}
Created 07-20-2016 07:28 PM
The error you are getting is:
"Unable to create core [test_shard1_replica1] Caused by: Direct buffer memory"
It looks to me like you have set direct memory allocation (which enables the off-heap block cache) to true in the "solrconfig.xml" file, i.e.:
<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
From your "solrconfig.xml", I see the config as:
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
I suggest turning off direct memory if you do not plan to use it for now, and then retrying the collection creation.
To disable it, edit "solrconfig.xml" and look for the property "solr.hdfs.blockcache.direct.memory.allocation".
Set the value of this property to "false", i.e.:
<bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
The final "solrconfig.xml" will therefore look like:
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">false</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
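Alternatively, if you do want to keep the off-heap block cache, the usual remedy for "Direct buffer memory" errors is to give the JVM more direct memory. A hedged sketch, assuming the stock bin/solr.in.sh shipped with HDP Search (the 2g limit is illustrative, not a sizing recommendation):

# In /opt/lucidworks-hdpsearch/solr/bin/solr.in.sh:
# raise the JVM's off-heap (direct) memory ceiling used by the HDFS block cache
SOLR_OPTS="$SOLR_OPTS -XX:MaxDirectMemorySize=2g"

Restart Solr afterwards for the change to take effect.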
Created 07-21-2016 07:32 AM
@Ravi: Thanks a lot, that helped me get past the direct memory issue, but now I have run into another one. Can you please help me with this as well?
[solr@m1 solr]$ /opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2
Connecting to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181
Uploading /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf for config test to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181
Creating new collection 'test' using command:
{
"responseHeader":{
"status":0,
"QTime":6299},
"failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.41:8983/solr: Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core [test_shard1_replica1] Caused by: Java heap space"},
"success":{"":{
"responseHeader":{
"status":0,
"QTime":5221},
"core":"test_shard2_replica1"}}}