
Solr installation

Guru

Team,

I am new to Solr and want to install it in my cluster (5 nodes). Before I go ahead, I have a few questions, so could someone please help me with them?

1. Do I need to install Solr on all nodes, including master and workers?

2. Can we monitor it via Ambari?

3. How will we configure Ranger security on top of Solr?

Note: I want to install Solr in cloud mode (SolrCloud).


13 REPLIES

Super Collaborator

Hi Saurabh, here is a partial response in case it's helpful: HDP Search (which includes Solr) should be deployed on all nodes that run HDFS. Ambari integration is not supported just yet. The HDP Search Guide contains basic information and links to additional documentation.

Guru

Thanks a lot @lgeorge. So you mean to say that if I have a 5-node cluster (2 masters + 3 workers), I should install Solr only on the 3 worker nodes, or do I need to install it on 1 master plus all 3 worker nodes as server & client?

Also, I would be very thankful if you could help me with the following questions as well.

1. If I install HDP Search, which includes Banana, will Banana run as the root user or as a banana user?

2. Can we do LDAP integration for both the Solr and Banana UIs?

3. How many resources do we need for HDP Search, in terms of heap, RAM, and CPU?

Thanks in advance.

Super Collaborator

Good questions. Not sure, but I'm checking. If I find answers I'll post them (or send the Solr expert this way 🙂).

Guru

Thanks @lgeorge.

Expert Contributor
@Saurabh Kumar

1. Solr does not follow a master-slave model; rather, it follows a leader-follower model.

In SolrCloud, every Solr node is therefore used for both indexing and querying.

Given that you have 5 nodes, the Solr collection can be created with 2 shards and a replication factor (RF) of 2, which will place Solr cores on 4 of the nodes (see the example command after this list).

2. Each node that is supposed to run Solr needs to have "lucidworks-hdpsearch" installed.

3. Resource usage depends on the size of the index (present size and estimated growth). Refer to the following for a further understanding of resource usage:

https://wiki.apache.org/solr/SolrPerformanceProblems
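For illustration, here is roughly what creating such a collection could look like with the bin/solr script. This is a sketch only: the collection name "test" and the configset directory mirror the ones used later in this thread, so adjust them to your environment.

# create a collection with 2 shards and replication factor 2
# (2 shards x 2 replicas = 4 cores, spread across 4 of the 5 nodes)
/opt/lucidworks-hdpsearch/solr/bin/solr create -c test \
  -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf \
  -s 2 -rf 2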

Guru

Thanks @Ravi

Guru

@Ravi Can you please help me set up buffer memory for my Solr cluster? I am getting the following error.

[solr@m1 solr]$ /opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2

Connecting to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Uploading /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf for config test to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Creating new collection 'test' using command:

http://192.168.56.42:8983/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFact...

{
  "responseHeader":{
    "status":0,
    "QTime":4812},
  "failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.41:8983/solr: Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core [test_shard1_replica1] Caused by: Direct buffer memory"},
  "success":{"":{
    "responseHeader":{
      "status":0,
      "QTime":4659},
    "core":"test_shard2_replica1"}}}

Expert Contributor (accepted solution)

@Saurabh Kumar

The error you are getting is:

"Unable to create core [test_shard1_replica1] Caused by: Direct buffer memory"

It looks to me like you have set direct memory allocation (which enables the block cache) to true in the "solrconfig.xml" file, i.e.

<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>

From your "solrconfig.xml", I see the config as:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
<str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
<str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
<bool name="solr.hdfs.blockcache.enabled">true</bool>
<int name="solr.hdfs.blockcache.slab.count">1</int>
<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
<int name="solr.hdfs.blockcache.blocksperbank">16384</int>
<bool name="solr.hdfs.blockcache.read.enabled">true</bool>
<bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
<int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
<int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>

I would suggest turning off direct memory allocation if you do not plan to use it for now, and then retrying the collection creation.

To disable it, edit "solrconfig.xml" and look for the property "solr.hdfs.blockcache.direct.memory.allocation".

Set the value of this property to "false", i.e.

<bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>

The final "solrconfig.xml" will therefore look like :

                <directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">                  <str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
                <bool name="solr.hdfs.blockcache.enabled">true</bool>
                <int name="solr.hdfs.blockcache.slab.count">1</int>
                <bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
                <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
                <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
                <bool name="solr.hdfs.blockcache.write.enabled">false</bool>
                <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
                <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
                <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
                </directoryFactory>
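As an alternative to disabling it, you could also keep the block cache on direct memory and give the Solr JVM more direct memory instead. A minimal sketch, assuming the stock bin/solr.in.sh include script ships with your install; with the slab settings above, one slab is 16384 blocks x 8 KB = 128 MB, so the 512m below leaves headroom, but size it to your own config:

# in /opt/lucidworks-hdpsearch/solr/bin/solr.in.sh (path assumed)
SOLR_OPTS="$SOLR_OPTS -XX:MaxDirectMemorySize=512m"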

Guru

@Ravi: Thanks a lot, that helped me avoid the direct memory issue, but now I have encountered another issue. Can you please help me with this as well?

[solr@m1 solr]$ /opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2

Connecting to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Uploading /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf for config test to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Creating new collection 'test' using command:

http://192.168.56.41:8983/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFact...

{
  "responseHeader":{
    "status":0,
    "QTime":6299},
  "failure":{"":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://192.168.56.41:8983/solr: Error CREATEing SolrCore 'test_shard1_replica1': Unable to create core [test_shard1_replica1] Caused by: Java heap space"},
  "success":{"":{
    "responseHeader":{
      "status":0,
      "QTime":5221},
    "core":"test_shard2_replica1"}}}