
Solr installation

Guru

Team,

I am new to Solr and want to install it in my cluster (5 nodes). Before I go ahead, I have a few questions; can someone please help me with them?

1. Do I need to install Solr on all nodes, including master and workers?

2. Can we monitor it via Ambari?

3. How do we configure Ranger security on top of Solr?

Note: I want to install Solr in cloud mode (SolrCloud).

1 ACCEPTED SOLUTION

Expert Contributor

@Saurabh Kumar

The error you are getting is:

"Unable to create core [test_shard1_replica1] Caused by: Direct buffer memory"

It looks to me like you have enabled direct memory (to enable the block cache) in the "solrconfig.xml" file, i.e.:

<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>

From your "solrconfig.xml", I see the config as:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
<str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
<str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
<bool name="solr.hdfs.blockcache.enabled">true</bool>
<int name="solr.hdfs.blockcache.slab.count">1</int>
<bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
<int name="solr.hdfs.blockcache.blocksperbank">16384</int>
<bool name="solr.hdfs.blockcache.read.enabled">true</bool>
<bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
<int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
<int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>

I suggest turning off direct memory if you do not plan to use it for now, and then retrying the collection creation.

To disable it, edit the "solrconfig.xml" and look for the property "solr.hdfs.blockcache.direct.memory.allocation".

Set the value of this property to "false", i.e.:

<bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>

The final "solrconfig.xml" will therefore look like:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
<str name="solr.hdfs.home">hdfs://m1.hdp22:8020/user/solr</str>
<str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
<bool name="solr.hdfs.blockcache.enabled">true</bool>
<int name="solr.hdfs.blockcache.slab.count">1</int>
<bool name="solr.hdfs.blockcache.direct.memory.allocation">false</bool>
<int name="solr.hdfs.blockcache.blocksperbank">16384</int>
<bool name="solr.hdfs.blockcache.read.enabled">true</bool>
<bool name="solr.hdfs.blockcache.write.enabled">false</bool>
<bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
<int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
<int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
</directoryFactory>
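
After editing, the config set has to be pushed to ZooKeeper again before it takes effect. A minimal sketch using the zkcli script bundled with Solr, assuming the hosts and paths used elsewhere in this thread:

# Re-upload the edited config set to ZooKeeper
/opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh \
  -zkhost m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181 \
  -cmd upconfig \
  -confdir /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf \
  -confname test

# Then retry the collection creation, reusing the uploaded config
/opt/lucidworks-hdpsearch/solr/bin/solr create -c test -n test -s 2 -rf 2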


13 REPLIES

Guru
@Ravi

Hey Ravi, thanks. I solved it by changing the value of SOLR_HEAP to 1024 MB in /opt/lucidworks-hdpsearch/solr/bin/solr.in.sh. Thanks once again for all your help.

SOLR_HEAP="1024m"
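
A change to SOLR_HEAP in solr.in.sh only takes effect after a restart; a sketch, assuming cloud mode with the ZooKeeper ensemble shown in the output below:

/opt/lucidworks-hdpsearch/solr/bin/solr restart -c -z m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181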

[solr@m1 solr]$ /opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2

Connecting to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Uploading /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf for config test to ZooKeeper at m1.hdp22:2181,m2.hdp22:2181,w1.hdp22:2181

Creating new collection 'test' using command:

http://192.168.56.42:8983/solr/admin/collections?action=CREATE&name=test&numShards=2&replicationFact...

{
  "responseHeader":{
    "status":0,
    "QTime":8494},
  "success":{"":{
    "responseHeader":{
      "status":0,
      "QTime":8338},
    "core":"test_shard1_replica1"}}}

Expert Contributor

@Saurabh Kumar

You are welcome.

The Java heap space issue is due to the Java heap allocated to the Solr process. By default, the Solr process is started with only 512 MB. We can increase this by editing the Solr config files (e.g. SOLR_HEAP in solr.in.sh), or by restarting Solr with a larger heap from the command line and then re-running the create:

/opt/lucidworks-hdpsearch/solr/bin/solr restart -c -m 2g

/opt/lucidworks-hdpsearch/solr/bin/solr create -c test -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/data_driven_schema_configs_hdfs/conf -n test -s 2 -rf 2

This will resolve the Java heap space issue.
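
To confirm the heap the running process actually picked up, bin/solr status prints the JVM memory in use for each local Solr node:

/opt/lucidworks-hdpsearch/solr/bin/solr status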


New Contributor

Hi Saurabh,

I know this is an older question, but if you (or anyone else) are still looking to monitor Solr Cloud via Ambari, laying a custom service on top of your existing installation might be useful. The following will allow you to integrate Solr Cloud into Ambari, complete with alerts and the ability to start, stop, and monitor status.

This setup assumes an existing, standard Solr Cloud installation, with the Solr Cloud UI available on port 8983.


On the Ambari node, create /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/SOLR/package/scripts.
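
For example, a single command creates the whole tree, since scripts is the deepest directory:

mkdir -p /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/SOLR/package/scripts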

In /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/SOLR, create alerts.json and metainfo.xml, as follows (you can, of course, change the version to whatever version of Solr you have installed):

alerts.json

{
  "SOLR": {
    "service": [],
    "SOLR_CLOUD": [
      {
        "name" : "solr_cloud_ui",
        "label" : "Solr Cloud UI",
        "description" : "This host-level alert is triggered if the Solr Cloud Web UI is unreachable.",
        "interval" : 1,
        "scope" : "ANY",
        "source" : {
          "type" : "WEB",
          "uri" : {
            "http" : "http://0.0.0.0:8983",
            "connection_timeout" : 5.0
          },
          "reporting" : {
            "ok" : {
              "text" : "HTTP {0} response in {2:.3f}s"
            },
            "warning" : {
              "text" : "HTTP {0} response from {1} in {2:.3f}s ({3})"
            },
            "critical" : {
              "text" : "Connection failed to {1} ({3})"
            }
          }
        }
      }
    ]
  }
}
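
Before wiring this into Ambari, you can sanity-check the endpoint the alert will probe; a quick check from a Solr host (localhost standing in for whatever address the alert resolves to):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8983/solr/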

metainfo.xml

<?xml version="1.0"?>
<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <services>
    <service>
      <name>SOLR</name>
      <displayName>Solr</displayName>
      <comment>Solr is an open source enterprise search platform, written in Java, from the Apache Lucene project.</comment>
      <version>5.2.1</version>
      <components>
        <component>
          <name>SOLR_CLOUD</name>
          <displayName>Solr Cloud Server</displayName>
          <category>MASTER</category>
          <cardinality>1+</cardinality>
          <commandScript>
            <script>scripts/solrcloud.py</script>
            <scriptType>PYTHON</scriptType>
            <timeout>600</timeout>
          </commandScript>
        </component>
      </components>
    </service>
  </services>
</metainfo>
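
The MASTER category with 1+ cardinality is what allows a Solr Cloud Server component to be placed on every host that already runs Solr; Ambari then tracks each instance individually, as described at the end of this post.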

In /var/lib/ambari-server/resources/stacks/HDP/2.0.6/services/SOLR/package/scripts, create params.py and solrcloud.py, as follows:

params.py

# Command tuples passed to Execute() by solrcloud.py
cloud_stop = ('/sbin/service', 'solr', 'stop')
cloud_start = ('/sbin/service', 'solr', 'start')
# Pid file written by bin/solr; the file name encodes the Solr port
cloud_pid_file = '/opt/lucidworks-hdpsearch/solr/bin/solr-8983.pid'

solrcloud.py

from resource_management import *
from resource_management.core.resources.system import Execute

class Master(Script):
  def install(self, env):
    # Nothing to install; Solr is already on the host. This only registers the component.
    print 'Installing Solr Cloud'

  def stop(self, env):
    import params
    env.set_params(params)
    Execute(params.cloud_stop, sudo=True)

  def start(self, env):
    import params
    env.set_params(params)
    Execute(params.cloud_start, sudo=True)

  def status(self, env):
    import params
    env.set_params(params)
    from resource_management.libraries.functions import check_process_status
    # Raises ComponentIsNotRunning if the pid file is missing or the process is gone
    check_process_status(params.cloud_pid_file)

  def configure(self, env):
    print 'Configuring Solr Cloud'

if __name__ == "__main__":
  Master().execute()
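
Then restart Ambari Server so it picks up the new stack definition:

ambari-server restart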

At this point, after restarting Ambari, you will be able to "install" Solr Cloud via the Ambari Add Service wizard, specifying a Solr Cloud Server on whichever hosts Solr is already installed. As you might note from solrcloud.py, the installation doesn't do anything other than configure Ambari to be aware that the components exist on the hosts.

Once the installation is complete, Solr will be listed as an Ambari Service, with each Solr Cloud server listed as an individual Master component.

Hope this helps.

Joe