Installing Solr on YARN using Slider

Contributor

Hi,

I am trying to run Solr on YARN by following the lucidworksSolrSlider link, with additional help from slider.incubator.apache.org/docs/getting_started.html

Here is my folder structure:

[solrs@ip-10-0-0-217 solr-slider]$ ls -lrt
total 131744
-rw-rw-r--. 1 solrs solrs      3182 Dec 10 01:17 README.md
drwxrwxr-x. 4 solrs solrs        32 Dec 10 01:17 package
-rw-rw-r--. 1 solrs solrs      2089 Dec 10 01:17 metainfo.xml
-rw-rw-r--. 1 solrs solrs     11358 Dec 10 01:17 LICENSE
-rw-rw-r--. 1 solrs solrs 134874517 Dec 10 01:37 solr-on-yarn.zip
-rw-rw-r--. 1 solrs solrs       277 Dec 10 01:49 resources-default.json
-rw-rw-r--. 1 solrs solrs      1355 Dec 10 15:33 appConfig-default.json

appConfig-default.json:

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
    "application.def": "/user/solrs/.slider/package/solryarn/solr-on-yarn.zip",
    "java_home": "/usr/jdk64/jdk1.8.0_40",
    "site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.0-SNAPSHOT",
    "site.global.zk_host": "localhost:2181",
    "site.global.solr_host": "${SOLR_HOST}",
    "site.global.listen_port": "${SOLR.ALLOCATED_PORT}",
    "site.global.xmx_val": "1g",
    "site.global.xms_val": "1g",
    "site.global.gc_tune": "-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewG$
    "site.global.zk_timeout": "15000",
    "site.global.server_module": "--module=http",
    "site.global.stop_key": "solrrocks",
    "site.global.solr_opts": ""
  },
  "components": {
    "slider-appmaster": {
      "jvm.heapsize": "512M"
    },
    "SOLR": {
    }
  }
}

resources-default.json:

{
  "schema" : "http://example.org/specification/v2.0.0",
  "metadata" : {
  },
  "global" : {
  },
  "components": {
    "slider-appmaster": {
    },
    "SOLR": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "3",
      "yarn.memory": "1024"
    }
  }
}

Could you please suggest what the values of the following parameters in the appConfig-default.json file should be:

"site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.0-SNAPSHOT",
"site.global.solr_host": "${SOLR_HOST}",
"site.global.listen_port": "${SOLR.ALLOCATED_PORT}",

Basically, where should I find "/app/install/solr-5.2.0-SNAPSHOT"?

My Environment: HDP 2.3, Slider Core-0.80.0.2.3.2.0-2950

Thanks, hoping for a quick reply.

1 ACCEPTED SOLUTION

Expert Contributor

The only part of "site.global.app_root": "${AGENT_WORK_ROOT}/app/install/solr-5.2.0-SNAPSHOT" that you should change is the solr-5.2.0-SNAPSHOT. You should make this match the version of the Solr tarball you downloaded. (You can check the version by running "tar tf solr.tgz").
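
One way to read that directory name out of the tarball is sketched below as a small shell helper. The function name top_level_dirs is made up for illustration, and solr.tgz stands for whatever file you actually downloaded. It assumes the archive entries start with the directory name (for example solr-5.3.1/bin/solr) rather than with "./".

```shell
# Print the distinct top-level names inside a tarball.
# $1 = path to the tarball (hypothetical name: solr.tgz)
top_level_dirs() {
  # "tar tf" lists every entry; keep only the first path component.
  tar tf "$1" | cut -d/ -f1 | sort -u
}
```

Running top_level_dirs solr.tgz should print a single name, and that name is what goes after /app/install/ in site.global.app_root.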

You probably also want to change "site.global.zk_host": "localhost:2181" to "site.global.zk_host": "${ZK_HOST}", which will configure Solr to use the same ZooKeeper instance Slider is using.

I think you can leave ${SOLR_HOST} as is, but I am not completely sure of the purpose of that parameter.
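
As a rough sketch of those two edits, the helper below rewrites an appConfig file with sed and prints the result to standard output. The function name patch_app_config and its arguments are hypothetical, and the patterns assume the keys are laid out as in the appConfig-default.json posted in the question. Review the output before using it.

```shell
# Print a copy of an appConfig file with the two suggested changes applied.
# $1 = path to the appConfig JSON file
# $2 = directory name contained in the Solr tarball (e.g. solr-5.3.1)
patch_app_config() {
  # First expression: swap the solr-* directory after /app/install/.
  # Second expression: point zk_host at the ${ZK_HOST} placeholder
  # (single quotes keep the shell from expanding it).
  sed -e "s|/app/install/solr-[^\"]*|/app/install/$2|" \
      -e 's|"site.global.zk_host": *"[^"]*"|"site.global.zk_host": "${ZK_HOST}"|' \
      "$1"
}
```

For example, patch_app_config appConfig-default.json solr-5.3.1 > appConfig-new.json writes the edited copy to a new file (appConfig-new.json is just an illustrative name) and leaves the original untouched.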

12 REPLIES

Contributor

@Gour Saha

Expert Contributor

The directory ${AGENT_WORK_ROOT}/app/install/solr-* will be created for you by Slider. Slider will untar your Solr tarball to the ${AGENT_WORK_ROOT}/app/install directory. That's why Slider needs to know the name of the directory contained in your tarball.

Rising Star

Do you think Solr on YARN is ready for a PoC?

Contributor

Thanks for the response, but the slider application failed to start again.

When I look at the HDFS path:

[solr@sandbox solr-slider]$ hadoop fs -cat /user/solr/.slider/cluster/solr-yarn4/app_config.json
{
  "schema" : "http://example.org/specification/v2.0.0",
  "metadata" : { },
  "global" : {
    "site.global.gc_tune" : "-XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime",
    "site.fs.default.name" : "hdfs://sandbox.hortonworks.com:8020",
    "site.global.solr_host" : "${SOLR_HOST}",
    "site.global.solr_opts" : "",
    "zookeeper.hosts" : "sandbox.hortonworks.com",
    "site.global.server_module" : "--module=http",
    "site.global.stop_key" : "solrrocks",
    "java_home" : "/usr/lib/jvm/java-1.7.0-openjdk.x86_64/",
    "site.fs.defaultFS" : "hdfs://sandbox.hortonworks.com:8020",
    "site.global.zk_timeout" : "15000",
    "env.MALLOC_ARENA_MAX" : "4",
    "zookeeper.path" : "/services/slider/users/solr/solr-yarn4",
    "site.global.listen_port" : "8983",
    "zookeeper.quorum" : "sandbox.hortonworks.com:2181",
    "site.global.xmx_val" : "1g",
    "site.global.zk_host" : "${ZK_HOST}",
    "site.global.app_root" : "${AGENT_WORK_ROOT}/app/install/solr-5.3.1-SNAPSHOT",
    "application.def" : "/user/solr/.slider/package/solr-yarn/solr-on-yarn.zip",
    "site.global.xms_val" : "1g"
  },
  "credentials" : { },
  "components" : {
    "slider-appmaster" : {
      "jvm.heapsize" : "512M"
    },
    "SOLR" : { }
  }
}


- Shouldn't variables such as "${ZK_HOST}" be replaced with actual values?

- Where should I look for the Solr-specific logs? I am not able to find anything in the container logs.

- What is the value of ${AGENT_WORK_ROOT}? What is the absolute path?

- Is there any detailed documentation on how to deploy a Solr application on YARN via Slider?

Regards,

Expert Contributor

ZK_HOST and AGENT_WORK_ROOT will be replaced by Slider. AGENT_WORK_ROOT will have the form /hadoop/yarn/local/usercache/<userName>/appcache/<appID>/<containerID>, where /hadoop/yarn/local is the directory specified by the yarn.nodemanager.local-dirs property in yarn-site.xml.

Based on the solr_node.py script, it looks like the output of the Solr start command should end up in the slider-agent logs in the container log directory. If containers are failing to launch, information about that should be in the AM log, slider.log, in the log directory for container 0001.
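
To make the path layout concrete, here is a minimal sketch that assembles the expected AGENT_WORK_ROOT from its parts. The function name agent_work_root is invented for illustration, and the user, application ID, and container ID you pass in must come from your own cluster. Slider performs the real substitution itself, so this only helps you know where to look on a NodeManager host.

```shell
# Build the expected AGENT_WORK_ROOT path for one container.
# $1 = one directory from yarn.nodemanager.local-dirs (e.g. /hadoop/yarn/local)
# $2 = user name, $3 = application ID, $4 = container ID
agent_work_root() {
  printf '%s/usercache/%s/appcache/%s/%s\n' "$1" "$2" "$3" "$4"
}
```

If log aggregation is enabled on the cluster, yarn logs -applicationId <appID> is another way to read the container logs after the application has finished.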

Expert Contributor

I've added a comment to my initial response that should solve your problem.

Expert Contributor

Another thing I noticed is that the memory requested is pretty high if you're going to be running it on a VM. It might not be launching Solr because it doesn't have enough memory. I made these changes to appConfig and resources and was able to get Solr running on a VM that has 9GB of RAM. You might need to make additional adjustments for your setup, and also make sure yarn.scheduler.minimum-allocation-mb isn't too high.
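
A back-of-the-envelope way to see how much memory the SOLR component asks for is sketched below. It assumes the scheduler rounds each container request up to a multiple of yarn.scheduler.minimum-allocation-mb, which is the usual Capacity Scheduler behavior but worth confirming for your setup. The function names are made up for illustration, and the slider-appmaster container comes on top of the result.

```shell
# Round one container request (MB) up to a multiple of the scheduler minimum.
# $1 = requested MB (yarn.memory), $2 = yarn.scheduler.minimum-allocation-mb
rounded_request_mb() {
  echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# Total MB requested for all instances of one component.
# $1 = yarn.component.instances, $2 = yarn.memory, $3 = scheduler minimum
component_total_mb() {
  echo $(( $1 * $(rounded_request_mb "$2" "$3") ))
}
```

With the values from the resources-default.json in the question, component_total_mb 3 1024 1024 gives 3072 MB. With a hypothetical minimum allocation of 1536 MB, the same request grows to 4608 MB. It may also be worth checking that a 1g heap fits comfortably inside a 1024 MB container, since a JVM generally uses some memory beyond its heap.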

Contributor

Thanks Billie for your response!

I was able to run Solr on YARN. The mistake was that "site.global.app_root" did not have the correct directory name for my Solr version, which is solr-5.3.1.

However, when I stop the Solr application via Slider (slider stop solr-yarn8) and restart it:

1) The cores I created disappear, which is bad.

2) New instances start on new ports. Can I fix the ports?

3) I am able to connect to only one of the Solr instances (Solr UI).

4) Is it possible yet to deploy SolrCloud on YARN using multiple instances of Solr?

Regards,

Rakesh