New Contributor
Posts: 1
Registered: ‎01-10-2017

Unable to find schema.xml in zookeeper.

I'm following the docs to upload a CSV file to Solr. My CDH version is 5.9.0.


The Solr locator I wrote in the morphline file is:


  # Name of solr collection
  collection : collection

  # ZooKeeper ensemble
  zkHost : ""



I run the job using:


spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars $myDependencyJarFiles \
  --executor-memory 500M \
  --conf "spark.executor.extraJavaOptions=$myJVMOptions" \
  --driver-java-options "$myJVMOptions" \
  --class org.apache.solr.crunch.CrunchIndexerTool \
  $myDriverJar \
  --morphline-file /root/data/loadSolrLine.conf \
  --pipeline-type spark \
  --chatty \




And I got the following error:



Caused by: org.kitesdk.morphline.api.MorphlineCompilationException: Cannot download schema.xml from ZooKeeper near: {
# /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 3
# Name of solr collection
"collection" : "electric_collection",
# /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 6
# ZooKeeper ensemble
"zkHost" : ""


However, schema.xml is present in ZooKeeper. (localhost is [...], so I don't think that will be an issue.)



[zk: localhost:2181(CONNECTED) 19] ls /solr/configs/electric_collection
[currency.xml, mapping-FoldToASCII.txt, protwords.txt, scripts.conf, synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json, velocity, admin-extra.html,, update-script.js, _schema_analysis_stopwords_english.json, solrconfig.xml,, elevate.xml, schema.xml, clustering, mapping-ISOLatin1Accent.txt, spellings.txt, _rest_managed.json, xslt, lang,]


So what should I provide to the morphline so that it can find schema.xml?



Posts: 642
Topics: 3
Kudos: 121
Solutions: 67
Registered: ‎08-16-2016

Re: Unable to find schema.xml in zookeeper.

I ran into this same error, but not while adding data or running jobs against Solr. Solr either couldn't start, or CM kept failing to reach its web server and API for the health check.

I didn't dig into it as it was a new cluster that was recently Kerberized and nothing was in Solr. I got it fixed by shutting down Solr and running the 'Initialize Solr' command in the Solr Service action menu in CM.

I can't attest to what that command does.

Good luck.
New Contributor
Posts: 4
Registered: ‎11-04-2016

Re: Unable to find schema.xml in zookeeper.



Did you find a solution for this? I am stuck at the same error.



Cloudera Employee
Posts: 277
Registered: ‎01-09-2014

Re: Unable to find schema.xml in zookeeper.

Be sure to include the Solr chroot in your zkHost definition:


zkHost : ""
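
As a rough sketch only: the ls output earlier in the thread shows the configs under /solr/configs/electric_collection, which suggests the chroot is /solr. Assuming that, and using a made-up ZooKeeper host name (zk01.example.com is hypothetical; substitute your own ensemble), the locator might look something like this:

```
SOLR_LOCATOR : {
  # Name of solr collection (taken from the error message in the original post)
  collection : electric_collection

  # ZooKeeper ensemble. The host name is a placeholder; the trailing /solr
  # is the chroot, assumed from the "ls /solr/configs/..." output above.
  zkHost : "zk01.example.com:2181/solr"
}
```

With more than one ZooKeeper server, the chroot goes once at the end of the comma-separated list, e.g. "zk01.example.com:2181,zk02.example.com:2181/solr", not after each host.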