
Unable to find schema.xml in zookeeper.

New Contributor

I'm following the docs to upload a CSV file to Solr. My CDH version is 5.9.0.

 

The Solr locator I wrote in the morphline file is:

 

SOLR_LOCATOR : {
  # Name of solr collection
  collection : collection

  # ZooKeeper ensemble
  zkHost : "hadoop10.dev.clairvoyant.local"
}

 

 

I run the job using:

 

spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars $myDependencyJarFiles \
  --executor-memory 500M \
  --conf "spark.executor.extraJavaOptions=$myJVMOptions" \
  --driver-java-options "$myJVMOptions" \
  --class org.apache.solr.crunch.CrunchIndexerTool \
  $myDriverJar \
  --morphline-file /root/data/loadSolrLine.conf \
  --pipeline-type spark \
  --chatty \
  /user/root/iouzipcodes2011.csv

 

 

 

I got the following error:

 

 

Caused by: org.kitesdk.morphline.api.MorphlineCompilationException: Cannot download schema.xml from ZooKeeper near: {
# /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 3
# Name of solr collection
"collection" : "electric_collection",
# /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 6
# ZooKeeper ensemble
"zkHost" : "hadoop10.dev.clairvoyant.local:2181"
}

 

However, schema.xml is in ZooKeeper. (localhost is hadoop10.dev.clairvoyant.local, so I don't think that will be an issue.)

 

 

[zk: localhost:2181(CONNECTED) 19] ls /solr/configs/electric_collection
[currency.xml, mapping-FoldToASCII.txt, protwords.txt, scripts.conf, synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json, velocity, admin-extra.html, solrconfig.xml.secure, update-script.js, _schema_analysis_stopwords_english.json, solrconfig.xml, admin-extra.menu-top.html, elevate.xml, schema.xml, clustering, mapping-ISOLatin1Accent.txt, spellings.txt, _rest_managed.json, xslt, lang, admin-extra.menu-bottom.html]

 

So what should I provide to the morphline so that it can find schema.xml?

 

 

3 REPLIES

Re: Unable to find schema.xml in zookeeper.

Champion
I ran into this same error, but in my case it was unrelated to adding data or running jobs against Solr. Either Solr couldn't start, or CM kept failing to access its web server and API for the health check.

I didn't dig into it as it was a new cluster that was recently Kerberized and nothing was in Solr. I got it fixed by shutting down Solr and running the 'Initialize Solr' command in the Solr Service action menu in CM.

I can't attest to what that command does.

Good luck.

Re: Unable to find schema.xml in zookeeper.

New Contributor

Hi,

 

Did you find a solution for this? I am stuck at the same error.

 

Thanks

Re: Unable to find schema.xml in zookeeper.

Super Collaborator

Be sure to set the Solr chroot in your zkHost definition:

 

zkHost : "hadoop10.dev.clairvoyant.local:2181/solr"
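
For reference, the locator from the question with the chroot applied might look like the sketch below. It assumes the collection is named electric_collection (the name shown in the error message, not the one in the posted locator) and that Solr's znodes live under /solr, which matches the "ls /solr/configs/electric_collection" output in the question. Without the chroot, the lookup is presumably made relative to the ZooKeeper root, where no configs node exists.

```
SOLR_LOCATOR : {
  # Name of solr collection (assumed from the error message in the question)
  collection : electric_collection

  # ZooKeeper ensemble: host, client port, and the /solr chroot
  zkHost : "hadoop10.dev.clairvoyant.local:2181/solr"
}
```

If Solr was set up with a different chroot, use that path in place of /solr.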

 

 

-pd