02-17-2017 08:33 AM - edited 02-17-2017 08:35 AM
I'm following the docs to upload a CSV file to Solr. My CDH version is 5.9.0.
The Solr locator I wrote in the morphline file is:
SOLR_LOCATOR : {
  # Name of solr collection
  collection : collection

  # ZooKeeper ensemble
  zkHost : "hadoop10.dev.clairvoyant.local"
}
I ran the job using:
spark-submit \
  --master yarn \
  --deploy-mode client \
  --jars $myDependencyJarFiles \
  --executor-memory 500M \
  --conf "spark.executor.extraJavaOptions=$myJVMOptions" \
  --driver-java-options "$myJVMOptions" \
  --class org.apache.solr.crunch.CrunchIndexerTool \
  $myDriverJar \
  --morphline-file /root/data/loadSolrLine.conf \
  --pipeline-type spark \
  --chatty \
  /user/root/iouzipcodes2011.csv
I got the following error:
Caused by: org.kitesdk.morphline.api.MorphlineCompilationException: Cannot download schema.xml from ZooKeeper near: {
    # /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 3
    # Name of solr collection
    "collection" : "electric_collection",
    # /data/0/yarn/nm/usercache/root/appcache/application_1487281871951_0004/container_1487281871951_0004_01_000002/tmp/org.apache.solr.crunch.MorphlineFnBuilder$MorphlineFn3206571973337697007.tmp: 6
    # ZooKeeper ensemble
    "zkHost" : "hadoop10.dev.clairvoyant.local:2181"
}
However, schema.xml is in ZooKeeper (localhost is hadoop10.dev.clairvoyant.local, so I don't think that will be an issue):
[zk: localhost:2181(CONNECTED) 19] ls /solr/configs/electric_collection
[currency.xml, mapping-FoldToASCII.txt, protwords.txt, scripts.conf, synonyms.txt, stopwords.txt, _schema_analysis_synonyms_english.json, velocity, admin-extra.html, solrconfig.xml.secure, update-script.js, _schema_analysis_stopwords_english.json, solrconfig.xml, admin-extra.menu-top.html, elevate.xml, schema.xml, clustering, mapping-ISOLatin1Accent.txt, spellings.txt, _rest_managed.json, xslt, lang, admin-extra.menu-bottom.html]
So what should I provide to morphline so that it can find schema.xml?
02-17-2017 07:08 PM
03-01-2017 09:45 AM
Hi,
Did you find a solution for this? I am stuck on the same error.
Thanks
03-01-2017 10:17 AM
Be sure to set the Solr chroot in your zkHost definition. Your ls output shows the configs under /solr/configs, so the chroot is /solr:
zkHost : "hadoop10.dev.clairvoyant.local:2181/solr"
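Applied to the locator from the original post, the change would look roughly like the sketch below. The collection name electric_collection and the port 2181 are taken from the error message and the zk listing rather than from the posted config, so treat them as assumptions and adjust them to your cluster.

```
SOLR_LOCATOR : {
  # Name of solr collection
  # (assumed from the error message; use your own collection name)
  collection : electric_collection

  # ZooKeeper ensemble, with the /solr chroot appended after host:port.
  # The chroot is the znode under which Solr keeps its data, so the
  # config set is looked up at /solr/configs/... instead of /configs/...
  zkHost : "hadoop10.dev.clairvoyant.local:2181/solr"
}
```

With more than one ZooKeeper server, the connect string normally lists the hosts comma-separated and carries the chroot once, at the very end.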
-pd