Created 08-05-2015 03:58 PM
Hi,
We are using a Kerberized CDH 5.4.3 cluster, secured via the Kerberos wizard in Cloudera Manager. We are trying to use Spark on YARN to write data to a Solr index using the SolrJ API. When Spark tries to write to the index, the driver reports the following error:
15/08/05 16:42:56.802 EDT WARN MYPROJECT TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1, server9.labs.mydomain.com): org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://server4.mydomain.com:8983/solr/MYPROJECT returned non ok status:401, message:Unauthorized
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:542)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:257)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.sendAuthenticatingRequestIfNecessary(HttpSolrServer.java:888)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.createMethod(HttpSolrServer.java:365)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.createMethod(HttpSolrServer.java:304)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:229)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:225)
It appears that the Spark job does not have privileges to write to the Solr index. The spark-submit job is being executed with the Spark principal. What else can we do to get this to work correctly? Does this version of CDH support Spark connections to a secure Solr?
Thanks for the help.
Created 12-31-2015 04:51 AM