Spark cannot access Secure Solr using SolrJ API

New Contributor

Hi,

We are using a CDH 5.4.3 kerberized cluster, secured via the Kerberos wizard in Cloudera Manager. We are trying to use Spark on YARN to write data to the Solr index using the SolrJ API. When Spark tries to write data to the Solr index we get the following error from the driver:

15/08/05 16:42:56.802 EDT WARN MYPROJECT TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1, server9.labs.mydomain.com): org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://server4.mydomain.com:8983/solr/MYPROJECT returned non ok status:401, message:Unauthorized
    at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:542)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:257)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.sendAuthenticatingRequestIfNecessary(HttpSolrServer.java:888)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.createMethod(HttpSolrServer.java:365)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.createMethod(HttpSolrServer.java:304)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:229)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:225)

It appears that the Spark job does not have privileges to write to the Solr index. The spark-submit job is being executed with the Spark principal. What else can we do to get this to work correctly? Does this version of CDH support Spark connections to a secure Solr?

Thanks for the help.

1 REPLY

Re: Spark cannot access Secure Solr using SolrJ API

Expert Contributor
It appears that some configuration in your network is restricting access. Solr is fully accessible via SolrJ from Spark.
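
One more thing worth checking besides the network: a 401 means the request reached Solr but was not authenticated, and the failed task in the log ran on an executor host (server9), not on the driver. So the JVM that actually calls SolrJ may simply have no Kerberos credentials available. Below is a minimal sketch of how a JAAS login configuration could be generated and registered from code, using only the JDK. The class name `SolrJaasSetup`, the keytab path, and the principal are hypothetical placeholders, and the `Client` section name is an assumption based on the jaas-client.conf layout I recall from the Cloudera Search documentation, so please verify it against the docs for your CDH release.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of SolrJ or Spark.
public class SolrJaasSetup {

    // Builds a JAAS section for a keytab-based Kerberos login.
    // Assumption: the Solr client looks up a section named "Client".
    static String buildJaasConfig(String keytabPath, String principal) {
        return "Client {\n"
             + "  com.sun.security.auth.module.Krb5LoginModule required\n"
             + "  useKeyTab=true\n"
             + "  keyTab=\"" + keytabPath + "\"\n"
             + "  storeKey=true\n"
             + "  useTicketCache=false\n"
             + "  principal=\"" + principal + "\";\n"
             + "};\n";
    }

    // Writes the config to a temp file and points the JVM at it through the
    // standard java.security.auth.login.config system property.
    static Path installJaasConfig(String keytabPath, String principal) throws IOException {
        Path file = Files.createTempFile("solr-jaas", ".conf");
        Files.write(file, buildJaasConfig(keytabPath, principal).getBytes(StandardCharsets.UTF_8));
        System.setProperty("java.security.auth.login.config", file.toString());
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder values; substitute your own keytab and principal.
        Path file = installJaasConfig("/path/to/user.keytab", "user@EXAMPLE.COM");
        System.out.println("JAAS config written to " + file);
        // Assumption: with the SolrJ build shipped in CDH, Kerberos is then
        // enabled before creating the HttpSolrServer, along the lines of
        //   HttpClientUtil.setConfigurer(new Krb5HttpClientConfigurer());
        // Check the Cloudera Search docs for the exact call in your version.
    }
}
```

On YARN this has to happen in every executor JVM before the first Solr request, and the keytab has to be readable there. Shipping the files with `--files` and passing the property through `spark.executor.extraJavaOptions` is one way that might work; I have not tested it on CDH 5.4.3.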