I am trying to connect Spark to Salesforce using this library (https://github.com/springml/spark-salesforce). It is a small Scala application that connects to Salesforce.
I can connect from my laptop, but when I move the code (jar) to the cluster I get an exception. After setting up a proxy I can ping the Salesforce URL, but I still cannot connect from Spark.
The API owner says he was able to test it on a Hortonworks cluster. Does a Cloudera cluster let applications connect to the outside world (the internet) from within the cluster?
Exception while creating connection
com.sforce.ws.ConnectionException: Failed to send request to http
Nothing about a cluster would prevent it from making external connections, but your firewall rules might.
The variables you export here are not related to Spark. It's an error from the library you're using.
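To illustrate the point about exported variables: assuming they were shell proxy variables such as `https_proxy` (the run book is not shown, so this is a guess), a JVM does not read them. The JDK's default proxy selection is driven by system properties such as `https.proxyHost` and `https.proxyPort`. Here is a minimal Java sketch that prints which proxy the JVM would pick for the Salesforce URL. The class name, `proxy.example.com`, and `8080` are placeholders, and the sketch does not check whether the Salesforce library itself honors the JVM default proxy.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

final class JvmProxyCheck {

    // Returns the proxies the JVM's default selector would try for this URL.
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Placeholder proxy (hypothetical values). Normally these are passed as
        // -Dhttps.proxyHost=... -Dhttps.proxyPort=... on the JVM command line.
        if (System.getProperty("https.proxyHost") == null) {
            System.setProperty("https.proxyHost", "proxy.example.com");
            System.setProperty("https.proxyPort", "8080");
        }
        String url = args.length > 0
                ? args[0]
                : "https://test.salesforce.com/services/Soap/u/35.0";
        // A DIRECT entry (address is null) means the JVM would not use a proxy.
        for (Proxy p : proxiesFor(url)) {
            System.out.println(p.type() + " " + p.address());
        }
    }
}
```

With Spark, JVM properties like these are usually handed to the driver through `spark-submit --driver-java-options`, rather than through `export` in the shell.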
Below is my run book. I will try with jssecacerts and cacerts and let you know.
The library works fine on my laptop or from outside the cluster, but not from inside the cluster. Also, as I said, I can curl or wget the URL using http but not https. Is there a workaround for this?
spark-submit --class SalesForceTest3 --master local --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 /AZ/bin/myjar-1.0-SNAPSHOT-jar-with-dependencies.jar "https://test.salesforce.com/services/Soap/u/35.0" "yarn-client"
Yes, it looks like firewall rules might be causing this issue. We could not find any connection log in the Salesforce application coming from the Hadoop edge node; we only see a successful log coming from IntelliJ IDEA (Windows PC).
Error on Edge Node:-
Caused by: java.net.ConnectException: Connection refused
It looks like the firewall is preventing the connection from leaving the edge node. I need to check with our Cloudera Hadoop admin.
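One way to confirm this from the edge node, independent of Spark and the Salesforce library, is a plain TCP probe run in a JVM. The sketch below is only a diagnostic under assumptions: the class name and the 5-second timeout are my own choices, and port 443 is assumed because the endpoint is https.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class ConnectivityCheck {

    // Tries a plain TCP connect; returns true if the connection is established in time.
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused", timeouts, and DNS failures all end up here.
            System.err.println(host + ":" + port + " -> " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the URL in the spark-submit command.
        String host = args.length > 0 ? args[0] : "test.salesforce.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 5000));
    }
}
```

This probe does not go through an HTTP proxy, so a `false` result for `test.salesforce.com` only shows that the direct route is blocked. Pointing it at the proxy's own host and port shows whether the edge node can reach the proxy at all.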