Created 12-14-2015 01:55 AM
Is the spark-csv package not supported by HDP 2.3.2? I get the error below when I try to launch spark-shell with the spark-csv package.
[hdfs@sandbox root]$ spark-shell --packages com.databricks:spark-csv_2.10:1.1.0 --master yarn-client --driver-memory 512m --executor-memory 512m
Ivy Default Cache set to: /home/hdfs/.ivy2/cache
The jars for the packages stored in: /home/hdfs/.ivy2/jars
:: loading settings :: url = jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
confs: [default]
:: resolution report :: resolve 332ms :: artifacts dl 0ms
:: modules in use:
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
module not found: com.databricks#spark-csv_2.10;1.1.0
==== local-m2-cache: tried
file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
-- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
==== local-ivy-cache: tried
/home/hdfs/.ivy2/local/com.databricks/spark-csv_2.10/1.1.0/ivys/ivy.xml
==== central: tried
https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
-- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
==== spark-packages: tried
http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
-- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
::::::::::::::::::::::::::::::::::::::::::::::
:: UNRESOLVED DEPENDENCIES ::
::::::::::::::::::::::::::::::::::::::::::::::
:: com.databricks#spark-csv_2.10;1.1.0: not found
::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom (java.net.ConnectException: Connection refused)
Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar (java.net.ConnectException: Connection refused)
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.databricks#spark-csv_2.10;1.1.0: not found]
at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:995)
at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:263)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:145)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/14 01:49:39 INFO Utils: Shutdown hook called
[hdfs@sandbox root]$
I would really appreciate your help.
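For context, once the package does resolve, reading a CSV in this Spark 1.4 shell looks something like the sketch below (the file path is hypothetical; `sqlContext` is provided by spark-shell):

```scala
// Read a CSV file through the spark-csv data source (path is illustrative)
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")   // treat the first row as column names
  .load("/tmp/sample.csv")

df.printSchema()
```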
Created on 12-14-2015 03:03 AM - edited 08-19-2019 05:37 AM
Server access error at url https://repo1.maven.org/maven2/com/databricks/spa... (java.net.ConnectException: Connection refused)
Please note those "Connection refused" messages in your output.
The same command worked for me in my HDP 2.3.2 sandbox.
Output attached.
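A quick way to confirm whether the sandbox can actually reach the repositories is to test connectivity from inside the VM before blaming the package itself (the exact commands are just a sketch; any HTTP check against the repository hosts will do):

```
# Can we reach Maven Central at all? Expect an HTTP status line back.
curl -sI https://repo1.maven.org/maven2/ | head -n 1

# Same check for the spark-packages repository
curl -sI http://dl.bintray.com/spark-packages/maven/ | head -n 1
```

If both commands hang or fail with "Connection refused", the problem is the VM's network configuration, not HDP or spark-csv.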
Created 12-14-2015 03:13 AM
Thanks a lot for the prompt response.
I am using the HDP 2.3.2 VMware version (Link). Is there any workaround to make it work?
Created 12-18-2015 06:34 AM
I found the issue: I had enabled the bridged network connection in my VMware settings, which was blocking the download of the spark-csv package and causing the (java.net.ConnectException: Connection refused) error.
Created 12-19-2015 05:18 PM
If it's a networking issue, just download the JAR file yourself and use the --jars option to add it to the classpath.
It looks like it lives under https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/
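A sketch of that offline workaround, assuming the version from the original command (note that spark-csv 1.1.0 also depends on commons-csv, so that jar is usually needed too):

```
# On a machine with internet access, download the jars once
wget https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
wget https://repo1.maven.org/maven2/org/apache/commons/commons-csv/1.1/commons-csv-1.1.jar

# Copy them into the sandbox, then start spark-shell with --jars
# instead of --packages so no remote resolution is attempted
spark-shell --jars spark-csv_2.10-1.1.0.jar,commons-csv-1.1.jar \
  --master yarn-client --driver-memory 512m --executor-memory 512m
```

Unlike --packages, --jars does not go through Ivy at all, so it works even when the VM has no route to Maven Central.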