Created 12-14-2015 01:55 AM
Is the spark-csv package not supported by HDP 2.3.2? I get the error below when I try to load the spark-csv package in spark-shell.
[hdfs@sandbox root]$ spark-shell --packages com.databricks:spark-csv_2.10:1.1.0 --master yarn-client --driver-memory 512m --executor-memory 512m
Ivy Default Cache set to: /home/hdfs/.ivy2/cache
The jars for the packages stored in: /home/hdfs/.ivy2/jars
:: loading settings :: url = jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
	confs: [default]
:: resolution report :: resolve 332ms :: artifacts dl 0ms
	:: modules in use:
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
	---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
	module not found: com.databricks#spark-csv_2.10;1.1.0
	==== local-m2-cache: tried
	  file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
	  -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
	  file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
	==== local-ivy-cache: tried
	  /home/hdfs/.ivy2/local/com.databricks/spark-csv_2.10/1.1.0/ivys/ivy.xml
	==== central: tried
	  https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
	  -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
	  https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
	==== spark-packages: tried
	  http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
	  -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
	  http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
	::::::::::::::::::::::::::::::::::::::::::::::
	::          UNRESOLVED DEPENDENCIES         ::
	::::::::::::::::::::::::::::::::::::::::::::::
	:: com.databricks#spark-csv_2.10;1.1.0: not found
	::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
	Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom (java.net.ConnectException: Connection refused)
	Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar (java.net.ConnectException: Connection refused)
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.databricks#spark-csv_2.10;1.1.0: not found]
	at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:995)
	at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:263)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:145)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/14 01:49:39 INFO Utils: Shutdown hook called
[hdfs@sandbox root]$
Would really appreciate your help.
Created on 12-14-2015 03:03 AM - edited 08-19-2019 05:37 AM
Server access error at url https://repo1.maven.org/maven2/com/databricks/spa... (java.net.ConnectException: Connection refused)
Please see those messages in your output.
The same statement worked for me in my HDP 2.3.2 sandbox.
Output attached.
Created 12-14-2015 03:13 AM
Thanks a lot for the prompt response.
I am using the HDP 2.3.2 VMware version (Link). Is there any workaround to make it work?
Created 12-18-2015 06:34 AM
I found the cause of the issue: I had enabled the bridged network connection in my VMware, because of which the spark-csv package could not be downloaded and I was getting (java.net.ConnectException: Connection refused).
Created 12-19-2015 05:18 PM
If it's a networking issue, just download the JAR file yourself and use the --jars option to add it to the classpath.
It looks like it lives under https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/
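As a sketch of that offline workaround (the coordinates are the ones from the thread; the exact Maven Central layout shown is standard, but whether spark-csv 1.1.0 needs extra runtime jars such as commons-csv on your cluster is an assumption you should verify):

```shell
# Maven repos lay artifacts out as <groupId with slashes>/<artifactId>/<version>/,
# so the jar URL can be built directly from the package coordinates.
GROUP="com/databricks"
ARTIFACT="spark-csv_2.10"
VERSION="1.1.0"
JAR_URL="https://repo1.maven.org/maven2/${GROUP}/${ARTIFACT}/${VERSION}/${ARTIFACT}-${VERSION}.jar"
echo "$JAR_URL"

# On a machine with internet access, download the jar, copy it to the
# sandbox, and start spark-shell with --jars instead of --packages:
#   wget "$JAR_URL"
#   spark-shell --jars spark-csv_2.10-1.1.0.jar --master yarn-client \
#       --driver-memory 512m --executor-memory 512m
```

With --jars, spark-submit skips Ivy resolution entirely, so no repository access is needed at launch time.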