Created 12-14-2015 01:55 AM
Is the spark-csv package not supported by HDP 2.3.2? I am getting the error below when I try to run spark-shell with the spark-csv package.
[hdfs@sandbox root]$ spark-shell   --packages com.databricks:spark-csv_2.10:1.1.0  --master yarn-client --driver-memory 512m --executor-memory 512m
Ivy Default Cache set to: /home/hdfs/.ivy2/cache
The jars for the packages stored in: /home/hdfs/.ivy2/jars
:: loading settings :: url = jar:file:/usr/hdp/2.3.2.0-2950/spark/lib/spark-assembly-1.4.1.2.3.2.0-2950-hadoop2.7.1.2.3.2.0-2950.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.10 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
:: resolution report :: resolve 332ms :: artifacts dl 0ms
        :: modules in use:
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
        ---------------------------------------------------------------------
:: problems summary ::
:::: WARNINGS
                module not found: com.databricks#spark-csv_2.10;1.1.0
        ==== local-m2-cache: tried
          file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
          -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
          file:/home/hdfs/.m2/repository/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
        ==== local-ivy-cache: tried
          /home/hdfs/.ivy2/local/com.databricks/spark-csv_2.10/1.1.0/ivys/ivy.xml
        ==== central: tried
          https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom
          -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
          https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
        ==== spark-packages: tried
          http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
          -- artifact com.databricks#spark-csv_2.10;1.1.0!spark-csv_2.10.jar:
          http://dl.bintray.com/spark-packages/maven/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0....
                ::::::::::::::::::::::::::::::::::::::::::::::
                ::          UNRESOLVED DEPENDENCIES         ::
                ::::::::::::::::::::::::::::::::::::::::::::::
                :: com.databricks#spark-csv_2.10;1.1.0: not found
                ::::::::::::::::::::::::::::::::::::::::::::::
:::: ERRORS
        Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.pom (java.net.ConnectException: Connection refused)
        Server access error at url https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar (java.net.ConnectException: Connection refused)
:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [unresolved dependency: com.databricks#spark-csv_2.10;1.1.0: not found]
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:995)
        at org.apache.spark.deploy.SparkSubmit$.prepareSubmitEnvironment(SparkSubmit.scala:263)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:145)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/12/14 01:49:39 INFO Utils: Shutdown hook called
[hdfs@sandbox root]$
I would really appreciate your help.
Created on 12-14-2015 03:03 AM - edited 08-19-2019 05:37 AM
Server access error at url https://repo1.maven.org/maven2/com/databricks/spa... (java.net.ConnectException: Connection refused)
Please see those messages in your output.
The same statement worked for me on my HDP 2.3.2 sandbox.
Output attached.
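As a quick sanity check (just a sketch; it assumes curl is available inside the sandbox), you can verify from the VM that Maven Central is reachable before retrying, e.g.:
[hdfs@sandbox root]$ curl -sI https://repo1.maven.org/maven2/ | head -1
If this prints no HTTP status line, the VM has no route to the repository and the --packages resolution will keep failing with "Connection refused".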
Created 12-14-2015 03:13 AM
Thanks a lot for the prompt response.
I am using the HDP 2.3.2 VMware version (Link). Is there a workaround to make it work?
Created 12-18-2015 06:34 AM
I found the cause of the issue: I had enabled a bridged network connection in VMware, because of which the spark-csv package could not be downloaded and I was getting (java.net.ConnectException: Connection refused).
Created 12-19-2015 05:18 PM
If it's a networking issue, just download the JAR file yourself and use the --jars option to add it to the classpath.
It looks like it lives under https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/
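A rough sketch of that workaround (the /tmp paths and the extra commons-csv jar are my assumptions, not from the thread; spark-csv 1.1.0 declares commons-csv as a dependency, which --packages would otherwise resolve for you). Download the jars on a machine with internet access, or from the VM once its networking is fixed, and copy them to the sandbox:
[hdfs@sandbox root]$ wget -P /tmp https://repo1.maven.org/maven2/com/databricks/spark-csv_2.10/1.1.0/spark-csv_2.10-1.1.0.jar
[hdfs@sandbox root]$ wget -P /tmp https://repo1.maven.org/maven2/org/apache/commons/commons-csv/1.1/commons-csv-1.1.jar
[hdfs@sandbox root]$ spark-shell --jars /tmp/spark-csv_2.10-1.1.0.jar,/tmp/commons-csv-1.1.jar --master yarn-client --driver-memory 512m --executor-memory 512m
Once the shell is up, the package is used as usual, e.g. (the input path here is just a placeholder):
scala> val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/tmp/people.csv")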