
spark-csv on spark 1.6



I need to use spark-csv on Spark 1.6. Does anyone know whether I need to copy the spark-csv jar to every Spark node, and if so, where?

I have to use Zeppelin with the Livy interpreter.


Expert Contributor

@Boualem SAOULA Here's how you can add it so you can work with it:

The --packages option also works with 1.6.
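
For reference, a minimal sketch of the --packages form. It assumes the Scala 2.10 build of spark-csv 1.5.0 (the version named later in this thread) and a node that can reach a Maven repository. The command is only printed here, not run:

```shell
# Maven coordinates for spark-csv. The Scala 2.10 / 1.5.0 build is an
# assumption taken from the jar name mentioned later in this thread.
PKG="com.databricks:spark-csv_2.10:1.5.0"

# Build the command as a string and print it (dry run).
# Running it for real requires access to a Maven repository.
CMD="spark-shell --packages ${PKG}"
echo "${CMD}"
```

The same option is accepted by spark-submit and pyspark.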


Thanks Matt, but my servers cannot access the internet.

I downloaded the spark-csv jar and copied it to the Spark server. What I want to know is how Spark locates the folder that contains the spark-csv jar.


You can either compile the package into your application jar, or manually install it on every Spark/YARN worker node and include the directory in your <extraClassPath>.

Sample pom.xml on HDP 2.6.3:


Use "<scope>provided</scope>" if you choose external installation. Leave it out if you want to compile the package in. Compiling in is simpler, but if you have a large cluster or multiple Spark applications that will share such external libraries, using "provided" scope may be a better fit. In that case, you would need to specify:

--conf "spark.driver.extraClassPath=...:<your ext lib path>/*" --conf "spark.executor.extraClassPath=...:<your ext lib path>/*"

on your spark-submit command line.
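
A minimal sketch of what the full command line could look like. The directory /opt/spark-ext-libs, the jar my-app.jar and the class com.example.MyApp are hypothetical placeholders, and any existing classpath entries (the "..." above) are left out. The command is printed rather than executed:

```shell
# Hypothetical directory that holds the jars on every node (assumption).
EXT_LIB="/opt/spark-ext-libs"

# Hypothetical application jar and main class (assumptions).
APP_JAR="my-app.jar"
MAIN_CLASS="com.example.MyApp"

# The trailing /* is the JVM classpath wildcard: every jar in the directory.
# It stays quoted so the shell does not expand it.
CP="${EXT_LIB}/*"

# Dry run: print the command instead of executing it.
CMD="spark-submit --class ${MAIN_CLASS} --conf \"spark.driver.extraClassPath=${CP}\" --conf \"spark.executor.extraClassPath=${CP}\" ${APP_JAR}"
echo "${CMD}"
```

With this approach the directory has to exist, with the same contents, on the driver host and on every executor host.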

Expert Contributor

@Boualem SAOULA I agree with @Miles Yao. If you want a quick way to test, or just need to add some jars quickly, there is also a spark-submit parameter, --jars, that takes a comma-separated list of full paths to jars. It ships the jars on every submit, though, which is why the method @Miles Yao suggested has an extra benefit: it saves network traffic.
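
A small sketch of building that comma-separated list from a directory of downloaded jars. The helper name join_jars, the default directory and my-app.jar are hypothetical. The resulting command is printed rather than executed:

```shell
# join_jars DIR: print the full paths of all *.jar files in DIR, comma-separated.
# (Hypothetical helper, not part of Spark.)
join_jars() {
  dir="$1"
  list=""
  for f in "$dir"/*.jar; do
    [ -e "$f" ] || continue          # skip the raw pattern if no jar matched
    list="${list:+${list},}${f}"     # add a comma only between entries
  done
  printf '%s\n' "$list"
}

# Hypothetical local directory containing the jars (assumption).
JAR_DIR="${JAR_DIR:-/opt/spark-ext-libs}"
JARS="$(join_jars "${JAR_DIR}")"

# Dry run: print the command; my-app.jar is a placeholder.
echo "spark-submit --jars ${JARS} my-app.jar"
```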


My solution was to add the spark.jars property to the Spark 1.6 config.

spark.jars='path-to-jar' (you can use any path)

I then copied the spark-csv jar (spark-csv_2.10-1.5.0.jar) and its dependencies (commons-csv-1.1.jar, univocity-parsers-1.5.1.jar) to path-to-jar.
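
For anyone repeating this: Spark documents spark.jars as a comma-separated list of jar files, so the sketch below lists each of the three jars explicitly. Whether a bare directory also works on a given setup is not confirmed here. The directory /opt/spark-ext-libs is a hypothetical placeholder. The snippet prints the property line in spark-defaults.conf form:

```shell
# Hypothetical directory where the three jars were copied (assumption).
JAR_DIR="/opt/spark-ext-libs"

# spark.jars takes a comma-separated list of jar files.
SPARK_JARS="${JAR_DIR}/spark-csv_2.10-1.5.0.jar,${JAR_DIR}/commons-csv-1.1.jar,${JAR_DIR}/univocity-parsers-1.5.1.jar"

# Print the property as "key value", the form used in spark-defaults.conf.
printf 'spark.jars %s\n' "${SPARK_JARS}"
```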