@Fish Berh
This could be due to a problem with the spark-csv jar. I ran into this myself and found a solution, which I can no longer locate. Here are my notes from the time:
1. Create a folder on your local filesystem or in HDFS and place the appropriate versions of the following jars there (replace ? with the version you need):
- spark-csv_?.jar
- commons-csv-?.jar
- univocity-parsers-?.jar
2. Go to the conf directory of your Spark installation and add the following line to spark-defaults.conf:
spark.driver.extraClassPath D:/Spark/spark_jars/*
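As a quick sanity check that step 1 left the jars where step 2 points, you can list the folder from Python (D:/Spark/spark_jars is just the example path used above):

import os
print(os.listdir('D:/Spark/spark_jars'))  # should show the three jars from step 1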
The asterisk makes Spark pick up every jar in that folder. Now run Python and create the SparkContext and SQLContext as you normally would, for example as sketched below.
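A minimal session might look like this (a sketch assuming Spark 1.x, where SQLContext lives in pyspark.sql; the appName is arbitrary, and the getConf() call is only a sanity check that the classpath entry from spark-defaults.conf was picked up):

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName='csv-test')
sqlContext = SQLContext(sc)

# Sanity check: the entry added in step 2 should appear here
print(sc.getConf().get('spark.driver.extraClassPath', 'not set'))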
You should then be able to use spark-csv:

df = sqlContext.read.format('com.databricks.spark.csv') \
    .options(header='true', inferSchema='true') \
    .load('foobar.csv')
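If everything is wired up, the read should succeed without an error about failing to find the com.databricks.spark.csv data source, and you can inspect the result (the columns depend on your CSV, of course):

df.printSchema()  # column types inferred thanks to inferSchema='true'
df.show(5)        # first five rows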