
HDFS path does not exist with SparkSession object when spark master is set as LOCAL

Explorer

I am trying to load a dataset into Hive table using Spark.

But when I try to load the file from the HDFS directory into Spark, I get this exception:

 

org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;

These are the steps before loading the file.

 

import org.apache.spark.sql.SparkSession

val wareHouseLocation = "file:${system:user.dir}/spark-warehouse"
val sparkSession = SparkSession.builder
  .master("spark://localhost:7077")
  .appName("SparkHive")
  .enableHiveSupport()
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
  .config("spark.sql.warehouse.dir", wareHouseLocation)
  .getOrCreate()

import sparkSession.implicits._
val partf = sparkSession.read.textFile("partfile")

The exception is thrown by this statement:

val partf = sparkSession.read.textFile("partfile")

org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;

But the file does exist in my HDFS home directory:

hadoop fs -ls
Found 1 items
-rw-r--r--   1 cloudera cloudera         58 2017-06-30 02:23 partfile
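For completeness, `hadoop fs -ls` with no path argument lists the current user's HDFS home directory; given the `cloudera` user above, that should be `/user/cloudera` (an assumption based on the default HDFS layout), so the explicit equivalent is:

```shell
# Explicit form of the listing above; /user/cloudera is assumed from the
# "cloudera" user shown in the output -- adjust if your HDFS home differs.
hadoop fs -ls /user/cloudera
```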

My Spark version is 2.0.2.

Could anyone tell me how to fix this?

1 REPLY

Champion

Which user are you running Spark as?

Is the path you are referring to in HDFS or on the local filesystem?

/home/cloudera/partfile

Run the following and let me know whether the file is listed:

hadoop fs -ls /home/cloudera/
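Note that your error message shows the scheme `file:`, which means Spark resolved the bare path "partfile" against the local filesystem rather than HDFS. If the file really is in HDFS, one way to rule out default-filesystem resolution is to read it with a fully qualified URI. A minimal sketch, assuming the CDH quickstart NameNode address (replace `quickstart.cloudera:8020` with the value of `fs.defaultFS` from your core-site.xml):

```scala
// Read with an explicit HDFS URI so Spark cannot fall back to file:/ paths.
// The host:port below is an assumption (CDH quickstart default); use the
// fs.defaultFS value from your cluster's core-site.xml instead.
val partf = sparkSession.read
  .textFile("hdfs://quickstart.cloudera:8020/user/cloudera/partfile")
```

If this works while the bare "partfile" path fails, the problem is that your Spark session's default filesystem is the local one, not HDFS.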