Support Questions


HDFS path does not exist with SparkSession object when spark master is set as LOCAL

Explorer

I am trying to load a dataset into a Hive table using Spark.

But when I try to load the file from my HDFS directory into Spark, I get this exception:

 

org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;

These are the steps before loading the file.

 

val wareHouseLocation = s"file:${System.getProperty("user.dir")}/spark-warehouse"
val sparkSession = SparkSession.builder.master("spark://localhost:7077")
    .appName("SparkHive")
    .enableHiveSupport()
    .config("hive.exec.dynamic.partition", "true")
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
    .config("spark.sql.warehouse.dir", wareHouseLocation)
    .getOrCreate()
import sparkSession.implicits._
val partf = sparkSession.read.textFile("partfile")

The exception is thrown by this statement:

val partf = sparkSession.read.textFile("partfile")

org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;

But the file exists in my HDFS home directory:

hadoop fs -ls
Found 1 items
-rw-r--r--   1 cloudera cloudera         58 2017-06-30 02:23 partfile
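The error message itself hints at the cause: Spark resolved the relative name "partfile" against the local filesystem (file:/home/cloudera/partfile) rather than HDFS, which is what happens when fs.defaultFS from the Hadoop configuration is not on Spark's classpath. A minimal sketch of that resolution logic (illustrative only, not Spark's actual internals; the hostname below is an assumption):

```scala
// Sketch: how a relative path like "partfile" gets qualified against the
// default filesystem. defaultFs stands in for fs.defaultFS; workingDir is
// the user's working/home directory on that filesystem.
def qualify(defaultFs: String, workingDir: String, path: String): String =
  if (path.matches("^[a-zA-Z]+:.*")) path          // already has a scheme, use as-is
  else defaultFs.stripSuffix("/") + workingDir + "/" + path

// No HDFS config picked up -> falls back to the local filesystem:
qualify("file:", "/home/cloudera", "partfile")
// -> "file:/home/cloudera/partfile"  (the path in the exception)

// fs.defaultFS pointing at the NameNode -> the same call reaches HDFS:
qualify("hdfs://quickstart.cloudera:8020", "/user/cloudera", "partfile")
// -> "hdfs://quickstart.cloudera:8020/user/cloudera/partfile"
```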

My Spark version is 2.0.2.

Could anyone tell me how to fix it?

1 REPLY

Champion

Which user are you running Spark as?

Is the path you are referring to in HDFS or on the local filesystem?

/home/cloudera/partfile

Run this and let me know whether the files get listed:

hadoop fs -ls /home/cloudera/
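Note that /home/cloudera normally exists only on the local disk; the HDFS home for the cloudera user is /user/cloudera, which is what the bare `hadoop fs -ls` above is listing. One way to rule out any defaultFS ambiguity is to pass a fully qualified URI to the reader. The host and port below are assumptions based on the CDH quickstart VM, so adjust them to your NameNode:

```scala
// Fully qualified HDFS path; host/port are assumed (CDH quickstart defaults).
val hdfsHome = "hdfs://quickstart.cloudera:8020/user/cloudera"
val partfilePath = s"$hdfsHome/partfile"
// val partf = sparkSession.read.textFile(partfilePath)  // needs a live cluster
```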