HDFS path does not exist with SparkSession object when spark master is set as LOCAL
- Labels: Apache Spark
Created on ‎06-30-2017 04:36 AM - edited ‎09-16-2022 04:52 AM
I am trying to load a dataset into a Hive table using Spark.
But when I try to read the file from an HDFS directory into Spark, I get this exception:

```
org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;
```
These are the steps before loading the file:

```scala
val wareHouseLocation = "file:${system:user.dir}/spark-warehouse"
val sparkSession = SparkSession.builder.master("spark://localhost:7077")
  .appName("SparkHive")
  .enableHiveSupport()
  .config("hive.exec.dynamic.partition", "true")
  .config("hive.exec.dynamic.partition.mode", "nonstrict")
  .config("hive.metastore.warehouse.dir", "/user/hive/warehouse")
  .config("spark.sql.warehouse.dir", wareHouseLocation)
  .getOrCreate()
import sparkSession.implicits._
val partf = sparkSession.read.textFile("partfile")
```
The exception is thrown by this statement:

```scala
val partf = sparkSession.read.textFile("partfile")
```

```
org.apache.spark.sql.AnalysisException: Path does not exist: file:/home/cloudera/partfile;
```
But the file does exist in my HDFS home directory:

```
hadoop fs -ls
Found 1 items
-rw-r--r--   1 cloudera cloudera         58 2017-06-30 02:23 partfile
```
My Spark version is 2.0.2.
Could anyone tell me how to fix this?
Created on ‎07-04-2017 11:58 PM - edited ‎07-05-2017 04:20 AM
Which user are you running Spark as?
Is the path you are referring to, /home/cloudera/partfile, in HDFS or on the local filesystem?
Run this and let me know if the file gets listed:

```
hadoop fs -ls /home/cloudera/
```
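One way to take the ambiguity out of the picture is to pass a fully qualified URI to the reader, so Spark does not fall back on the default filesystem scheme. A minimal sketch, assuming the file lives under the HDFS home directory /user/cloudera (that location is an assumption, not confirmed by the output above):

```scala
// Sketch: read with explicit schemes so the target filesystem is unambiguous.
// Assumes a SparkSession named sparkSession is already built as in the question.

// Explicit HDFS path (relative "partfile" resolves under /user/<user> in HDFS)
val fromHdfs = sparkSession.read.textFile("hdfs:///user/cloudera/partfile")

// Explicit local-filesystem path, if the file is actually on the local disk
val fromLocal = sparkSession.read.textFile("file:///home/cloudera/partfile")
```

Whichever of the two succeeds tells you where the file really is; the `file:` prefix in the original exception suggests Spark was resolving the bare name against the local filesystem rather than HDFS.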
