I have a scenario where I need to read files from a Windows shared path using Spark and Scala. I tried the code below, but it could not find the files:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ExternalFiles {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("External Files")
    val sc = new SparkContext(conf)
    // UNC path to the Windows share
    val files = sc.textFile("\\\\sharedNetwork\\External Data\\testData.txt")
    files.foreach(println)
  }
}
```
I also tried `sc.textFile("file://sharedNetwork/External Data/testData.txt")`, but both attempts fail with the error below:
```
18/12/23 11:57:57 WARN : Your hostname, name-21 resolves to a loopback/non-reachable address: 10.xx.xx.xxx, but we couldn't find any external IP address!
Exception in thread "main" org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file://sharedNetwork/External Data/testData.txt
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:251)
```
Can someone suggest how to read files from a shared drive using Spark and Scala?
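For context, my understanding is that Spark's local-file URIs need three slashes (`file://` followed by an absolute path), and that a Windows share would first have to be mounted on the Linux machine (e.g. as a CIFS mount). Below is a minimal sketch of the URI I would expect to pass once that is done; the mount point `/mnt/shared` is a hypothetical example, not something from my setup:

```scala
// Hypothetical mount point where the Windows share is mounted on Linux,
// e.g. via `mount -t cifs //sharedNetwork/share /mnt/shared`.
val mountPoint = "/mnt/shared"
val relative   = "External Data/testData.txt"

// "file://" + an absolute path yields the three-slash form Spark expects.
val uri = s"file://$mountPoint/$relative"
println(uri)  // file:///mnt/shared/External Data/testData.txt

// With a SparkContext `sc` in scope, the read would then be:
// val files = sc.textFile(uri)
// files.foreach(println)
```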
Also, please suggest how to download files from an NTFS Windows shared path to a Linux machine through PuTTY.
Thanks,
Chaitanya