Spark - HDFS File access Error

Expert Contributor

I am trying to read a file stored in HDFS from the Spark shell, using Scala.

The file's URL in HDFS is localhost:50070/sridhar/hadoop/sample.txt

I executed

scala> val file = sc.textFile("hdfs://localhost:50070/sridhar/hadoop/sample.txt")

Then I executed

scala> file.foreach(println)

and got the following error:

java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "sridhar25/127.0.1.1"; destination host is: "localhost":50070;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)

How can I access the file without errors?

Thanks!

1 ACCEPTED SOLUTION

Expert Contributor

@Artem Ervits

In the Hadoop core-site.xml file, I have set the HDFS URL as follows:

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost:9000</value>
	</property>
</configuration>

After I used the same host and port in the Spark command, that is hdfs://localhost:9000/path, it worked!

Thank you very much!

Correct command:

scala> val file = sc.textFile("hdfs://localhost:9000/path")
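
For anyone hitting the same error: the original URL most likely failed because 50070 is the default NameNode web UI (HTTP) port in Hadoop 2.x, while sc.textFile needs the NameNode RPC address, which is the value configured in core-site.xml. A minimal sketch of building the URL from that value follows. The helper name hdfsUrl is hypothetical (it is not part of Spark or Hadoop), and the path is assumed to be the one from the question.

```scala
import java.net.URI

// Hypothetical helper, not a Spark or Hadoop API: joins the filesystem URI
// configured in core-site.xml with an absolute path inside HDFS.
def hdfsUrl(defaultFs: String, path: String): String = {
  val base = new URI(defaultFs)
  require(base.getScheme == "hdfs", s"expected an hdfs:// URI, got: $defaultFs")
  require(path.startsWith("/"), s"expected an absolute path, got: $path")
  // Keep scheme, host and port from the configured value; replace only the path.
  new URI(base.getScheme, null, base.getHost, base.getPort, path, null, null).toString
}

// Value taken from the core-site.xml shown above; the path is assumed
// to be the sample file from the question.
val url = hdfsUrl("hdfs://localhost:9000", "/sridhar/hadoop/sample.txt")

// In spark-shell, the result would then be passed on as:
//   val file = sc.textFile(url)
```

If Spark picks up the same core-site.xml (for example through HADOOP_CONF_DIR), the configured value can probably also be read in spark-shell with sc.hadoopConfiguration.get("fs.default.name") instead of being typed by hand. Newer Hadoop releases name this property fs.defaultFS.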


5 REPLIES 5

Master Mentor

Can you try using hdfs://localhost:8020 or hdfs://machinename:8020?

Expert Contributor

@Artem Ervits

I am getting the following errors

java.net.ConnectException: Call From sridhar25/127.0.1.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Master Mentor

@Sridhar Babu M is your HDFS up? Please make sure your cluster is up and passes its service checks.
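
A "Connection refused" error usually means nothing is listening on that host and port. As a rough first check (not a substitute for real service checks), the sketch below tries a plain TCP connection to each port mentioned in this thread. The helper name isListening and the 2-second timeout are assumptions made for illustration.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

// Hypothetical helper: true if a TCP connection to host:port succeeds
// within timeoutMs, false on refusal, timeout, or any other failure.
def isListening(host: String, port: Int, timeoutMs: Int = 2000): Boolean =
  Try {
    val socket = new Socket()
    try socket.connect(new InetSocketAddress(host, port), timeoutMs)
    finally socket.close()
  }.isSuccess

// Ports mentioned in this thread. An open port only shows that some
// process accepts connections there, not that it is the NameNode RPC service.
Seq(8020, 9000, 50070).foreach { port =>
  println(s"localhost:$port listening: ${isListening("localhost", port)}")
}
```

A port that reports as listening is only a candidate. The address to use in the hdfs:// URL is still the one configured in core-site.xml.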

Master Mentor

Great, glad you got it resolved.