Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Spark - HDFS File access Error

Expert Contributor

I am trying to read a file stored in HDFS from the Spark shell, using a Scala command.

The Hadoop HDFS file URL is localhost:50070/sridhar/hadoop/sample.txt

I executed

scala> val file = sc.textFile("hdfs://localhost:50070/sridhar/hadoop/sample.txt")

then executed

scala> file.foreach(println)

and got the following error:

java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message end-group tag did not match expected tag.; Host Details : local host is: "sridhar25/127.0.1.1"; destination host is: "localhost":50070;
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
	at org.apache.hadoop.ipc.Client.call(Client.java:1472)
	at org.apache.hadoop.ipc.Client.call(Client.java:1399)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
	at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)

How can I access the file without errors?

Thanks!

1 ACCEPTED SOLUTION

Expert Contributor

@Artem Ervits

In the Hadoop core-site.xml file, I have set the HDFS URL as follows:

<configuration>
	<property>
		<name>fs.default.name</name>
		<value>hdfs://localhost:9000</value>
	</property>
</configuration>

After changing the URL in the command to hdfs://localhost:9000/path, it worked!

Thank you very much!

Correct command:

scala> val file = sc.textFile("hdfs://localhost:9000/path")
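
For context, 50070 is the default NameNode web UI (HTTP) port in Hadoop 2.x, whereas sc.textFile needs the NameNode RPC address, which is the value set in core-site.xml. The following is only a rough sketch of building the URL from that value instead of typing the port by hand. The helper name hdfsUri is hypothetical (it is not part of Spark or Hadoop), and "/path" is the same placeholder used above.

```scala
import java.net.URI

// Hypothetical helper (not a Spark or Hadoop API): joins the NameNode RPC
// URI from core-site.xml with an absolute HDFS path.
def hdfsUri(defaultFs: String, path: String): String = {
  val base = new URI(defaultFs)
  require(base.getScheme == "hdfs", s"expected an hdfs:// URI, got: $defaultFs")
  require(path.startsWith("/"), s"expected an absolute path, got: $path")
  new URI(base.getScheme, null, base.getHost, base.getPort, path, null, null).toString
}

// The value below is copied from the core-site.xml shown above.
val uri = hdfsUri("hdfs://localhost:9000", "/path")

// In the Spark shell, where `sc` is predefined, the result would be used as:
//   val file = sc.textFile(uri)
//   file.foreach(println)
// Assumption: if Spark picks up this core-site.xml (for example through
// HADOOP_CONF_DIR), sc.hadoopConfiguration.get("fs.defaultFS") should return
// the same hdfs://localhost:9000 value.
```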


5 REPLIES

Master Mentor

Can you try using hdfs://localhost:8020 or hdfs://machinename:8020?

Expert Contributor

@Artem Ervits

I am getting the following error:

java.net.ConnectException: Call From sridhar25/127.0.1.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

Master Mentor

@Sridhar Babu M, is your HDFS up? Please make sure your cluster is running and passes its service checks.
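
One quick way to narrow down a "Connection refused" error is to check whether anything is listening on the candidate ports at all. Below is a minimal sketch that can be pasted into the Spark shell or any Scala REPL. The helper name portOpen and the 2000 ms timeout are made up for illustration, and the ports 8020 and 9000 are simply the two mentioned in this thread. A successful TCP connection only shows that some process is listening there. It does not prove that HDFS is healthy.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper: returns true if a TCP connection to host:port
// succeeds within timeoutMs milliseconds, false otherwise.
def portOpen(host: String, port: Int, timeoutMs: Int = 2000): Boolean = {
  val socket = new Socket()
  try {
    socket.connect(new InetSocketAddress(host, port), timeoutMs)
    true
  } catch {
    case _: IOException => false // refused, timed out, or unreachable
  } finally {
    socket.close()
  }
}

// Ports taken from this thread; adjust to whatever core-site.xml says.
Seq(8020, 9000).foreach { port =>
  println(s"localhost:$port reachable: ${portOpen("localhost", port)}")
}
```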

Master Mentor

Great, glad you got it resolved.