Failed to open HDFS file

Contributor

Hi folks,

I'm running a workflow with impala-shell and got this error:

ERROR: Disk I/O error: Failed to open HDFS file

Disabled invalidate metadata

hdfs:///batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
Error(2): No such file or directory
Root cause: RemoteException: File does not exist: /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2157)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2127)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:583)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:94)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)

Thanks,

4 REPLIES

Contributor

Which version are you running? Can you check the audit logs to see whether something was deleted?

Also, try restarting the Impala services to see whether that resolves the issue.

Contributor

The CDH version is 5.15.1.

Impala version: impalad version 2.12.0

Where are the audit logs located? Please provide the steps.

We can't restart Impala because this is a production environment.

 

Thanks,

syed

Contributor

If you have audit logs configured, you can view them in Cloudera Navigator; more information here.

It looks like something was removed from the directory. I recommend checking the logs to confirm whether that is the case, or finding out what happened to the file named in the error: /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
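
If Navigator is not available, the NameNode's own audit log may hold the same information. Here is a rough Python sketch for scanning it for removals under the partition directory. It assumes each audit line carries whitespace-separated key=value fields such as cmd=delete and src=/some/path, and the function name removals_under is made up for illustration; check the layout against your own log before relying on it.

```python
import re

# Assumption: each audit line contains whitespace-separated key=value pairs,
# including cmd=<operation> and src=<path>. Paths containing spaces would
# not be handled by this simple pattern.
_FIELD = re.compile(r"(\w+)=(\S+)")


def removals_under(lines, directory):
    """Yield (cmd, src) for delete/rename entries at or below `directory`."""
    prefix = directory.rstrip("/") + "/"
    for line in lines:
        fields = dict(_FIELD.findall(line))
        cmd = fields.get("cmd")
        src = fields.get("src", "")
        # Appending "/" lets the directory itself match as well as its children.
        if cmd in ("delete", "rename") and (src + "/").startswith(prefix):
            yield cmd, src
```

You could pass it an open log file and the directory /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ. Note that it would not catch a delete of a parent directory, so widen the directory argument if nothing shows up.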

 

Check whether this file exists in HDFS. If it was removed externally, that is the cause of the error, and restarting the Impala service, when possible, should resolve it.
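
As a minimal sketch of that existence check, the Python below wraps `hdfs dfs -test -e`, which exits with status 0 when the path exists. The helper names are hypothetical, and it assumes the `hdfs` client is on the PATH of the host where you run it.

```python
import subprocess


def exists_command(path):
    """Build the `hdfs dfs -test -e` command; it exits 0 if the path exists."""
    return ["hdfs", "dfs", "-test", "-e", path]


def hdfs_path_exists(path, run=subprocess.run):
    """Return True if the HDFS client reports that `path` exists.

    `run` is injectable so the function can be tried without a cluster.
    A non-zero exit can also mean a client or permission error, so treat
    False as "could not confirm the file" rather than proof it is gone.
    """
    return run(exists_command(path)).returncode == 0
```

Calling hdfs_path_exists with the full .parq path from the error tells you whether the file Impala is looking for is still there.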

Expert Contributor

Hello @syedshakir 

 

Are you inserting data into the table from outside Impala (that is, via Hive, Sqoop, Spark, etc.)?

If so, Impala may not be aware of the newly added files, and running INVALIDATE METADATA or REFRESH on the table may fix your issue.
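
For reference, here is a minimal Python sketch of how such a refresh could be issued through impala-shell from a workflow. The table name mydb.mytable and the helper names are hypothetical; `-q` and `-i` are impala-shell's query and impalad-host options, and the partition form of REFRESH assumes Impala 2.7 or later (this thread reports 2.12.0).

```python
def refresh_statement(table, partition=None):
    """Build a REFRESH statement, optionally limited to a single partition.

    `partition` maps partition column names to string values. Values are
    quoted naively, so this assumes they contain no quote characters.
    """
    stmt = "REFRESH " + table
    if partition:
        spec = ", ".join("{}='{}'".format(k, v) for k, v in partition.items())
        stmt += " PARTITION ({})".format(spec)
    return stmt


def impala_shell_command(statement, impalad=None):
    """Build an impala-shell invocation that runs one statement and exits."""
    cmd = ["impala-shell"]
    if impalad:
        cmd += ["-i", impalad]  # host:port of the impalad to connect to
    return cmd + ["-q", statement]


# Hypothetical table name; partition values are taken from the error path.
stmt = refresh_statement(
    "mydb.mytable", {"orgid": "abcd", "batch_id": "NWMISSPAYWRADJ"}
)
command = impala_shell_command(stmt)
```

The resulting list can be handed to subprocess.run. REFRESH reloads the file listing for a table Impala already knows about, which is usually enough when files were added or removed externally; INVALIDATE METADATA discards the cached metadata entirely and is the heavier option.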