Created on 08-06-2020 09:41 AM - last edited on 08-06-2020 02:15 PM by cjervis
Hi Folks,
I am running a workflow with impala-shell and got the error below. Invalidate metadata is disabled.
ERROR: Disk I/O error: Failed to open HDFS file
hdfs:///batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
Error(2): No such file or directory
Root cause: RemoteException: File does not exist
/app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2157)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2127)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:583)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:94)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2281)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2277)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2275)
Thanks,
Created 08-06-2020 10:02 AM
Which version are you running? Can you check the audit logs to see if something was deleted?
Also try restarting the Impala services to see if that resolves it.
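As a rough sketch for the audit-log check (assuming the default CDH location of the HDFS audit log on the NameNode host, /var/log/hadoop-hdfs/hdfs-audit.log; adjust the path for your setup), you could search for a delete of that parquet file:

# look for a cmd=delete audit entry mentioning the missing parquet file
grep 'cmd=delete' /var/log/hadoop-hdfs/hdfs-audit.log | grep 'aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq'

A cmd=delete entry for that path would confirm the file was removed outside of Impala.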
Created 08-06-2020 10:35 AM
CDH version is 5.15.1
Impala version: impalad version 2.12.0
In which location are the audit logs? Please provide the steps.
I can't restart Impala as it's a production cluster.
Thanks,
syed
Created 08-06-2020 11:58 AM
If you have the audit logs configured, you can see them in Cloudera Navigator; more information here.
It looks like something was removed from the directory. I recommend checking the logs to confirm this, or checking what happened to the file that's giving the error: /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq
Check whether this file still exists in HDFS. If it was removed externally, that's the cause of the error, and you can resolve it by restarting the Impala service when possible.
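For example (a minimal sketch using the path from your error message; adjust if your layout differs), from any node with an HDFS client:

# check whether the file itself is still there
hdfs dfs -ls /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/aa4fbef1c0bb3fd5-85012b8600000018_1953707135_data.0.parq

# list the partition directory to see which files remain
hdfs dfs -ls /app/abc/footable/tablename/account/orgid=abcd/batch_id=NWMISSPAYWRADJ/

If the first command reports "No such file or directory", the file was removed after Impala cached its metadata.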
Created 10-06-2020 08:57 AM
Hello @syedshakir
Are you inserting data into the table externally to Impala (that is, via Hive, Sqoop, Spark, etc.)?
If yes, Impala may not be aware of the newly added files, and running INVALIDATE METADATA or REFRESH on the table may fix your issue.
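For example, a minimal sketch from the command line (footable.tablename below is just a placeholder based on your HDFS path; replace it with the real database and table names):

# refresh file metadata for just this table (cheaper than a full invalidate)
impala-shell -q "REFRESH footable.tablename"

# or, if refresh is not enough, re-load all metadata for the table
impala-shell -q "INVALIDATE METADATA footable.tablename"

REFRESH is usually sufficient when only new data files were added to existing partitions; INVALIDATE METADATA is the heavier option when the table or partition structure changed.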