
Cannot get a file on HDFS because of "java.lang.ArrayIndexOutOfBoundsException"

Explorer

Hello.

 

A MapReduce job couldn't start because a file could not be read.

When I tried to access the file, I got the following error:

 

 

org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): java.lang.ArrayIndexOutOfBoundsException

    at org.apache.hadoop.ipc.Client.call(Client.java:1466)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy11.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:559)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy12.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2080)
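
The getListing / listPaths frames in the trace suggest that even a plain directory listing hits the error, so the MapReduce job itself isn't needed to reproduce it; for example (same directory as in the fsck output below):

hdfs dfs -ls /services/chikayo-dsp-bidder/click/hive/day=20170403/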


 

What I tried:

 

(1) sudo -u hdfs hdfs fsck /

fsck stops just before the broken file, and the overall result is "Failed":

/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd01.sv.infra.log 244412 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd02.sv.infra.log 282901 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd03.sv.infra.log 280334 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/14.fluentd01.sv.infra.log 258240 bytes, 1 block(s):  OK
FSCK ended at Mon Apr 03 18:16:08 JST 2017 in 3074 milliseconds
null


Fsck on path '/services/chikayo-dsp-bidder' FAILED
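
For reference, fsck can narrow this down further. These are standard fsck options, so something like the following should list the affected files and blocks directly:

# List only files that currently have a corrupt block
sudo -u hdfs hdfs fsck / -list-corruptfileblocks

# Show block IDs and datanode locations under the failing path
sudo -u hdfs hdfs fsck /services/chikayo-dsp-bidder -files -blocks -locations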

(2) sudo -u hdfs hdfs dfsadmin -report

Configured Capacity: 92383798755328 (84.02 TB)
Present Capacity: 89209585066072 (81.14 TB)
DFS Remaining: 19736633480052 (17.95 TB)
DFS Used: 69472951586020 (63.19 TB)
DFS Used%: 77.88%
Under replicated blocks: 0
Blocks with corrupt replicas: 2
Missing blocks: 0
Missing blocks (with replication factor 1): 0

Since then, the broken file has been restored automatically, and "Blocks with corrupt replicas" is back to 0.
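
One way to watch for that automatic repair is to poll the corrupt-replica counter from the report above, e.g. with a simple loop:

# Print the corrupt-replica counter once a minute
while true; do
    date
    sudo -u hdfs hdfs dfsadmin -report | grep 'corrupt replicas'
    sleep 60
done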

 

Questions:

 

(1) Can I restore such a broken file manually?

(2) What triggers the automatic restore?

 

Thank you.

1 ACCEPTED SOLUTION

New Contributor

I ran into this issue myself. I was able to resolve it like this:

hadoop fs -setrep 2 /hdfs/path/to/file   # lower the replication factor so the NameNode drops an excess replica
hadoop fs -setrep 3 /hdfs/path/to/file   # raise it back so a fresh replica is copied from a healthy one

After changing the replication factor, I was able to access the file again.
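
My understanding (I haven't verified this against the HDFS source) is that lowering the replication factor makes the NameNode delete an excess replica, and raising it back forces a new copy from a healthy one, which replaces the corrupt replica. If many files are affected, a loop like this sketch should work (/hdfs/path/to/dir is a placeholder, and it assumes paths contain no spaces):

# Bounce the replication factor for every file under a directory (sketch)
for f in $(hadoop fs -ls -R /hdfs/path/to/dir | awk '$1 ~ /^-/ {print $NF}'); do
    hadoop fs -setrep 2 "$f"
    hadoop fs -setrep 3 "$f"
done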


10 REPLIES

Expert Contributor

Just to follow up: this was later determined to be caused by HDFS-11445.

The bug is fixed in CDH 5.12.2 and CDH 5.13.1 or later.