Cannot get a file on HDFS because of "java.lang.ArrayIndexOutOfBoundsException"

Explorer

Hello.

A MapReduce job could not start because one of its files was unreadable.

When I tried to access the file, I got the following error:

org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): java.lang.ArrayIndexOutOfBoundsException

    at org.apache.hadoop.ipc.Client.call(Client.java:1466)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy11.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:559)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy12.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2080)
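
Judging from the getListing frames in the trace, I assume even a plain directory listing hits the same error (directory taken from the fsck output below):

hdfs dfs -ls /services/chikayo-dsp-bidder/click/hive/day=20170403   # directory assumed from the fsck output below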

My actions:

(1) sudo -u hdfs hdfs fsck /

fsck stops just before the problem file, and the overall result is "FAILED":

/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd01.sv.infra.log 244412 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd02.sv.infra.log 282901 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd03.sv.infra.log 280334 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/14.fluentd01.sv.infra.log 258240 bytes, 1 block(s):  OK
FSCK ended at Mon Apr 03 18:16:08 JST 2017 in 3074 milliseconds
null


Fsck on path '/services/chikayo-dsp-bidder' FAILED
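
As a side note, fsck's standard options can also list just the corrupt files and show block locations. This is a generic sketch, not something I ran here:

# generic fsck options, not specific to this cluster
sudo -u hdfs hdfs fsck /services/chikayo-dsp-bidder -list-corruptfileblocks    # list only the corrupt files
sudo -u hdfs hdfs fsck /services/chikayo-dsp-bidder -files -blocks -locations  # per-file block and DataNode detail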

(2) sudo -u hdfs hdfs dfsadmin -report

Configured Capacity: 92383798755328 (84.02 TB)
Present Capacity: 89209585066072 (81.14 TB)
DFS Remaining: 19736633480052 (17.95 TB)
DFS Used: 69472951586020 (63.19 TB)
DFS Used%: 77.88%
Under replicated blocks: 0
Blocks with corrupt replicas: 2
Missing blocks: 0
Missing blocks (with replication factor 1): 0
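
While waiting, a simple grep over the same report is enough to watch the corrupt-replica count (this just filters the output shown above):

sudo -u hdfs hdfs dfsadmin -report | grep -i corrupt   # filters the "Blocks with corrupt replicas" line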

Now the problem file has been restored automatically, and "Blocks with corrupt replicas" is back to 0.

 

Questions:

(1) Can I restore a corrupted file like this manually?

(2) What triggers the automatic restore?

Thank you.

1 ACCEPTED SOLUTION

New Contributor

I ran into this issue myself. I was able to resolve it like this:

hadoop fs -setrep 2 /hdfs/path/to/file
hadoop fs -setrep 3 /hdfs/path/to/file

After changing the replication factor, I was able to access the file again.
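
My understanding (an assumption on my part, not something I verified in the HDFS source) is that lowering the replication factor makes the NameNode drop the excess replica, which tends to be the corrupt one, and raising it back triggers re-replication from a healthy copy. You can check the resulting replication factor with the standard stat format string:

hadoop fs -stat %r /hdfs/path/to/file   # %r prints the file's replication factor; same placeholder path as above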


10 REPLIES

Expert Contributor

Just to follow up: this was later determined to be caused by HDFS-11445.

The bug is fixed in CDH 5.12.2, in CDH 5.13.1, and in later releases.
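
You can confirm which build a cluster is actually running with the standard client command:

hadoop version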