Archives of Support Questions (Read Only)

This board is archived and read-only for historical reference. Information and links may no longer be available or relevant. To ask a new question, please post a new topic on the appropriate active board.

Cannot get a file on HDFS because of "java.lang.ArrayIndexOutOfBoundsException"

Explorer

Hello,

A MapReduce job could not start because one of its input files could not be read.

When I tried to access the file, the following error occurred:

org.apache.hadoop.ipc.RemoteException(java.lang.ArrayIndexOutOfBoundsException): java.lang.ArrayIndexOutOfBoundsException

    at org.apache.hadoop.ipc.Client.call(Client.java:1466)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy11.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:559)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy12.getListing(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:2080)


 

My actions:

(1) sudo -u hdfs hdfs fsck /

fsck stopped just before the problem file, and the overall result was "FAILED":

/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd01.sv.infra.log 244412 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd02.sv.infra.log 282901 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/13.fluentd03.sv.infra.log 280334 bytes, 1 block(s):  OK
/services/chikayo-dsp-bidder/click/hive/day=20170403/14.fluentd01.sv.infra.log 258240 bytes, 1 block(s):  OK
FSCK ended at Mon Apr 03 18:16:08 JST 2017 in 3074 milliseconds
null


Fsck on path '/services/chikayo-dsp-bidder' FAILED
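
For anyone trying to narrow this down, fsck also accepts more verbose options (these are standard Apache Hadoop fsck flags; the path below is just the one from this thread), for example:

sudo -u hdfs hdfs fsck /services/chikayo-dsp-bidder -files -blocks -locations
sudo -u hdfs hdfs fsck / -list-corruptfileblocks

The first command prints each file under the path with its block IDs and replica locations, so the last file printed before the failure points at the entry whose listing triggers the exception. The second asks the NameNode directly for any files with corrupt blocks.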

(2) sudo -u hdfs hdfs dfsadmin -report

Configured Capacity: 92383798755328 (84.02 TB)
Present Capacity: 89209585066072 (81.14 TB)
DFS Remaining: 19736633480052 (17.95 TB)
DFS Used: 69472951586020 (63.19 TB)
DFS Used%: 77.88%
Under replicated blocks: 0
Blocks with corrupt replicas: 2
Missing blocks: 0
Missing blocks (with replication factor 1): 0

Now the problem file has been restored automatically, and "Blocks with corrupt replicas" is back to 0.
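
For reference, one way to watch whether the NameNode still reports corrupt replicas is to filter the standard report output (plain shell, nothing CDH-specific):

sudo -u hdfs hdfs dfsadmin -report | grep -i corrupt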

 

Questions:

(1) Can I restore such a corrupted file manually?

(2) What triggers the automatic restore?

Thank you.

1 ACCEPTED SOLUTION

Visitor

I ran into this issue myself. I was able to resolve it like this:

hadoop fs -setrep 2 /hdfs/path/to/file
hadoop fs -setrep 3 /hdfs/path/to/file

After changing the replication factor, I was able to access the file again.
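
My guess at why this works (the mechanism is not confirmed in this thread, so treat it as a hypothesis): lowering the replication factor makes the NameNode discard one replica, and raising it back forces re-replication from a remaining healthy copy, which replaces the corrupt one. The standard -w flag makes each step wait until the replication change has actually completed:

hadoop fs -setrep -w 2 /hdfs/path/to/file
hadoop fs -setrep -w 3 /hdfs/path/to/file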


10 REPLIES

Expert Contributor

Just to follow up: this was later determined to be caused by HDFS-11445.

The bug is fixed in CDH 5.12.2 and in CDH 5.13.1 or later.
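
If you are not sure whether your cluster already includes the fix, the standard hadoop version command prints the exact build you are running:

hadoop version

Any build reporting CDH 5.12.2 / CDH 5.13.1 or later should include the HDFS-11445 fix, per the note above.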