Support Questions


What could cause spark2-hdp-yarn-archive.tar.gz corruption?


hi all,

we installed a new Hadoop cluster (Ambari + HDP version 2.6.4).

after installation, we noticed a problem with spark-submit,

and we finally found that the spark2-hdp-yarn-archive.tar.gz file is corrupted.

full path - /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz (on HDFS)

my question is - what could be the reason that this file is corrupted,

even though this cluster is a fresh new installation?

Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

The file path you shared is on HDFS: /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz

To identify "corrupt" or "missing" blocks, you can use the following command-line check to see whether the file is healthy:

# su - hdfs -c "hdfs fsck /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"
.
Connecting to namenode via http://hdfcluster2.example.com:50070/fsck?ugi=hdfs&path=%2Fhdp%2Fapps%2F2.6.4.0-91%2Fspark2%2Fspark2...
FSCK started by hdfs (auth:SIMPLE) from /172.22.197.159 for path /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz at Wed Sep 05 01:51:25 UTC 2018
.Status: HEALTHY
 Total size:    189997800 B
 Total dirs:    0
 Total files:    1
 Total symlinks:        0
 Total blocks (validated):    2 (avg. block size 94998900 B)
 Minimally replicated blocks:    2 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    3.0
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        4
 Number of racks:        1
FSCK ended at Wed Sep 05 01:51:25 UTC 2018 in 35 milliseconds
The filesystem under path '/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz' is HEALTHY

HDFS will attempt to recover the situation automatically. By default there are three replicas of every block in the cluster, so if HDFS detects that one replica of a block has become corrupt or damaged, it will create a new replica of that block from a known-good replica and mark the damaged one for deletion.

The chance of all three replicas of the same block becoming damaged is so remote that it would suggest a significant failure somewhere else in the cluster. If this situation does occur, and all three replicas are damaged, then 'hdfs fsck' will report that block as "corrupt" - i.e. HDFS cannot self-heal the block from any of its replicas.
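If fsck ever does report corrupt blocks, a minimal cleanup sketch looks like the following. This is only an illustration: the `run` wrapper is a hypothetical helper that prints the command instead of executing it when the hdfs CLI is not on the PATH, so the real calls only fire on an actual cluster node. Note that `hdfs fsck -delete` removes the affected files entirely (not just the bad replicas), so only use it once you have a good source to restore from.

```shell
# Illustrative helper: execute hdfs commands where available, otherwise print them.
run() { if command -v hdfs >/dev/null 2>&1; then "$@"; else echo "would run: $*"; fi; }

# List every file that currently has at least one corrupt block:
run hdfs fsck / -list-corruptfileblocks

# After confirming you can restore the data, remove the corrupt files
# so they can be re-created from a good source:
run hdfs fsck / -delete
```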


There are also some articles you can refer to for fixing "under-replicated blocks", such as:
https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html

How to fix missing/corrupted/under or over-replicated blocks?
https://community.hortonworks.com/content/supportkb/49106/how-to-fix-missingcorruptedunder-or-over-r...
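In practice, the quickest fix for a corrupt archive like this is to rebuild it from the local Spark jars and re-upload it to HDFS. A rough sketch follows; the jars directory and HDFS destination are the usual HDP 2.6.4 locations and may differ on your cluster, and for demonstration the archive here is built from a scratch directory rather than the real jars path:

```shell
# Stand-in for /usr/hdp/2.6.4.0-91/spark2/jars (assumption: adjust for your cluster).
JARS_DIR=$(mktemp -d)
echo "demo" > "$JARS_DIR/demo.jar"

ARCHIVE=/tmp/spark2-hdp-yarn-archive.tar.gz

# 1. Re-create the archive from the jars directory.
tar -zcf "$ARCHIVE" -C "$JARS_DIR" .

# 2. Verify the new archive is readable before uploading.
gzip -t "$ARCHIVE" && tar -tzf "$ARCHIVE" > /dev/null && echo "archive OK"

# 3. On the cluster, replace the corrupt HDFS copy (not run here):
#    su - hdfs -c "hdfs dfs -put -f $ARCHIVE /hdp/apps/2.6.4.0-91/spark2/"
```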



6 REPLIES


@Michael Bronson

What kind of corruption is it? Is the file incomplete, or smaller than it should be?


I can't tell you exactly, but after I re-created the tar archive, that solved my problem.

Michael-Bronson



@Jay, very nice solution.

until now I was doing this, in order to verify the file:

gzip -t /var/tmp/spark2-hdp-yarn-archive.tar.gz
gunzip -c /var/tmp/spark2-hdp-yarn-archive.tar.gz | tar t > /dev/null
tar tzvf spark2-hdp-yarn-archive.tar.gz > /dev/null
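The three checks above can be wrapped into one small helper, e.g. the sketch below. The function name `verify_archive` is hypothetical, and the demo builds a throwaway archive as a stand-in for spark2-hdp-yarn-archive.tar.gz:

```shell
# Returns 0 only if the archive is both a valid gzip stream and a readable tar.
verify_archive() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" > /dev/null 2>&1
}

# Demo on a throwaway archive:
TMP=$(mktemp -d)
echo "demo" > "$TMP/demo.jar"
tar -zcf "$TMP/archive.tar.gz" -C "$TMP" demo.jar

verify_archive "$TMP/archive.tar.gz" && echo "archive OK"

# A truncated copy should fail the check:
head -c 10 "$TMP/archive.tar.gz" > "$TMP/broken.tar.gz"
verify_archive "$TMP/broken.tar.gz" || echo "broken archive detected"
```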
Michael-Bronson


@Jay, on a separate note -

this is a different case, but I posted a thread yesterday - https://community.hortonworks.com/questions/217423/spark-application-communicating-with-driver-in-he... - can you help me with this?

Michael-Bronson


@Jay, please let me know if I understand it correctly:

let's say that one of the replicas of spark2-hdp-yarn-archive.tar.gz is corrupted

when I run this CLI - su - hdfs -c "hdfs fsck /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"

does it actually mean that fsck will replace the bad replica with a good one, and the status will finally be HEALTHY?

Michael-Bronson