
What could be the cause of spark2-hdp-yarn-archive.tar.gz corruption?

hi all,

we installed a new Hadoop cluster (Ambari + HDP version 2.6.4)

after installation, we noticed that we have a problem with spark-submit

and finally we found that the spark2-hdp-yarn-archive.tar.gz file is corrupted

full path - /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz (on HDFS)

my question is - what could be the reason that this file is corrupted?

even though this cluster is a new, fresh installation

Michael-Bronson
1 ACCEPTED SOLUTION

Super Mentor

@Michael Bronson

Since the file path you shared is on HDFS (/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz), you can run the following command to identify "corrupt" or "missing" blocks and check whether the file is healthy:

# su - hdfs -c "hdfs fsck /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"
Connecting to namenode via http://hdfcluster2.example.com:50070/fsck?ugi=hdfs&path=%2Fhdp%2Fapps%2F2.6.4.0-91%2Fspark2%2Fspark2...
FSCK started by hdfs (auth:SIMPLE) from /172.22.197.159 for path /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz at Wed Sep 05 01:51:25 UTC 2018
.Status: HEALTHY
 Total size:    189997800 B
 Total dirs:    0
 Total files:    1
 Total symlinks:        0
 Total blocks (validated):    2 (avg. block size 94998900 B)
 Minimally replicated blocks:    2 (100.0 %)
 Over-replicated blocks:    0 (0.0 %)
 Under-replicated blocks:    0 (0.0 %)
 Mis-replicated blocks:        0 (0.0 %)
 Default replication factor:    3
 Average block replication:    3.0
 Corrupt blocks:        0
 Missing replicas:        0 (0.0 %)
 Number of data-nodes:        4
 Number of racks:        1
FSCK ended at Wed Sep 05 01:51:25 UTC 2018 in 35 milliseconds
The filesystem under path '/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz' is HEALTHY

HDFS will attempt to recover the situation automatically. By default there are three replicas of any block in the cluster, so if HDFS detects that one replica of a block has become corrupt or damaged, it will create a new replica of that block from a known-good replica and mark the damaged one for deletion.

The chance of all three replicas of the same block becoming damaged is so remote that it would suggest a significant failure somewhere else in the cluster. If this situation does occur, and all three replicas are damaged, then 'hdfs fsck' will report that block as "corrupt" - i.e. HDFS cannot self-heal the block from any of its replicas.
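As an aside, if you suspect a particular replica, fsck can also print each block together with the DataNodes holding its replicas. A sketch (the path is the one from this thread; the commands only work on a cluster node where the `hdfs` CLI is present):

```shell
#!/bin/sh
# Sketch: show which DataNodes hold each block replica of the archive.
# The path matches this thread; adjust for your HDP version.
ARCHIVE="/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"

if command -v hdfs >/dev/null 2>&1; then
    # -files -blocks -locations lists every block and its replica locations
    su - hdfs -c "hdfs fsck $ARCHIVE -files -blocks -locations"
else
    echo "hdfs CLI not found; run this on a cluster node"
fi
```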


There are also some articles you can refer to for fixing "Under-replicated Blocks", such as:
https://community.hortonworks.com/articles/4427/fix-under-replicated-blocks-in-hdfs-manually.html

How to fix missing/corrupted/under or over-replicated blocks?
https://community.hortonworks.com/content/supportkb/49106/how-to-fix-missingcorruptedunder-or-over-r...


6 REPLIES

@Michael Bronson

What kind of corruption is that? Is the file incomplete, or smaller than it should be?

I can't tell you exactly, but after I created the tar again from the files, this solved my problem
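For anyone who hits the same thing, the re-tar fix can be sketched roughly like this. The jars directory and the HDFS destination are assumptions that vary per cluster, so the demo builds from a scratch directory and leaves the upload step as a comment:

```shell
#!/bin/sh
set -e
# Sketch of the re-tar fix. On a real node JARS_DIR would be the Spark2 jars
# directory (e.g. /usr/hdp/current/spark2-client/jars -- an assumption, check
# your install); here a scratch dir stands in so the steps run anywhere.
JARS_DIR="$(mktemp -d)"
echo "dummy" > "$JARS_DIR/example.jar"       # stand-in for the real jar files

OUT="$(mktemp -d)/spark2-hdp-yarn-archive.tar.gz"
tar -C "$JARS_DIR" -czf "$OUT" .             # rebuild the archive
gzip -t "$OUT"                               # check gzip integrity
tar -tzf "$OUT" > /dev/null                  # check the tar structure
echo "archive rebuilt and verified: $OUT"

# On a cluster node you would then push it back to HDFS, e.g.:
# su - hdfs -c "hdfs dfs -put -f $OUT /hdp/apps/2.6.4.0-91/spark2/"
```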

Michael-Bronson


@Jay , very nice solution

until now I was doing this, in order to verify the file:

gzip -t /var/tmp/spark2-hdp-yarn-archive.tar.gz
gunzip -c /var/tmp/spark2-hdp-yarn-archive.tar.gz | tar t > /dev/null
tar tzvf spark2-hdp-yarn-archive.tar.gz > /dev/null
Michael-Bronson
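(Those three checks can also be folded into one small helper; a sketch, demonstrated on scratch files rather than the real archive:)

```shell
#!/bin/sh
# Sketch: wrap the gzip/tar integrity checks from this thread into one helper
# that reports OK or CORRUPT for any .tar.gz file.
verify_archive() {
    if gzip -t "$1" 2>/dev/null && tar -tzf "$1" > /dev/null 2>&1; then
        echo "$1: OK"
    else
        echo "$1: CORRUPT"
    fi
}

# Demo on scratch files (the real target would be the archive pulled from
# HDFS, e.g. /var/tmp/spark2-hdp-yarn-archive.tar.gz):
tmp="$(mktemp -d)"
echo "data" > "$tmp/file.txt"
tar -C "$tmp" -czf "$tmp/good.tar.gz" file.txt
head -c 10 "$tmp/good.tar.gz" > "$tmp/bad.tar.gz"   # deliberately truncated

verify_archive "$tmp/good.tar.gz"    # OK
verify_archive "$tmp/bad.tar.gz"     # CORRUPT
```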

@Jay , on another note

this is a different case - I posted the thread yesterday at https://community.hortonworks.com/questions/217423/spark-application-communicating-with-driver-in-he... , can you help me with this?

Michael-Bronson

@Jay , please let me know if I understand it as follows

let's say that one of the replicas of spark2-hdp-yarn-archive.tar.gz is corrupted

when I run this CLI: su - hdfs -c "hdfs fsck /hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz"

does it actually mean that fsck will replace the bad replica with the good one, and the status will finally be HEALTHY?

Michael-Bronson