
msck repair table bad behaviour


Contributor

Hello,

My client has asked me for a way to back up Hive tables to tape. I know this is not "big-data style", but it is mandatory for them, so I need to accommodate it.

 

I found a way to do this, but restoring a table requires the following procedure:

- create the table using the DDL previously backed up via the "show create table" statement;

- mv the files into the warehouse dir/db/table directory just created;

- run msck repair table on that table.
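The three restore steps above can be sketched as a small Python helper that only builds the commands, in order, without running anything. This is a rough sketch under assumptions, not a tested restore tool: the function name `restore_plan`, its parameters and the example paths are hypothetical, and it assumes the Hive CLI (`hive -f`, `hive -e`) and `hdfs dfs -mv` are the tools in use.

```python
import shlex


def restore_plan(ddl_file, backup_dir, table_location, qualified_table):
    """Return the restore steps as shell command strings, in order.

    Nothing is executed here; the caller decides how to run the commands.
    All four arguments are hypothetical names chosen for this sketch.
    """
    # Quote the glob so that HDFS, not the local shell, expands it.
    source_glob = shlex.quote(backup_dir.rstrip("/") + "/*")
    repair_statement = shlex.quote("MSCK REPAIR TABLE " + qualified_table)
    return [
        # 1. Recreate the table from the DDL saved with SHOW CREATE TABLE.
        "hive -f " + shlex.quote(ddl_file),
        # 2. Move the restored files under the directory of the new table.
        "hdfs dfs -mv " + source_glob + " " + shlex.quote(table_location),
        # 3. Register the partition directories in the metastore.
        "hive -e " + repair_statement,
    ]


if __name__ == "__main__":
    for command in restore_plan(
        "/tmp/sales.sql", "/restore/sales", "/warehouse/mydb.db/sales", "mydb.sales"
    ):
        print(command)
```

Passing the table location explicitly avoids guessing how the warehouse directory is laid out on a given cluster.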

 

The command completes without error; however, the original table has about 111 million records, while the restored table has only 37 million.

I compared the hdfs size of the folder and they are the same.

I compared the number of partitions of the table and they are the same.

I tried to run msck repair once again (just in case), but the result doesn't change.

So I think the problem must be in the msck command: the files are in place, but somehow it skips some of them when repairing the table.

 

What do you think?


Bye

Omar

1 ACCEPTED SOLUTION

Re: msck repair table bad behaviour

Contributor

I resolved the problem on my own. I just want to point out that this strange behaviour was caused by an inconsistency in the data.

At some point in time, the partition directory layout changed from "table_folder/one_partition/another_partition" to "table_folder/another_partition/one_partition".

Because of this, the msck repair command gave an incomplete result: it only aligned the metastore with the data stored under the latter partition layout.

 

At the moment I don't know what caused the inversion; I asked the dev team and they don't know either.

By the way, fixing this problem (by recreating the table with the partition order corrected) allowed msck repair to work correctly.
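A mixed layout like the one described above can be spotted before running msck repair by checking the order of the partition keys in each directory path. The following is a minimal sketch, assuming Hive-style `key=value` directory names and paths given relative to the table folder; the function name `find_misordered_partitions` and the example columns `year` and `month` are hypothetical.

```python
def find_misordered_partitions(partition_paths, partition_columns):
    """Return the paths whose key=value order differs from partition_columns.

    partition_paths: partition directories relative to the table folder,
                     e.g. "year=2019/month=01" (assumed key=value naming).
    partition_columns: partition column names in the order declared in the DDL.
    """
    expected = list(partition_columns)
    misordered = []
    for path in partition_paths:
        # Keep only the key=value components and read the keys in path order.
        keys = [
            part.split("=", 1)[0]
            for part in path.strip("/").split("/")
            if "=" in part
        ]
        if keys != expected:
            misordered.append(path)
    return misordered


if __name__ == "__main__":
    paths = ["year=2019/month=01", "month=02/year=2019", "year=2019/month=03/"]
    for bad_path in find_misordered_partitions(paths, ["year", "month"]):
        print("unexpected partition order:", bad_path)
```

The list of paths could come from a recursive listing of the table folder; how that listing is produced is left out of this sketch.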

 

Bye

Omar
