msck repair table bad behaviour

Hello,

My client is asking me for a way to back up Hive tables to tape. I know this is not "big-data style", but it is mandatory for them, so I need to accommodate it.

 

I found a way to do this, but restoring requires the following procedure:

- create the table using the DDL previously backed up via the "show create table" statement;

- mv the files into the warehouse dir/db/table directory just created;

- run msck repair table on that table.
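
The three restore steps above can be sketched as a small Python helper that only builds the commands, so they can be reviewed before anything is run. This is a rough sketch under assumptions that are not in the original post: the `hive` CLI is used (on some clusters it would be `beeline`), the files restored from tape are already staged in an HDFS directory, and the function name, DDL file name, and example paths are hypothetical.

```python
import shlex
from typing import List


def restore_commands(ddl_file: str, staging_dir: str,
                     table_dir: str, table_name: str) -> List[List[str]]:
    """Build (but do not run) the commands for the three restore steps.

    All arguments are hypothetical examples:
      ddl_file    - file holding the saved "show create table" output
      staging_dir - HDFS directory where the restored files were staged
      table_dir   - warehouse directory of the table just created
      table_name  - table name, optionally qualified as db.table
    """
    return [
        # Step 1: recreate the table from the saved DDL.
        ["hive", "-f", ddl_file],
        # Step 2: move the restored partition folders under the table directory.
        ["hdfs", "dfs", "-mv", staging_dir.rstrip("/") + "/*", table_dir],
        # Step 3: ask Hive to register the partitions it finds on HDFS.
        ["hive", "-e", "MSCK REPAIR TABLE " + table_name],
    ]


if __name__ == "__main__":
    for cmd in restore_commands("my_table.sql", "/staging/my_table",
                                "/user/hive/warehouse/my_db.db/my_table",
                                "my_db.my_table"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the paths; each list could then be passed to `subprocess.run(cmd, check=True)` once it looks right.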

 

The command works without error; however, I found that the original table has about 111 million records, while the restored table has only 37 million.

I compared the HDFS size of the two table folders and they are the same.

I compared the number of partitions of the two tables and they are the same.

I tried running msck repair once again (just in case), but the result doesn't change.

So I think the problem must be in the msck command: the files are in place, but somehow it skips some of them during the repair.

 

What do you think?


Bye

Omar

1 ACCEPTED SOLUTION

I resolved the problem on my own. I just want to point out that this strange behaviour was caused by an inconsistency in the data.

At some point in time, the layout of the partitioned data went from "table_folder/one_partition/another_partition" to "table_folder/another_partition/one_partition".

This caused msck repair to miss data: it only aligned the metastore with the latter partition layout.
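
A mixed layout like this can be spotted before running msck repair by looking at the order of the partition keys in each file path. The following is a minimal sketch, assuming the partition directories use the usual `key=value` naming and that you already have a list of file paths under the table folder (for example from a recursive HDFS listing); the function names and example paths are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def partition_key_order(file_path: str, table_root: str) -> Tuple[str, ...]:
    """Return the partition keys of a file path, in directory order.

    Only the directories between the table root and the file name are
    inspected, and only those named like "key=value".
    """
    root = table_root.rstrip("/")
    if not file_path.startswith(root + "/"):
        raise ValueError("path is not under the table root: " + file_path)
    dirs = file_path[len(root) + 1:].split("/")[:-1]  # drop the file name
    return tuple(d.split("=", 1)[0] for d in dirs if "=" in d)


def layout_counts(file_paths: Iterable[str], table_root: str) -> Counter:
    """Count how many files use each partition key order."""
    return Counter(partition_key_order(p, table_root) for p in file_paths)


def mismatched_paths(file_paths: Iterable[str], table_root: str,
                     expected: Sequence[str]) -> List[str]:
    """List the files whose key order differs from the table's DDL order."""
    want = tuple(expected)
    return [p for p in file_paths
            if partition_key_order(p, table_root) != want]


if __name__ == "__main__":
    root = "/warehouse/my_db.db/my_table"
    paths = [
        root + "/year=2020/month=01/000000_0",
        root + "/month=02/year=2020/000000_0",  # inverted order
    ]
    print(layout_counts(paths, root))
    print(mismatched_paths(paths, root, ["year", "month"]))
```

If `layout_counts` reports more than one key order, the table folder contains files that do not match the partition order declared in the table's DDL.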

 

At the moment I don't know what caused the inversion; I asked the dev team and they don't know either.

By the way, fixing this problem (by recreating the table with the partition order corrected) let msck repair work correctly.

 

Bye

Omar
