Created 12-08-2016 01:26 AM
Hello,
I installed HDP-2.5.3 recently and have an issue with MSCK REPAIR TABLE. I migrated data from another cluster and created an external Hive table on top of it. The command completed successfully with up to 170 data directories, and quickly too, in about 3 seconds. However, when I tried it with 190 or more data directories, it hung until I killed it a few hours later.
I looked at hivemetastore.log and found that nothing more was logged after the call to 'get_partitions'.
I tested it with far more directories (thousands) on another cluster where HDP-2.4.X is installed, and it worked without any problem. On that cluster, 'get_partitions_with_auth' was called instead of 'get_partitions'. I compared the configs of the two clusters one by one in Ambari but don't see any difference.
Does anyone have any ideas?
Created 02-02-2017 07:19 PM
We checked out the Hive code and removed the patch that was causing this issue (the top two commits affecting org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java). The issue was resolved after that.
Created 06-16-2017 09:18 AM
In my case, this does not help.
Created 02-03-2017 05:06 AM
@PJ - Would it be possible to share the hive.log when you observed this?
Created 06-16-2017 07:54 AM
I have the same problem too. Migrating to HDP 2.5 was fatal for our heavy MSCK REPAIR TABLE processing (which we use to make a partitioned external table usable). hive.log is empty when it happens. Without any other solution, I'll have to write my own version of the tool... Sad to come to that end.
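For anyone considering the same "own version of the tool" route, here is a minimal sketch of one possible approach: turn a list of partition directories into batched ALTER TABLE ... ADD IF NOT EXISTS PARTITION statements and run those instead of MSCK REPAIR TABLE. This is not what MSCK does internally and has not been tested on HDP 2.5. The function names, the table name, and the batch size are all hypothetical. It assumes directories follow the key=value layout, that the directory listing is obtained separately, and that partition values contain no escaped characters that would need decoding.

```python
from typing import Iterable, List


def partition_spec(relative_dir: str) -> str:
    """Convert 'year=2016/month=12' into "year='2016', month='12'"."""
    parts = []
    for segment in relative_dir.strip("/").split("/"):
        key, sep, value = segment.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"not a key=value partition directory: {segment!r}")
        # Assumption: values are passed as quoted string literals.
        escaped = value.replace("'", "\\'")
        parts.append(f"{key}='{escaped}'")
    return ", ".join(parts)


def add_partition_statements(
    table: str,
    table_location: str,
    relative_dirs: Iterable[str],
    batch_size: int = 50,  # hypothetical default, tune for your metastore
) -> List[str]:
    """Build ALTER TABLE statements, batch_size partitions per statement."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    base = table_location.rstrip("/")
    clauses = []
    for directory in relative_dirs:
        rel = directory.strip("/")
        clauses.append(f"PARTITION ({partition_spec(rel)}) LOCATION '{base}/{rel}'")
    statements = []
    for start in range(0, len(clauses), batch_size):
        chunk = " ".join(clauses[start:start + batch_size])
        statements.append(f"ALTER TABLE {table} ADD IF NOT EXISTS {chunk};")
    return statements


if __name__ == "__main__":
    # Hypothetical table and paths, for illustration only.
    for stmt in add_partition_statements(
        "db.events",
        "hdfs:///data/events",
        ["year=2016/month=12", "year=2017/month=01"],
    ):
        print(stmt)
```

The generated statements could be written to a file and run with your usual Hive client. Keeping batches small means a failure or hang affects only one statement rather than the whole repair.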