MSCK REPAIR TABLE hangs when the HDFS directory of the target table has more than a certain number of subdirectories
- Labels: Apache Hive
Created ‎12-08-2016 01:26 AM
Hello,
I recently installed HDP-2.5.3 and have an issue with MSCK REPAIR TABLE. I migrated data from another cluster and created an external Hive table on top of it. With up to 170 data directories the command completed successfully, and quickly, in about 3 seconds. However, with 190 or more data directories it hung, and I had to kill it after a few hours.
I looked at hivemetastore.log and found that it made no further progress after 'get_partitions' was called.
I tested it with far more directories (thousands) on another cluster running HDP-2.4.X, and it worked without any problem. There, 'get_partitions_with_auth' was called instead of 'get_partitions'. I compared the two clusters' configs one by one in Ambari but did not see any difference.
Does anyone have any ideas?
Created ‎02-02-2017 07:19 PM
We checked out the Hive code and removed the patch that was causing this issue (the top two commits affecting org/apache/hadoop/hive/ql/metadata/HiveMetaStoreChecker.java). That resolved the issue.
Created ‎12-08-2016 01:34 AM
Add a condition and give it a try; I hope that helps.
Ex:
alter table abc DROP PARTITION (date >= '2015-09-01');
Created ‎12-08-2016 01:51 AM
Yeah, I tried it and it worked, although I have not yet tried it with more than 170 directories. However, I really need to use MSCK REPAIR TABLE rather than ALTER TABLE so that I can add thousands of partitions at a time.
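Editor's note: for readers who have to fall back on ALTER TABLE while MSCK REPAIR TABLE hangs, the following is a minimal sketch of how batched ADD PARTITION statements could be generated from a list of partition directories. It is not taken from the thread. The function names, the `/data/abc` path, and the batch size are hypothetical, and it assumes that directories follow the `key=value` layout, that values need no unescaping, and that every value can be passed as a quoted string. Column names are backtick-quoted because names such as `date` are reserved words in recent Hive versions.

```python
from typing import Iterable, Iterator


def partition_spec(table_root: str, partition_dir: str) -> str:
    """Turn '<root>/k1=v1/k2=v2' into "`k1`='v1', `k2`='v2'"."""
    root = table_root.rstrip("/") + "/"
    path = partition_dir.rstrip("/")
    if not path.startswith(root):
        raise ValueError(f"{partition_dir!r} is not under {table_root!r}")
    pairs = []
    for segment in path[len(root):].split("/"):
        key, sep, value = segment.partition("=")
        if not sep or not key:
            raise ValueError(f"not a key=value directory: {partition_dir!r}")
        # Escape backslashes and single quotes inside the string literal.
        escaped = value.replace("\\", "\\\\").replace("'", "\\'")
        pairs.append(f"`{key}`='{escaped}'")
    return ", ".join(pairs)


def add_partition_statements(
    table: str,
    table_root: str,
    partition_dirs: Iterable[str],
    batch_size: int = 100,  # hypothetical default; tune for your metastore
) -> Iterator[str]:
    """Yield one ALTER TABLE ... ADD statement per batch of directories."""
    dirs = sorted(partition_dirs)
    for start in range(0, len(dirs), batch_size):
        clauses = [
            f"PARTITION ({partition_spec(table_root, d)}) LOCATION '{d}'"
            for d in dirs[start:start + batch_size]
        ]
        yield (
            f"ALTER TABLE {table} ADD IF NOT EXISTS\n  "
            + "\n  ".join(clauses)
            + ";"
        )


if __name__ == "__main__":
    # Hypothetical directory list; in practice it might come from `hdfs dfs -ls`.
    example_dirs = [f"/data/abc/date=2015-09-{day:02d}" for day in range(1, 4)]
    for statement in add_partition_statements("abc", "/data/abc", example_dirs, batch_size=2):
        print(statement)
```

The printed statements could then be saved to a file and run with the Hive CLI. Keeping each batch small limits how much work a single metastore call has to do, but this only sidesteps the hang and does not explain it.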
Created ‎12-10-2016 12:43 AM
Can you capture the "jstack" output of the Hive CLI while it is stuck and share it here? It would also help if you could share hive.log.
Created ‎01-06-2017 01:56 PM
Hi,
I am facing the same issue with MSCK REPAIR TABLE after migrating to HDP-2.5.3. Has anyone had any luck with a workaround?
Thanks in advance!
Created ‎02-02-2017 03:34 PM
I am facing the same issue after upgrading from 2.4.2 to 2.5.3. Did anybody find a solution?
Created ‎02-02-2017 11:36 PM
Can you please describe the change that you made?
Created ‎02-03-2017 04:42 AM
Can you please check if "hive.mv.files.thread=0" helps in this case?
