alter table add partition took almost an hour

Expert Contributor

I have a job that runs every hour: I put a CSV file into an HDFS location and run an ALTER TABLE to add that new location as a partition. Strangely, one run took more than 50 minutes when it normally takes 5-10 seconds. I am not sure why. How do I start a root cause analysis on this?

1 ACCEPTED SOLUTION

Expert Contributor

Is this something related to yarn.scheduler.capacity.maximum-am-resource-percent?

<value>0.2</value>

This is the maximum percent of resources in the cluster that can be used to run application masters, i.e. it controls the number of concurrently running applications.

At the time this happened, I had about 5-6 jobs using the same queue. I guess this is why the AM for this job didn't get resources allocated until the rest of them finished.

So does 0.2 mean 20% per queue, or 20% across all queues together?
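
On the per-queue question: as far as the Capacity Scheduler documentation describes it, the property without a queue path is the default applied to every queue, and a single queue can override it with yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent. The sketch below is only a rough model of how that limit caps concurrent application masters. The queue size and AM container size are hypothetical numbers, and the real scheduler also applies user limits and version-specific rules that are ignored here.

```python
def max_concurrent_ams(queue_memory_mb, am_resource_percent, am_container_mb):
    """Rough estimate of how many AMs fit inside a queue's AM budget.

    Simplified model: AM budget = queue memory * maximum-am-resource-percent.
    All inputs are illustrative assumptions, not values from this cluster.
    """
    am_budget_mb = round(queue_memory_mb * am_resource_percent)
    return am_budget_mb // am_container_mb


# Hypothetical queue with 100 GB of memory and 4 GB application masters.
QUEUE_MEMORY_MB = 100 * 1024
AM_CONTAINER_MB = 4096

for percent in (0.1, 0.2, 0.5):
    ams = max_concurrent_ams(QUEUE_MEMORY_MB, percent, AM_CONTAINER_MB)
    print(f"maximum-am-resource-percent={percent}: about {ams} concurrent AMs")
```

Under these assumed numbers, a setting of 0.2 leaves room for about five application masters, so a sixth job submitted to the same queue would wait for one of them to finish. In this model, lowering the value shrinks the AM budget and raising it enlarges the budget.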

7 REPLIES

Contributor

@PJ

Did you try running explain extended on $cmd?

If possible, repair the table once with msck repair table.

Expert Contributor

@Vikas Srivastava

Thanks for the reply. My table is huge, and msck just hangs.

Also, I see that although the job started at 20:28, the container didn't launch until 20:55, and I don't see any logs.

What does explain extended do? How can I use debug mode for a single query without actually changing the configuration?

FinalStatus Reported by AM:SUCCEEDED
Started:Sat Mar 03 20:21:14 -0500 2018
Elapsed:34mins, 44sec

Log Type: launch_container.sh
Log Upload Time: Sat Mar 03 20:55:59 -0500 2018
Log Length: 9051
Showing 4096 bytes of 9051 total. Click here for the full log.

I don't see anything in the logs. This happened twice during the last 2 days.
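
As a sanity check on the timestamps quoted above, the reported start time plus the elapsed time can be compared with the log upload time. This is a minimal sketch that uses only the three values pasted in this post, and it assumes an English locale for parsing the day and month names.

```python
from datetime import datetime, timedelta

# Timestamp layout as it appears in the pasted YARN output.
FMT = "%a %b %d %H:%M:%S %z %Y"

started = datetime.strptime("Sat Mar 03 20:21:14 -0500 2018", FMT)
elapsed = timedelta(minutes=34, seconds=44)
log_uploaded = datetime.strptime("Sat Mar 03 20:55:59 -0500 2018", FMT)

# When the application finished, according to start time plus elapsed time.
finished = started + elapsed

# Gap between the application finishing and the log being uploaded.
gap = log_uploaded - finished

print("finished:", finished.strftime("%H:%M:%S"))
print("upload gap in seconds:", gap.total_seconds())
```

The computed finish time falls within a second or so of the log upload time. That would be consistent with the logs being aggregated when the application ended, so the upload timestamp on its own probably does not show when the container actually started.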

Contributor

Hi @PJ

As you said, the containers did not launch until 8:55, which means the job was not getting the resources it needed to start. The fact that other jobs were already running in the same queue supports this as well. Try decreasing the value to 0.1.

Expert Contributor

@Vikas Srivastava

Yeah, but I thought I had to increase the value so that more resources are available to launch more AMs.

Expert Contributor

This may be related to HIVE-13901. Depending on the file system, MSCK and ADD PARTITION can be slow.

Can you try setting "hive.fetch.task.conversion=none"?
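
One way to try a setting like this for a single run, without touching the cluster configuration, is to pass it on the Hive CLI with --hiveconf. The sketch below only builds and prints such a command line. It assumes the classic hive CLI is in use (Beeline takes different options), and the table name, partition column and path are hypothetical placeholders. The hive.root.logger override is included because it is the usual CLI way to get DEBUG output on the console for one invocation.

```python
import shlex


def build_hive_command(statement, overrides):
    """Build a hive CLI argument list with per-invocation config overrides."""
    cmd = ["hive"]
    for key, value in overrides.items():
        # Each --hiveconf applies only to this invocation.
        cmd += ["--hiveconf", f"{key}={value}"]
    cmd += ["-e", statement]
    return cmd


# Overrides for this one run only; nothing is written to hive-site.xml.
overrides = {
    "hive.fetch.task.conversion": "none",
    "hive.root.logger": "DEBUG,console",
}

# Hypothetical table, partition column and location, for illustration only.
statement = (
    "ALTER TABLE my_table ADD PARTITION (dt='2018-03-03') "
    "LOCATION '/data/my_table/dt=2018-03-03'"
)

cmd = build_hive_command(statement, overrides)
print(shlex.join(cmd))  # paste the printed line into a shell to run it
```

Inside an interactive session, the same effect for the fetch setting can be had by running set hive.fetch.task.conversion=none; before the statement.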

avatar

"i put a csv file into hdfs location and do an alter table to add that new location to the partition". Can you please explain this operation?