Created 03-03-2018 09:09 PM
I have a job that runs every hour: I put a CSV file into an HDFS location and do an ALTER TABLE to add that new location to the partition. Weirdly, one run took more than 50 minutes when it normally takes 5-10 seconds. I am not sure why. How do I start root cause analysis on this?
Created 03-04-2018 08:18 PM
Is this something related to yarn.scheduler.capacity.maximum-am-resource-percent?
<value>0.2</value>
This is the maximum percent of resources in the cluster that can be used to run application masters, i.e. it controls the number of concurrently running applications.
At the time this happened, I had about 5-6 jobs using the same queue. I guess this is why the AM for this job didn't get resources allocated until the rest of them finished.
So does 0.2 mean 20% per queue, or 20% across all queues together?
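To make the effect of this setting concrete, here is a rough sketch of the arithmetic, not the scheduler's actual code. It assumes the AM limit for a queue scales with that queue's share of the cluster, and it ignores user limits and other factors the Capacity Scheduler also applies. The cluster size, queue capacity, and AM container size below are hypothetical numbers, not values from this cluster.

```python
def am_limit_mb(cluster_mb, queue_capacity, max_am_percent):
    """Approximate memory (MB) a queue may spend on application masters.

    Simplifying assumption: limit = cluster memory * queue share * AM percent.
    """
    return round(cluster_mb * queue_capacity * max_am_percent)


def max_concurrent_ams(limit_mb, am_container_mb):
    """How many AM containers of a given size fit under the limit."""
    return limit_mb // am_container_mb


# Hypothetical cluster: 100 GB total, queue owns 50%, each AM needs 2 GB.
CLUSTER_MB = 102400
QUEUE_CAPACITY = 0.5
AM_CONTAINER_MB = 2048

for percent in (0.1, 0.2, 0.4):
    limit = am_limit_mb(CLUSTER_MB, QUEUE_CAPACITY, percent)
    print(percent, limit, max_concurrent_ams(limit, AM_CONTAINER_MB))
```

Under these assumptions a larger percentage leaves room for more AMs to run at once, and a smaller one leaves room for fewer, so additional jobs submitted to a busy queue wait longer for their AM.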
Created 03-04-2018 03:29 PM
Created 03-04-2018 04:15 PM
Thanks for the reply. My table is huge; MSCK just hangs.
Also, I see that although the job started at 20:28, the container didn't launch until 20:55, and I don't see any logs.
What does EXPLAIN EXTENDED do? How can I use debug mode for a single query without actually changing the configuration?
| FinalStatus Reported by AM: | SUCCEEDED |
|---|---|
| Started: | Sat Mar 03 20:21:14 -0500 2018 |
| Elapsed: | 34mins, 44sec |

Log Type: launch_container.sh
Log Upload Time: Sat Mar 03 20:55:59 -0500 2018
Log Length: 9051
Showing 4096 bytes of 9051 total.
I don't see anything in the logs. This happened twice during the last 2 days.
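As a quick sanity check on the timestamps above, this small sketch adds the reported elapsed time to the start time and compares the result with the log upload time. If the upload time comes from log aggregation at application completion (an assumption, not something the output above confirms), then 20:55:59 marks when the application finished rather than when the container launched.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the application summary above.
FMT = "%a %b %d %H:%M:%S %z %Y"

started = datetime.strptime("Sat Mar 03 20:21:14 -0500 2018", FMT)
elapsed = timedelta(minutes=34, seconds=44)
log_uploaded = datetime.strptime("Sat Mar 03 20:55:59 -0500 2018", FMT)

# Start time plus elapsed time gives the approximate finish time.
finished = started + elapsed
gap = log_uploaded - finished

print("finished:", finished.strftime("%H:%M:%S"))
print("upload lag in seconds:", gap.total_seconds())
```

The two times differ by about one second, so the upload timestamp alone does not show how long the application waited before its container started.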
Created 03-05-2018 02:41 AM
Hi @PJ
As you said, the containers did not launch until 8:55, which means the job was not getting the resources it needed to start. The fact that other jobs were already running in the same queue supports this as well. Try decreasing the value to 0.1.
Created 03-05-2018 02:33 PM
Yeah, but I thought I had to increase the value so that more resources would be available to launch more AMs.
Created 03-05-2018 10:16 AM
This could be related to HIVE-13901. Depending on the file system, MSCK and ADD PARTITION can be slow.
Can you try setting "hive.fetch.task.conversion=none"?
Created 03-05-2018 02:46 PM
"i put a csv file into hdfs location and do an alter table to add that new location to the partition". Can you please explain this operation?