Support Questions

pmj · ‎03-03-2018

I have a job that runs every hour: I put a CSV file into an HDFS location and do an ALTER TABLE to add that new location to the partition. Weirdly, it took more than 50 minutes when it normally takes just 5-10 seconds. I am not sure why. How do I start a root cause analysis on this?

edu_vikassri · ‎03-04-2018

@PJ

Did you try running EXPLAIN EXTENDED on the command?

If possible, repair the table once with MSCK REPAIR TABLE.

pmj · ‎03-04-2018

@Vikas Srivastava

Thanks for the reply. My table is huge, so MSCK just hangs.

Also, I see that although the job started at 20:28, the container didn't launch until 20:55, and I don't see any logs.

What does EXPLAIN EXTENDED do? And how can I use debug mode for a single query without actually changing the configuration?

FinalStatus Reported by AM:	SUCCEEDED
Started:	Sat Mar 03 20:21:14 -0500 2018
Elapsed:	34mins, 44sec

Log Type: launch_container.sh
Log Upload Time: Sat Mar 03 20:55:59 -0500 2018
Log Length: 9051
Showing 4096 bytes of 9051 total. Click here for the full log.

I don't see anything in the logs. This happened twice during the last 2 days.
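On the debug-mode question: one common option, assuming the classic Hive CLI (not Beeline) is being used, is to pass the logger level with `--hiveconf` for that single invocation, so nothing in the cluster configuration changes. The sketch below only builds and prints such a command line; the table name, partition value and path are hypothetical, and whether EXPLAIN accepts this particular DDL statement depends on the Hive version.

```python
import shlex


def hive_debug_command(query):
    """Build a Hive CLI invocation that raises logging to DEBUG for this one run only."""
    return [
        "hive",
        # Per-invocation override; the cluster-wide log4j settings stay untouched.
        "--hiveconf", "hive.root.logger=DEBUG,console",
        "-e", query,
    ]


# Hypothetical table, partition and location, for illustration only.
query = (
    "EXPLAIN EXTENDED "
    "ALTER TABLE my_table ADD PARTITION (dt='2018-03-03') "
    "LOCATION '/data/my_table/dt=2018-03-03'"
)

# Print a shell-safe command line that can be copied into a terminal.
print(shlex.join(hive_debug_command(query)))
```

EXPLAIN EXTENDED prints the plan for the statement instead of executing it, so dropping that prefix from the query runs the real ALTER TABLE with debug output on the console.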

pmj · ‎03-04-2018

Is this something related to yarn.scheduler.capacity.maximum-am-resource-percent?

That is the maximum percent of resources in the cluster which can be used to run application masters, i.e. it controls the number of concurrently running applications.

At the time this happened, I had about 5-6 jobs using the same queue... I guess this is the reason the AM for this job didn't get resources allocated until the rest of them finished.

So does 0.2 mean 20% per queue, or 20% across all queues together?
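As a rough way to reason about this setting: in the Capacity Scheduler the cluster-wide value acts as a default, and each queue's AM limit is proportional to that queue's capacity (a per-queue override also exists). The sketch below is a simplified approximation under that assumption, not the scheduler's exact formula, and every number in it is hypothetical.

```python
import math


def max_concurrent_ams(cluster_memory_mb, queue_capacity, am_resource_percent, am_container_mb):
    """Approximate upper bound on application masters running at once in one queue.

    Simplified model: the queue's share of cluster memory, times the AM percent,
    divided by the size of one AM container. The real scheduler also considers
    user limits and vcores, so treat the result as an estimate only.
    """
    am_budget_mb = cluster_memory_mb * queue_capacity * am_resource_percent
    return math.floor(am_budget_mb / am_container_mb)


# Hypothetical cluster: 100 GB of memory, a queue with 50% capacity, 2 GB AM containers.
for percent in (0.1, 0.2, 0.5):
    ams = max_concurrent_ams(102400, 0.5, percent, 2048)
    print(f"maximum-am-resource-percent={percent}: about {ams} concurrent AMs")
```

With these made-up numbers, 0.2 allows about 5 AMs in the queue and 0.1 allows about 2, so under this model a lower value means fewer applications can start at the same time, and a higher value means more.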

edu_vikassri · ‎03-05-2018

Hi @PJ

As you said, the containers didn't launch until 8:55, which means the job was not getting the resources it needed to start. The jobs already running in the same queue support that as well. Try decreasing the value to 0.1.

pmj · ‎03-05-2018

@Vikas Srivastava

Yeah, but I thought I had to increase the value so that more resources are available to launch more AMs...

nramanaiah · ‎03-05-2018

This can be related to HIVE-13901. Depending on the FS, MSCK & Add partition can be slow.

Can you try setting "hive.fetch.task.conversion=none"?

RahulSoni · ‎03-05-2018

"i put a csv file into hdfs location and do an alter table to add that new location to the partition". Can you please explain this operation?

alter table add partition took almost an hour
