How do I stop beeline holding onto YARN containers after a job (performing a hive insert into) has successfully run?

Expert Contributor

Hi,

When I run an insert into command through beeline, Hive/Tez requests 2 containers. Once beeline reports that the row was successfully inserted into the table, the job it created (as seen in the YARN Manager UI) is still running and holds on to one of the two containers. When I terminate beeline, the job is then listed as completed in the Manager UI.

Why is this happening and how can I change my hadoop configuration to stop this happening?

Thanks,

Mike

1 ACCEPTED SOLUTION

Master Guru

@mike harding Check whether you have Tez container reuse turned on:

tez.am.container.reuse.enabled=true
Configuration that specifies whether a container should be reused.

This allows Tez containers to be reused rather than released, which improves performance. Turn it off if you are not interested in that functionality.


5 REPLIES

Master Guru

What is tez.am.session.min.held-containers set to?

Expert Contributor

I found that tez.am.session.min.held-containers was not specified in my Ambari settings. Setting it to 0 and setting tez.session.am.dag.submit.timeout.secs to a smaller value gave me the behaviour I was looking for.
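
For anyone wanting to try the same change, here is a rough sketch of the two properties in the same key=value form used earlier in this thread. The value 0 comes from the reply above. The timeout of 60 seconds is only an illustrative assumption, since the reply says "a smaller amount" without giving a number, so pick whatever suits your cluster. Where exactly these go (for example a custom tez-site section in Ambari) is also an assumption and may differ between versions.

```
# Do not hold any containers while the Tez session is idle (value from the reply above)
tez.am.session.min.held-containers=0

# How long the session AM waits for another DAG before shutting down.
# 60 is a hypothetical example value, not the one used by the original poster.
tez.session.am.dag.submit.timeout.secs=60
```

A shorter timeout releases the AM container sooner after a query finishes, at the cost of a fresh AM start-up if the next query arrives after the timeout has expired.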

Expert Contributor

@Sunile Manjee I checked this and it is false. The remaining container seems to be the application master. When I run the Hive jobs via MapReduce2 they complete fine; it's only when they are run in Tez that I see this behaviour.

Expert Contributor

@mike harding To add to this: by default, Tez initializes an AM up front, whereas MapReduce does so only at submission. This is why you see the behaviour you describe. The Tez AM container has a timeout setting, as you stated, and that determines how long the initial AM lives.