Is there a way to prevent the start of the Tez AM from the Hive CLI?

Rising Star

I want no YARN containers to start automatically when I enter the Hive CLI and the default engine is Tez.

Setting hive.prewarm.enabled=false prevents the prewarmed worker containers, but I also need to stop the Tez AM from launching.

7 REPLIES

Rising Star

For anyone who is interested in why I want this:

People use Oozie Shell actions to orchestrate work because they are familiar/comfortable with bash.

When the Hive CLI is used from a YARN container context on a secure cluster, the engine settings must be specified at an early stage, before the session starts.

For the Hive CLI, this means users must change their existing HQL invocation from

hive -e 'set tez.credentials.path=${HADOOP_TOKEN_FILE_LOCATION}'

to

hive --hiveconf tez.credentials.path=${HADOOP_TOKEN_FILE_LOCATION}
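
As a rough sketch of that change, a wrapper in the Shell action could add the flag only when the token file variable is actually set. The helper name hive_token_args and the script name query.hql are made up for illustration; only the --hiveconf flag, the tez.credentials.path property and the HADOOP_TOKEN_FILE_LOCATION variable come from the description above.

```shell
# Hypothetical helper: prints the extra Hive CLI arguments, one per line.
# Prints nothing when HADOOP_TOKEN_FILE_LOCATION is unset or empty,
# for example when the script runs outside a YARN container.
hive_token_args() {
  if [ -n "${HADOOP_TOKEN_FILE_LOCATION:-}" ]; then
    printf '%s\n' "--hiveconf" "tez.credentials.path=${HADOOP_TOKEN_FILE_LOCATION}"
  fi
}

# Possible usage inside the Oozie Shell action (query.hql is a placeholder,
# and the unquoted expansion assumes the token path contains no spaces):
#   hive $(hive_token_args) -f query.hql
```

This keeps the same script usable both inside and outside the container, since the flag simply disappears when there is no token file.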

For Sqoop, there is no way to effect this change through configuration, so the user must ship a custom hive-site.xml in the distributed cache of the Oozie Shell action. This hive-site.xml must set the engine to MR.
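
To make the override concrete, here is a minimal sketch that writes such a file from a shell script. The function name write_mr_hive_site is hypothetical, and the generated file contains only the engine property; a real hive-site.xml would presumably also need the rest of the cluster's settings (metastore URIs and so on), which are left out here.

```shell
# Hypothetical helper: writes a hive-site.xml that sets the engine to MR.
# $1 is the target path (defaults to ./hive-site.xml).
write_mr_hive_site() {
  local target="${1:-hive-site.xml}"
  cat > "$target" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Assumption: all other cluster-specific properties are omitted. -->
  <property>
    <name>hive.execution.engine</name>
    <value>mr</value>
  </property>
</configuration>
EOF
}
```

The resulting file could then be attached to the Shell action (for example through a file element in the workflow) so that it lands in the container's working directory.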

If this is not done, Hive and Sqoop will launch a Tez AM that may never even be used but that fails because the delegation token is missing.

These changes look small in this explanation, but they are rather large for organizations that already have established HQL and whose cluster upgrade changes the default engine from MR to Tez.

avatar

A workaround for this could be launching the hive cli in the following manner:

hive -hiveconf hive.execution.engine=mr

However, if you then want to run any queries on Tez, you would need to run "set hive.execution.engine=tez;" before those queries.
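
For scripted use, one way to apply this without editing every HQL file by hand might be a small wrapper that prepends the set statement. This is only a sketch: with_tez_engine and the file names report.hql and report_tez.hql are invented for illustration.

```shell
# Hypothetical helper: prints an HQL script with the Tez engine switch
# prepended, so the engine only changes once the script actually runs.
# $1 is the path of the original HQL file.
with_tez_engine() {
  printf '%s\n' 'set hive.execution.engine=tez;'
  cat "$1"
}

# Possible usage (file names are placeholders):
#   with_tez_engine report.hql > report_tez.hql
#   hive -hiveconf hive.execution.engine=mr -f report_tez.hql
```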

avatar

Can you help me understand the scenario in which this is needed? The Hive shell would start but wait until a query is executed before creating the AM. Does this mean there are situations where the Hive shell is started and then exited without executing a query? Wouldn't this be an exceptional scenario, or is it so frequent in your case that a workaround is required? Sorry, I am just trying to understand when such a configuration would be needed.

Master Mentor

@kkane are you still having issues with this? Can you accept the best answer or provide your own solution?