tez.queue.name vs hive.server2.tez.default.queues in HiveServer configuration

In the documentation page for "Configure Hive and HiveServer2 for Tez" there are two properties that look similar to me:

  • tez.queue.name: property to specify which queue will be used for Hive-on-Tez jobs.
  • hive.server2.tez.default.queues: A list of comma separated values corresponding to YARN queues of the same name. When HiveServer2 is launched in Tez mode, this configuration needs to be set for multiple Tez sessions to run in parallel on the cluster.

The only difference I see is that "hive.server2.tez.default.queues" lets us specify several queues, so I guess jobs will be distributed over these queues. Hence, if we need all Hive jobs to run in one queue, we should use "tez.queue.name".

Am I missing something here?

1 REPLY

Master Guru

Essentially, hive.server2.tez.default.queues exists for pre-initialized Tez sessions. Starting an Application Master (AM) normally takes around 10 seconds, so the first query is noticeably slower. To avoid that, you can set hive.server2.tez.initialize.default.sessions=true.

This will initialize, in each of the listed queues, as many AMs as hive.server2.tez.sessions.per.default.queue specifies; those AMs are then used for query execution.
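
As a rough illustration of how these three properties fit together, here is a minimal hive-site.xml sketch. The property names are the ones discussed above; the queue names "interactive" and "batch" and the session count are hypothetical values chosen only for the example, so substitute whatever YARN queues exist on your cluster.

```xml
<!-- Sketch only: queue names and counts below are assumed example values -->
<property>
  <name>hive.server2.tez.initialize.default.sessions</name>
  <!-- ask HiveServer2 to pre-initialize Tez sessions instead of waiting for the first query -->
  <value>true</value>
</property>
<property>
  <name>hive.server2.tez.default.queues</name>
  <!-- comma-separated YARN queue names (hypothetical) -->
  <value>interactive,batch</value>
</property>
<property>
  <name>hive.server2.tez.sessions.per.default.queue</name>
  <!-- number of pre-initialized AMs per listed queue -->
  <value>1</value>
</property>
```

With these example values, two queues at one session each would give two pre-initialized AMs in total.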

In most situations I would not bother with it too much, since subsequent queries will reuse existing AMs (which have an idle wait time). However, if you have strict SLAs, you may want to use it.

tez.queue.name is then the actual queue you want to execute in. If you hit one of the default queues, the AM is already there and everything is faster. You might have distinct queues for big, heavy queries and small, interactive ones; however, you still need to set the queue yourself.