In most cases, a new Tez session has to be opened when the ODBC
client accesses a Hive table via the API. Session initialization takes about
10-15 seconds.
I changed the following setting from false to true, and the response time
from the web improved:
hive.server2.tez.initialize.default.sessions = true;
However, after the change, query execution lost parallelism: I observed
that one running Tez session blocks the others (sessions per queue is
set to 1).
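The blocking described above is what a pool with a single session would produce. A minimal sketch of that effect, assuming a toy model in which all queries arrive at once and are served first-come, first-served by a fixed number of sessions (the function name and the durations are hypothetical, and this is not how HiveServer2 is implemented internally):

```python
import heapq


def finish_times(durations, pool_size):
    """Return the finish time of each query when queries are served FIFO
    by a fixed pool of sessions (toy model, not HiveServer2 itself)."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # Each heap entry is the time at which one session becomes free.
    free_at = [0.0] * pool_size
    heapq.heapify(free_at)
    result = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available session
        end = start + duration
        result.append(end)
        heapq.heappush(free_at, end)    # session is busy until `end`
    return result


queries = [5.0, 5.0, 5.0]  # hypothetical query durations in seconds
print(finish_times(queries, 1))  # [5.0, 10.0, 15.0] -> queries run one after another
print(finish_times(queries, 3))  # [5.0, 5.0, 5.0]   -> queries run side by side
```

With one session the third query finishes only after the first two, which matches the observation that one running session blocks the others.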
My recommendations:
- set hive.server2.tez.default.queues=default
- set hive.server2.tez.sessions.per.default.queue=3
- set hive.server2.tez.initialize.default.sessions=true
- set hive.prewarm.enabled=true
- set hive.prewarm.numcontainers=2
- set tez.am.container.reuse.enabled=true
- set hive.server2.enable.doAs=false
- set tez.am.container.idle.release-timeout-min.millis=30000
- set tez.am.container.idle.release-timeout-max.millis=90000
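To make the effect of the first two settings concrete, here is a small sketch that computes how many default sessions they imply. It assumes that hive.server2.tez.default.queues is a comma-separated list and that the per-queue count is launched on every listed queue; the helper name and the queue names q1, q2, q3 are hypothetical.

```python
def default_session_pool_size(settings):
    """Number of default Tez sessions implied by the two pool settings,
    assuming the per-queue count applies to every listed queue."""
    raw_queues = settings["hive.server2.tez.default.queues"]
    queues = [q.strip() for q in raw_queues.split(",") if q.strip()]
    per_queue = int(settings["hive.server2.tez.sessions.per.default.queue"])
    return len(queues) * per_queue


recommended = {
    "hive.server2.tez.default.queues": "default",
    "hive.server2.tez.sessions.per.default.queue": "3",
}
print(default_session_pool_size(recommended))  # 3

# Hypothetical alternative: three queues with one session each.
three_queues = {
    "hive.server2.tez.default.queues": "q1,q2,q3",
    "hive.server2.tez.sessions.per.default.queue": "1",
}
print(default_session_pool_size(three_queues))  # 3
```

Both layouts give the same number of sessions under this assumption; they differ in how those sessions are spread across YARN queues.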
My question is: can we
(a) create more queues with 1 default session per queue, or
(b) leave the default queue as is and create 3 sessions on that queue?
Which is the better approach?
Thanks,
Kiran