Created 06-06-2016 11:19 PM
If the two are not the same, how are they different? Can you please let me know?
Created 06-07-2016 06:24 AM
Start Tez session at Initialization - Enables a user to use HiveServer2 without enabling Tez for HiveServer2. Users might potentially want to run queries with Tez without a pool of sessions.
Default value is False
hive.execution.engine=tez - This setting determines whether Hive queries will be executed using Tez or MapReduce.
Default value is - If this value is set to "mr," Hive queries will be executed using MapReduce. If this value is set to "tez," Hive queries will be executed using Tez. All queries executed through HiveServer2 will use the specified hive.execution.engine setting.
Created 06-07-2016 02:02 PM
Let me put the question this way. If I have hive.execution.engine=tez, why do I need to set the property hive.server2.tez.initialize.default.sessions to "True"? What's the use case for this property? I ran multiple tests, but my hive.execution.engine property drives how the query works, not this default sessions property.
Created 06-07-2016 03:58 PM
"Let me put the question this way. If I have hive.execution.engine=tez, why do I need to set the property hive.server2.tez.initialize.default.sessions to "True"? What's the use case for this property? I ran multiple tests, but my hive.execution.engine property drives how the query works, not this default sessions property."
The default session parameter has nothing to do with the way the query is executed. It is for pre-creating Tez sessions. If it is false, the first query on an empty system will take at least 20 seconds to create a session.
Time for a Tez query:
Hiveserver prepare, compilation, ...: ~1sec
Not much you can do here, although it is continuously getting faster.
Initialize Tez Application Master (Session): ~10 seconds
To reduce that, Hive can reuse idle sessions; AMs are normally kept for 120s after a query is run. Or you can instantiate default sessions if you cannot live with that delay.
Initialize Containers: 3-10s
The next step is to allocate the work containers to the session. Again, Tez can reuse containers, or you can preheat (pre-allocate) the containers.
The actual query
That depends on your data.
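To make the breakdown above concrete, here is a rough sketch that just adds up the approximate figures quoted in this post. The function name and structure are made up for illustration, and the numbers are ballpark values from the post, not measurements.

```python
from typing import Tuple

# Approximate figures from the breakdown above (seconds).
COMPILE_S = 1.0                  # HiveServer2 prepare/compile, ~1 s
AM_INIT_S = 10.0                 # Tez Application Master (session) start, ~10 s
CONTAINER_INIT_S = (3.0, 10.0)   # container allocation, 3-10 s


def startup_overhead(session_ready: bool, containers_ready: bool) -> Tuple[float, float]:
    """Return (low, high) seconds of overhead before the actual query runs.

    Hypothetical helper: it only sums the stages that still have to happen.
    """
    low = high = COMPILE_S
    if not session_ready:
        # No idle or pre-created session: a new AM has to start.
        low += AM_INIT_S
        high += AM_INIT_S
    if not containers_ready:
        # No reused or pre-warmed containers: they have to be allocated.
        low += CONTAINER_INIT_S[0]
        high += CONTAINER_INIT_S[1]
    return low, high


cold = startup_overhead(session_ready=False, containers_ready=False)
warm_session = startup_overhead(session_ready=True, containers_ready=False)
fully_warm = startup_overhead(session_ready=True, containers_ready=True)
print(cold, warm_session, fully_warm)
```

With these figures a cold start costs roughly 14-21 s before the query does any work, while a pre-created or reused session removes the ~10 s AM start from that.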
Created 06-07-2016 04:34 PM
Thanks, this makes sense. So it's always better to set the value to "True", right?
Created 06-07-2016 09:08 PM
Personally I like it off. It binds extra resources in the cluster, and the second query will be fast anyway. You also need to know in advance how many sessions you want, since it will redistribute queries to the pre-created sessions. If you don't care about the first query on a cold system being slow, keeping it off is the safer choice IMO.
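As a rough way to see what "binds extra resources" means, the sketch below estimates how many Tez AMs stay allocated when default sessions are pre-created. It assumes the session count is the number of queues in hive.server2.tez.default.queues times hive.server2.tez.sessions.per.default.queue, and that each session holds one AM of a fixed size. The queue names and the AM size in the example are hypothetical.

```python
def precreated_sessions(default_queues: str, sessions_per_queue: int) -> int:
    """Count pre-created sessions from a comma-separated queue list.

    Assumption: one pool of `sessions_per_queue` sessions per listed queue.
    """
    queues = [q.strip() for q in default_queues.split(",") if q.strip()]
    return len(queues) * sessions_per_queue


def idle_am_memory_mb(default_queues: str, sessions_per_queue: int, am_memory_mb: int) -> int:
    """Memory held by idle AMs, assuming one AM of `am_memory_mb` per session."""
    return precreated_sessions(default_queues, sessions_per_queue) * am_memory_mb


# Hypothetical example: two queues, two sessions each, 4096 MB per AM.
sessions = precreated_sessions("etl,adhoc", 2)
held_mb = idle_am_memory_mb("etl,adhoc", 2, 4096)
print(sessions, held_mb)
```

In this made-up setup, four AMs (16384 MB) stay reserved even when nobody is running queries, which is the cost to weigh against a slow first query.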