I ran a script with Tez and it worked, but when I then ran it without Tez, this line appeared in the logs:
tez session was closed reopenning
The settings I had with Tez:
set hive.execution.engine=tez;
set tez.queue.name=adhoc;
set hive.tez.container.size=4096;
set hive.auto.convert.join=true;
set hive.exec.parallel=true;
set hive.tez.auto.reducer.parallelism=true;

SET mapreduce.job.queuename = adhoc;
SET mapreduce.job.reduces = 100;
SET hive.exec.parallel.thread.number = 8;
SET hive.cli.print.header=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.dynamic.partition=true;
SET hive.exec.parallel=true;
SET hive.cli.print.current.db=true;
set hive.auto.convert.join=false;
set hive.resultset.use.unique.column.names=false;
The code was identical in both cases.
I run scripts from HUE.
Why did it happen?
Hive can use Tez based on the following settings. If you want to disable it, you will need to use
The following link shows how to enable or disable it.
In hive-site.xml we can see:
<property> <name>hive.execution.engine</name> <value>tez</value> </property>
By default, a Hive query is executed in mr (MapReduce) mode. If the configuration file sets hive.execution.engine to tez, queries are executed in Tez mode by default. If you want to run in MapReduce mode then, as Jay suggested, set the execution engine to mr with set hive.execution.engine=mr; You can set this in the properties file or run the command during your session. Changing the properties makes the specified engine the default every time you log in, whereas a value set in the session is reset once the session ends.
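As a minimal sketch of the session-level override described above, the statements below could be placed at the top of the script. The queue name adhoc is taken from the question; the comment lines are only illustrative.

```sql
-- Print the execution engine currently in effect for this session
set hive.execution.engine;

-- Override the hive-site.xml default for this session only
set hive.execution.engine=mr;
set mapreduce.job.queuename=adhoc;
```

Without the explicit set hive.execution.engine=mr; line, the session keeps whatever value hive-site.xml provides, so a script that only sets mapreduce.* properties can still run on Tez.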
As @Jay SenSharma pointed out, the default execution engine is picked up from the hive.execution.engine property in HIVE_CONF_DIR/hive-site.xml. In your case it's likely tez, which is why you see the same behavior whether or not you set it explicitly in your CLI.
As for the message "tez session was closed reopenning": it appears when your session has been idle for a while. In that case the AM (the Tez ApplicationMaster) is reclaimed, and the next query triggers a request for a fresh AM. This is governed by the property "tez.session.am.dag.submit.timeout.secs", whose default value is 5 minutes (300 seconds).
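If the idle timeout is too short for your workflow, one possible sketch is to raise it before the Tez session starts. The value 600 is an arbitrary example, and this assumes your setup passes session-level Tez properties through to the AM. The AM reads the property at launch, so the change would only affect a Tez session opened afterwards. The cluster-wide value normally lives in tez-site.xml.

```sql
-- Assumption: applies only to a Tez session started after this statement
-- Keep an idle Tez AM alive for 10 minutes instead of the default 5
set tez.session.am.dag.submit.timeout.secs=600;
```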