Created 03-15-2017 04:55 AM
Hi All,
I ran the query below from the Hive CLI.
It runs for a long time and then fails.
SET hive.tez.container.size=10240;
SET hive.tez.java.opts=-Xmx8192m;
set tez.runtime.io.sort.mb=4096;
set tez.runtime.unordered.output.buffer.size-mb=1024;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.vectorized.execution.reduce.enabled;
set hive.execution.engine=tez;
set hive.vectorized.execution.enabled = true;

SELECT cust_his.cname AS cname
      ,cust_his.creg AS creg
      ,Upper(Trim(cust_his.ccountry)) AS ccountry
      ,Upper(Trim(cust_his.cloc)) AS cloc
FROM customer_history cust_his
WHERE cust_his.cust_d BETWEEN 20160501 AND 20160531
  AND Substr(Trim(cust_his.cloc), 1, Locate('|', cust_his.cloc, 1) - 1) <> ''
  AND Substr(Trim(cust_his.cloc), 1, Locate('|', cust_his.cloc, 1) - 1) IS NOT NULL
  AND cast(Trim(cust_his.cmfid) as int) NOT IN ( 1,2,3 )
  AND cust_his.cmat = '0';
Explain plan:
Plan not optimized by CBO.

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Map 1
      File Output Operator [FS_54479]
        compressed:false
        Statistics:Num rows: 54376020 Data size: 19466615160 Basic stats: COMPLETE Column stats: PARTIAL
        table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
        Select Operator [SEL_54478]
          outputColumnNames:["_col0","_col1","_col2","_col3"]
          Statistics:Num rows: 54376020 Data size: 19466615160 Basic stats: COMPLETE Column stats: PARTIAL
          Filter Operator [FIL_54480]
            predicate:((((substr(trim(cloc), 1, (locate('|', cloc, 1) - 1)) <> '') and substr(trim(cloc), 1, (locate('|', cloc, 1) - 1)) is not null) and (not (UDFToInteger(trim(cmfid))) IN (1,2,3))) and (cmat = '8')) (type: boolean)
            Statistics:Num rows: 54376020 Data size: 24523585020 Basic stats: COMPLETE Column stats: PARTIAL
            TableScan [TS_54476]
              alias:his
              Statistics:Num rows: 652512245 Data size: 38164072929328 Basic stats: COMPLETE Column stats: PARTIAL
The table is daily partitioned on cust_d column.
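Since the table is partitioned on cust_d, one way to check that the BETWEEN filter limits the scan to the May 2016 partitions is Hive's EXPLAIN DEPENDENCY, which lists the tables and partitions a query would read. This is only a minimal sketch: it reuses the table and column names from the query above, and the select list is trimmed to one column for brevity.

```sql
-- Lists input_tables and input_partitions for the query without running it.
-- Only the partition filter is kept here; the other predicates do not affect pruning.
EXPLAIN DEPENDENCY
SELECT cust_his.cname
FROM customer_history cust_his
WHERE cust_his.cust_d BETWEEN 20160501 AND 20160531;
```

If input_partitions in the output lists more than the daily partitions for that month, the query is reading more data than intended.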
Please help me resolve this.
Thanks in advance.
Created 03-15-2017 05:36 AM
Can you post the failure message?
Created 03-15-2017 05:40 AM
The error says it is halting due to an out of memory error. Sometimes it reports that a vertex failed instead.
Created 03-15-2017 06:07 AM
It will be helpful if you could share the complete stack trace of the error seen.
Created 03-27-2017 10:31 PM
If this is an OOM from the Tez tasks, you may want to adjust the YARN container memory settings through Ambari.
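Before changing anything, it may help to compare the memory the Hive session requests with the limits YARN allows. A rough sketch, assuming the standard YARN property names below are visible in the client configuration; the values printed come from the client side and may differ from what the ResourceManager actually enforces, so confirm them in Ambari as well.

```sql
-- A "set" with a property name and no value only prints the current setting.

-- YARN limits as seen by this session (client-side view, an assumption).
set yarn.scheduler.maximum-allocation-mb;
set yarn.nodemanager.resource.memory-mb;

-- Tez memory settings currently in effect for the session.
set hive.tez.container.size;
set hive.tez.java.opts;
set tez.runtime.io.sort.mb;
```

The requested hive.tez.container.size has to fit within yarn.scheduler.maximum-allocation-mb, and the -Xmx heap plus the sort and output buffers have to fit inside that container.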