Hive Query is not executing

New Contributor

Hi All,

I ran the query below from the Hive CLI.

It runs for a long time and then fails.

SET hive.tez.container.size=10240; 
SET hive.tez.java.opts=-Xmx8192m; 
set tez.runtime.io.sort.mb=4096; 
set tez.runtime.unordered.output.buffer.size-mb=1024; 
set hive.exec.dynamic.partition=true; 
set hive.exec.dynamic.partition.mode=nonstrict; 
set hive.vectorized.execution.reduce.enabled; 
set hive.execution.engine=tez;
set hive.vectorized.execution.enabled = true;
SELECT 
cust_his.cname AS cname  
,cust_his.creg AS creg 
,Upper(Trim(cust_his.ccountry)) AS ccountry 
,Upper(Trim(cust_his.cloc)) AS cloc
FROM  
customer_history cust_his
WHERE  
cust_his.cust_d BETWEEN 20160501  AND 20160531
AND Substr(Trim(cust_his.cloc), 1, Locate('|',
cust_his.cloc, 1) - 1) <> ''
AND Substr(Trim(cust_his.cloc), 1, Locate('|',
cust_his.cloc, 1) - 1) IS NOT NULL
AND cast(Trim(cust_his.cmfid) as int) NOT IN ( 1,2,3 )
AND cust_his.cmat = '0';

Explain plan:

Plan not optimized by CBO.

Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Map 1
        File Output Operator [FS_54479]
          compressed:false
          Statistics:Num rows: 54376020 Data size: 19466615160 Basic stats: COMPLETE Column stats: PARTIAL
          table:{"input format:":"org.apache.hadoop.mapred.TextInputFormat","output format:":"org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat","serde:":"org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"}
          Select Operator [SEL_54478]
            outputColumnNames:["_col0","_col1","_col2","_col3"]
            Statistics:Num rows: 54376020 Data size: 19466615160 Basic stats: COMPLETE Column stats: PARTIAL
            Filter Operator [FIL_54480]
              predicate:((((substr(trim(cloc), 1, (locate('|', cloc, 1) - 1)) <> '') and substr(trim(cloc), 1, (locate('|', cloc, 1) - 1)) is not null) and (not (UDFToInteger(trim(cmfid))) IN (1,2,3))) and (cmat = '8')) (type: boolean)
              Statistics:Num rows: 54376020 Data size: 24523585020 Basic stats: COMPLETE Column stats: PARTIAL
              TableScan [TS_54476]
                alias:his
                Statistics:Num rows: 652512245 Data size: 38164072929328 Basic stats: COMPLETE Column stats: PARTIAL

The table is partitioned daily on the cust_d column.

Please help me resolve this.

Thanks in advance.
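
Note: for a sense of scale, the statistics printed in the explain plan can be converted into more readable figures. The following is a minimal Python sketch and not part of the original post. It uses only the row and byte counts shown in the plan; the function and variable names are illustrative. The inputs are optimizer estimates (column stats are PARTIAL), so the results are rough at best.

```python
# Figures copied from the explain plan (optimizer estimates, not measurements).
SCAN_ROWS = 652_512_245            # TableScan [TS_54476] Num rows
SCAN_BYTES = 38_164_072_929_328    # TableScan [TS_54476] Data size
FILTER_ROWS = 54_376_020           # Filter Operator [FIL_54480] Num rows

TIB = 1024 ** 4  # bytes per tebibyte


def summarize(scan_rows: int, scan_bytes: int, filter_rows: int) -> dict:
    """Derive rough scale figures from the plan statistics."""
    return {
        "scan_tib": scan_bytes / TIB,                # estimated scan size in TiB
        "avg_row_bytes": scan_bytes / scan_rows,     # estimated bytes per row
        "filter_fraction": filter_rows / scan_rows,  # share of rows kept by the filter
    }


stats = summarize(SCAN_ROWS, SCAN_BYTES, FILTER_ROWS)
print(f"estimated scan size : {stats['scan_tib']:.1f} TiB")
print(f"estimated row size  : {stats['avg_row_bytes']:.0f} bytes")
print(f"filter keeps        : {stats['filter_fraction']:.4f} of scanned rows")
```

On these estimates the TableScan is in the tens of TiB, and the Filter Operator is expected to keep about one twelfth of the scanned rows. Whether the estimates reflect the actual data depends on how current the table statistics are.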

4 REPLIES

Master Collaborator

Can you post the failure message?

New Contributor

The error is "Halting due to out of memory error". Sometimes the query fails with "vertex failed" instead.

@Ram Adireddy

It would be helpful if you could share the complete stack trace of the error.

Explorer

@Ram Adireddy

If this is an OOM from Tez tasks, you may want to tune the YARN container memory settings in Ambari.
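
Note: before changing any memory settings, it can help to see how the values in the question relate to each other. The following is a minimal Python sketch, not part of the original thread. It only does arithmetic on the four SET values posted above; the function name is illustrative, and it assumes the Tez sort and unordered-output buffers are allocated from the task JVM heap.

```python
# Values copied from the SET statements in the question (all in MB).
CONTAINER_MB = 10240   # hive.tez.container.size
HEAP_MB = 8192         # -Xmx value in hive.tez.java.opts
SORT_MB = 4096         # tez.runtime.io.sort.mb
UNORDERED_MB = 1024    # tez.runtime.unordered.output.buffer.size-mb


def memory_breakdown(container_mb: int, heap_mb: int,
                     sort_mb: int, unordered_mb: int) -> dict:
    """Express each setting relative to the container and the heap."""
    return {
        "heap_share_of_container": heap_mb / container_mb,
        "non_heap_headroom_mb": container_mb - heap_mb,
        # Assumption: both buffers come out of the JVM heap.
        "sort_share_of_heap": sort_mb / heap_mb,
        "unordered_share_of_heap": unordered_mb / heap_mb,
    }


breakdown = memory_breakdown(CONTAINER_MB, HEAP_MB, SORT_MB, UNORDERED_MB)
for name, value in breakdown.items():
    print(f"{name}: {value}")
```

With the posted values, the heap takes 80% of the container and leaves 2048 MB outside the heap, while the sort buffer alone accounts for half of the heap. Whether those proportions suit this workload depends on the cluster's YARN limits and on the stack trace requested above.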