Support Questions

Find answers, ask questions, and share your expertise

Java heap space: Hive

Contributor

I have a query over a set of 20 tables that fetches each table's last loaded date and row count:

 

INSERT OVERWRITE TABLE dbo.table PARTITION (last_load_date)
select 'table1' as tablename, current_date, count(*) as count, last_load_date from table1 where last_load_date in (select max(last_load_date) from table1)
group by last_load_date

union

select 'table2' as tablename, current_date, count(*) as count, last_load_date from table2 where last_load_date in (select max(last_load_date) from table2)
group by last_load_date
.
.
.
(and so on, a union across all 20 tables)

 

I configured the script as below and ran it with the command sh table.sh:

 

table.sh

hive --hiveconf tez.queue.name=Last_date --hiveconf  hive.session.id=data_xxx -f /mypath/union_query.sql
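(Note: the OutOfMemoryError below is thrown while the Hive CLI process launched by this command compiles the query, so its client-side JVM heap matters. A sketch of raising it via HADOOP_CLIENT_OPTS before invoking hive; the -Xmx value is an assumption to tune to the edge node's free memory:)

```shell
# Sketch: give the Hive CLI JVM a larger heap for query compilation.
# The -Xmx value is an assumption -- tune it to the machine's free memory.
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
hive --hiveconf tez.queue.name=Last_date --hiveconf hive.session.id=data_xxx -f /mypath/union_query.sql
```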

 

I see the following error:

 

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOfRange(Arrays.java:3664)
        at java.lang.String.<init>(String.java:207)
        at java.lang.StringBuilder.toString(StringBuilder.java:407)
        at java.lang.String.valueOf(String.java:2994)
        at java.lang.StringBuilder.append(StringBuilder.java:131)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDDLFromFieldSchema(MetaStoreUtils.java:876)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema(MetaStoreUtils.java:1091)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getPartitionMetadata(MetaStoreUtils.java:890)
        at org.apache.hadoop.hive.ql.metadata.Partition.getMetadataFromPartitionSchema(Partition.java:263)
        at org.apache.hadoop.hive.ql.plan.PartitionDesc.<init>(PartitionDesc.java:87)
        at org.apache.hadoop.hive.ql.exec.Utilities.getPartitionDesc(Utilities.java:1373)
        at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setMapWork(GenMapRedUtils.java:684)
        at org.apache.hadoop.hive.ql.parse.GenTezUtils.setupMapWork(GenTezUtils.java:212)
        at org.apache.hadoop.hive.ql.parse.GenTezUtils.createMapWork(GenTezUtils.java:195)
        at org.apache.hadoop.hive.ql.parse.GenTezWork.process(GenTezWork.java:131)
        at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
        at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
        at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:90)
        at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
        at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
        at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.walk(GenTezWorkWalker.java:109)
        at org.apache.hadoop.hive.ql.parse.GenTezWorkWalker.startWalking(GenTezWorkWalker.java:72)
        at org.apache.hadoop.hive.ql.parse.TezCompiler.generateTaskTree(TezCompiler.java:368)
        at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:205)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10598)
        at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:219)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:474)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:330)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1233)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1274)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)

 

Could you please let me know how to optimize the query to avoid the above error? The query is only a union of tables. I searched for a solution but found nothing that applies to my case.

Thank you in advance for the reply.

  

1 ACCEPTED SOLUTION

Contributor

Why don't you write 20 separate INSERT INTO statements (append mode) in the same SQL file, separated by ';'? That way you don't have to use the union at all.

And you probably won't encounter the heap space issue.
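The approach above can be sketched as a small script that writes one ';'-terminated INSERT INTO statement per source table into a single SQL file. The table list here is a placeholder with three names; extend it to all 20 real tables:

```shell
# Sketch: emit one INSERT INTO per source table instead of one big UNION.
# Table names below are placeholders -- replace with the real 20 tables.
tables="table1 table2 table3"
out=union_query_split.sql
: > "$out"                       # truncate/create the output file
for t in $tables; do
  cat >> "$out" <<EOF
INSERT INTO TABLE dbo.table PARTITION (last_load_date)
select '$t' as tablename, current_date, count(*) as count, last_load_date
from $t
where last_load_date in (select max(last_load_date) from $t)
group by last_load_date;
EOF
done
```

Running the generated file (hive -f union_query_split.sql) executes each statement independently, so no single query plan has to cover all 20 tables.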

 

