11-10-2014 11:54 AM
Description: We are getting an error while executing a Hive query against a table that has about 2.68 billion records and 430 columns. The table is partitioned by the column 'RDATE'.

NOTE: This is a new table and a new query. This is the first time the query has been executed.

$ hive
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.10.0-cdh4.4.0.jar!/hive-log4j.properties
Hive history file=/tmp/rajendrap/hive_job_log_21d3abad-7cdd-4268-b961-b752efc827a4_845263541.txt
hive> set hive.mapred.mode=nonstrict;
hive> set mapred.child.java.opts=-Xmx8g;
hive> INSERT OVERWRITE LOCAL DIRECTORY '/namenode/home/rajendrap/temp456'
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ','
    > SELECT rdate,count(*) FROM PAX_MV_HIST_PREV_CURR group by rdate order by rdate;
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOf(Arrays.java:2882)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390)
    at java.lang.StringBuilder.append(StringBuilder.java:119)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDDLFromFieldSchema(MetaStoreUtils.java:498)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getSchema(MetaStoreUtils.java:711)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getPartitionMetadata(MetaStoreUtils.java:515)
    at org.apache.hadoop.hive.ql.metadata.Partition.getMetadataFromPartitionSchema(Partition.java:280)
    at org.apache.hadoop.hive.ql.plan.PartitionDesc.<init>(PartitionDesc.java:90)
    at org.apache.hadoop.hive.ql.exec.Utilities.getPartitionDesc(Utilities.java:683)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:826)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.setTaskPlan(GenMapRedUtils.java:597)
    at org.apache.hadoop.hive.ql.optimizer.GenMapRedUtils.initPlan(GenMapRedUtils.java:129)
    at org.apache.hadoop.hive.ql.optimizer.GenMRRedSink1.process(GenMRRedSink1.java:77)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:55)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
    at org.apache.hadoop.hive.ql.parse.GenMapRedWalker.walk(GenMapRedWalker.java:67)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genMapRedTasks(SemanticAnalyzer.java:7883)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8265)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:459)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:938)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)

We set the parameter mapred.child.java.opts to -Xmx8g (we also tried 2G, 4G, and 16G), but we got the same error every time. Please advise on any possible solutions, tips, or recommendations to get the above query to run successfully.
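
One observation from the trace itself: every frame is in thread "main" of the Hive CLI (CliDriver, Driver.compile, SemanticAnalyzer), so the error is raised in the client JVM while the query is being compiled, before any MapReduce task starts. mapred.child.java.opts only sizes the task JVMs, which may be why changing it made no difference. The following is a minimal sketch, assuming the client heap is the limiting factor, of raising it through the standard Hadoop environment variables before launching the CLI. The 4096 MB / 4g values are illustrative assumptions, not tested recommendations for this table.

```shell
# Sketch only: raise the heap of the Hive CLI (client) JVM, not the task JVMs.
# The sizes below are hypothetical; adjust to the memory available on the client host.

# HADOOP_HEAPSIZE is read by the Hadoop/Hive launcher scripts; the value is in MB.
export HADOOP_HEAPSIZE=4096

# HADOOP_CLIENT_OPTS passes extra JVM options to Hadoop client commands.
export HADOOP_CLIENT_OPTS="-Xmx4g ${HADOOP_CLIENT_OPTS:-}"

echo "client heap settings: HADOOP_HEAPSIZE=${HADOOP_HEAPSIZE} HADOOP_CLIENT_OPTS=${HADOOP_CLIENT_OPTS}"

# Then start the CLI from the same shell and rerun the query:
#   hive
```

If the error persists with a larger client heap, restricting the query to a range of rdate partitions would show whether the memory use grows with the number of partitions being planned.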
Labels:
- Apache Hadoop
- Apache Hive