Community Articles

Find and share helpful community-sourced technical articles.
Announcements
Celebrating as our community reaches 100,000 members! Thank you!
Labels (1)
avatar
Contributor

ISSUE : Simple select query against an ORC table without limit clause is failing with the below exception.

2017-06-26 16:00:35,228 ERROR [main]: CliDriver (SessionState.java:printError(993)) - Failed with exception java.io.IOException:java.lang.RuntimeException: serious problem java.io.IOException: java.lang.RuntimeException: serious problem at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:520) at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:427) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1765) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:237) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:169) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:380) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:740) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:685) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:233) at org.apache.hadoop.util.RunJar.main(RunJar.java:148) Caused by: java.lang.RuntimeException: serious problem at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1258) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1285) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:371) at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:303) at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:458) ... 15 more Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:192) at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1253) ... 19 more Caused by: java.lang.OutOfMemoryError: Java heap space 

We can see the query is failing when it is trying to generate ORC splits.

WORKAROUND:

hive.exec.orc.split.strategy=BI

What strategy ORC should use to create splits for execution. The available options are "BI", "ETL" and "HYBRID". Default setting is HYBRID

The HYBRID mode reads the footers for all files if there are fewer files than expected mapper count, switching over to

generating 1 split per file if the average file sizes are smaller than the default HDFS blocksize. ETL strategy always reads the

ORC footers before generating splits, while the BI strategy generates per-file splits fast without reading any data from HDFS.

8,988 Views