Reply
Explorer
Posts: 40
Registered: ‎01-13-2017

Hive Spark Query failing

I've a query failing with following errors...what does it exactly mean ?

 

2017-01-26 15:51:20,951 ERROR [main]: status.SparkJobMonitor (RemoteSparkJobMonitor.java:startMonitor(132)) - Failed to monitor Job[ 12] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:153)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getState(RemoteSparkJobStatus.java:82)
at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:80)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:107)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1115)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:775)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:150)
... 23 more
2017-01-26 15:51:20,951 ERROR [main]: status.SparkJobMonitor (SessionState.java:printError(937)) - Failed to monitor Job[ 12] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:153)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getState(RemoteSparkJobStatus.java:82)
at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:80)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60)
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:107)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1782)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1539)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1318)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1127)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1115)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:172)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:383)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:775)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:693)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:628)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:49)
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:150)
... 23 more

2017-01-26 15:51:20,955 ERROR [main]: ql.Driver (SessionState.java:printError(937)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

Posts: 642
Topics: 3
Kudos: 121
Solutions: 67
Registered: ‎08-16-2016

Re: Hive Spark Query failing

It is hitting a timeout while monitoring the status. This is probably similar to the 10 minute timeout for MR, so if a task doesn't provide a progress update for 10 minutes the tasks fails. I haven't confirmed this for Spark.

The bigger question is why is it not getting a status update. What does the RM UI show for where the AM is running compared to where the tasks are running?

I see that you are using the Hive CLI. Can you try from Beeline or another JDBC connection?