Explorer
Posts: 9
Registered: ‎09-06-2017

return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask


version: cloudera-manager-5.13.1

os: Ubuntu 14.04

Here are the steps I ran:

1.

create table new_tmp(action_type string,event_detail string, uuid string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';

 

2.

load data inpath '/tmp/new.txt' into table new_tmp;

 

3.

create table new_tmp_orc(action_type string,event_detail string, uuid string)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
STORED AS ORC;

 

4.

insert into table new_tmp_orc select * from new_tmp;

 

 

and it reports this error:

  • Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

In the log file /var/log/hive/hadoop-cmf-hive-HIVESERVER2-node1.log.out, it looks like this:

2018-01-30 17:15:46,331 ERROR org.apache.hadoop.hive.ql.exec.spark.status.SparkJobMonitor: [HiveServer2-Background-Pool: Thread-58]: Status: Failed
.........(some DEBUG and INFO level messages omitted)
2018-01-30 17:15:46,372 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-58]: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

 

.........(some DEBUG and INFO level messages omitted)....

2018-01-30 17:15:46,444 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-58]: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask
at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:238)
at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:89)
at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:301)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:314)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

 

 

I've searched on the internet for days and still don't have a clue.

I'd really appreciate it if you could help.

Cloudera Employee
Posts: 30
Registered: ‎08-16-2016

Re: return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

It's not quite clear what the issue is. We will probably need the HS2 and Spark logs to understand it.

 

However, I am curious if the second step succeeded.

"load data inpath '/tmp/new.txt' into table new_tmp;"

 

This appears to be a local path, but there is no "LOCAL" keyword in the command. Have you verified that the data was actually loaded into the new_tmp table after this step?
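
For reference, a minimal sketch of the two forms (using the same file and table names from your post):

-- reads the file from the local filesystem of the machine running the Hive client
load data local inpath '/tmp/new.txt' into table new_tmp;

-- moves a file that already lives on HDFS into the table's directory
load data inpath '/tmp/new.txt' into table new_tmp;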

 

Also, what version of CDH is this? Thanks.

Explorer
Posts: 9
Registered: ‎09-06-2017

Re: return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

@NaveenGangam, thanks for your reply.

1. The second step finished successfully; "select * from new_tmp" shows the contents correctly.

2. There is no 'LOCAL' keyword because '/tmp/new.txt' is on HDFS.

3. CDH version is 5.13.1

4. My question is: how can I make this step succeed?

 

insert into table new_tmp_orc select * from new_tmp;

 

Is there any documentation about loading JSON data into an ORC table? I couldn't find anything on the internet, and I feel so hopeless...

Cloudera Employee
Posts: 28
Registered: ‎11-20-2015

Re: return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

It could be a few things.  However, a detailed log message should be available in the Spark History Server / YARN ResourceManager UI when you click on the failed job.  The error will be in one of the executor logs.

 

1// Invalid JSON

 

You could have some invalid JSON that is failing to parse.  Hive will not skip erroneous records; it will simply fail the entire job.
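
One way to narrow this down (just a sketch, reusing the new_tmp staging table from the original post) is to force a full scan of the text table on its own; if this query also fails, the input JSON is the likely culprit rather than the ORC conversion:

-- forces each record through the JsonSerDe without touching the ORC table
select count(*) from new_tmp where uuid is not null;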

 

2// Not Installing the SerDe

 

This can be confusing for users, but have you installed the JSON Serde into the Hive auxiliary directory?

 

The file that contains this JSON Serde class is: hive-hcatalog-core.jar

 

It can be found in several places in the CDH distribution.  It needs to be installed into the Hive auxiliary directory, and the HiveServer2 instances subsequently need to be restarted.

 

https://www.cloudera.com/documentation/enterprise/5-13-x/topics/cm_mc_hive_udf.html
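
As a quick session-level check (not a replacement for the auxiliary-directory setup described above), you could add the jar in Beeline before re-running the insert. The path below is the usual CDH parcel location; please verify it on your cluster:

-- session-scoped only; the permanent fix is the aux directory plus an HS2 restart
ADD JAR /opt/cloudera/parcels/CDH/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;

insert into table new_tmp_orc select * from new_tmp;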

Cloudera Employee
Posts: 28
Registered: ‎11-20-2015

Re: return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

Also, we at Cloudera are partial to the Apache Parquet format:

 

https://www.cloudera.com/documentation/enterprise/5-13-x/topics/cdh_ig_parquet.html
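
If you want to try it, a rough sketch along the lines of your ORC step (the table name new_tmp_parquet is just an example) would be:

create table new_tmp_parquet(action_type string, event_detail string, uuid string)
STORED AS PARQUET;

insert into table new_tmp_parquet select * from new_tmp;

Note that the Parquet table itself does not need the JSON SerDe; the rows are already parsed when they are selected out of new_tmp.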
