New Contributor
Posts: 2
Registered: 07-23-2018

Error insert into Hive table (Execution Error, return code 2)


Hello!
Can anyone help me?


I upload the CSV files into HDFS and then expose them to Hive like this:

 

create temporary external table test
(
dt TIMESTAMP,
uid INT
)
row format delimited fields terminated by '|'
stored as textfile
location '/my_files/test';

 

insert into table my_table partition(name, md) select dt, uid, name, md from test;
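
For reference, dynamic partitioning is already enabled in my session before the insert (otherwise even the small files would fail, since both partition columns come from the select), roughly like this:

-- standard settings for an insert where all partition columns are dynamic
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;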


If the files are small, everything works fine, but when they are large the process fails on the insert with this error:

 

Error during job, getting debugging information ...
Examining task ID: task_1522750657070_0091_m_000001 (and more) from job job_1522750657070_0091
Examining task ID: task_1522750657070_0091_r_000001 (and more) from job job_1522750657070_0091

Task with the most failures (4):
-----
Task ID:
  task_1522750657070_0091_m_000002

URL:
  http://hadoop-namenode.mech.com:8088/taskdetails.jsp?jobid=job_1522750657070_0091&tipid=task_1522750...
-----
Diagnostic Messages for this Task:

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


In the YARN logs I see the following:

 

WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child: java.net.ConnectException: Call From hadoop-datanode02.prod.analytics.wz-ams.lo.mobbtech.com/127.0.1.1 to hadoop-datanode02.prod.analytics.wz-ams.lo.mobbtech.com:22232 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused

 

All the other messages are INFO only.

 

Is this the cause of the insert error?
What else can I look at to solve the problem?

 

I have come across mentions that return code 2 indicates a memory problem... Is that so?

Posts: 1,749
Kudos: 365
Solutions: 277
Registered: 07-31-2013

Re: Error insert into Hive table (Execution Error, return code 2)

From the information below you appear to have an address resolution issue:

> Call From host-datanode02.domain.com/127.0.1.1 to host-datanode02.domain.com:22232

The address of your datanode02 should resolve to its external network interface, not the loopback (127.0.x.x). If you run the cluster via Cloudera Manager, you can inspect all managed hosts for common DNS misconfigurations via the Hosts tab -> All Hosts page -> 'Inspect All Hosts' button.
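
On Debian/Ubuntu-based hosts a common culprit is an /etc/hosts entry that maps the machine's own name to 127.0.1.1. A sketch of what to check on datanode02 (the hostnames and the 10.0.0.12 address below are placeholders; substitute your actual values):

# /etc/hosts -- problematic entry: the host's own name resolves to loopback
127.0.1.1    host-datanode02.domain.com    host-datanode02

# corrected: the host's own name resolves to its external interface
127.0.0.1    localhost
10.0.0.12    host-datanode02.domain.com    host-datanode02

You can confirm how the name currently resolves on that node with "getent hosts host-datanode02.domain.com"; if it returns 127.0.1.1, the hosts file is the problem.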

The reason it may succeed for small writes is that those jobs are likely not executing any tasks on the host (or hosts) with the bad resolution. You can check where the map/reduce tasks ran for each passing and failing job to see whether the failures are tied to specific hosts, for example as sketched below.
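
One quick way to do that from the command line, assuming YARN log aggregation is enabled, is to pull the aggregated logs for a run and look at which hosts its containers were assigned to; for a MapReduce job the application ID is the job ID with the 'job_' prefix replaced by 'application_':

# list the containers (and the hosts they ran on) for the failing job
yarn logs -applicationId application_1522750657070_0091 | grep "Container: "

If the failing attempts all land on the same host (or hosts), that is where the name resolution needs fixing.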