Solution for "Hive Runtime Error while processing row" (only on MR)

Super Collaborator

We have several queries that fail on MR but succeed on Tez.

When they fail, the logs are full of errors like the ones below. The errors usually point to specific rows. However, if I narrow the scope of the query but still include the "bad" rows, the query usually succeeds without errors, so the problem clearly isn't specific to those rows.

I'm guessing there is some kind of overflow happening internally.

I have submitted several instances of this in support tickets, and the feedback is always "please upgrade or just use Tez", but that really isn't a solution, and we just upgraded recently.

I'm looking for guidance on ways that we might tune our Hive or MR settings to work around this.

Thanks.

2016-03-29 08:30:03,751 FATAL [main] org.apache.hadoop.hive.ql.exec.mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {<row data>}
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:534)
	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:397)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
	at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:120)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:159)
	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:524)
	... 9 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1450)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1346)
	at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
	at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:186)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:98)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:82)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1146)
	at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:607)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.collect(ReduceSinkOperator.java:531)
	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:380)
	... 15 more

Super Collaborator

Strange, I have usually seen the opposite pattern: queries failing with Tez but working with MR. And after upgrading to the latest version of HDP, the Tez error was fixed.

Could you tell us why you don't want to use Tez? Tez is usually much faster than MR.

Super Collaborator

Thanks @Sourygna Luangsay

We do use Tez for many things. I honestly haven't found it to be "much faster than MR", though it is usually a bit faster.

But I like MR because it integrates very well with the Application Manager GUI. I can find all my logs very easily through the GUI, and even share links with my team when there is a stack trace or something else in the logging that needs attention.

It also makes it very easy to diagnose when one node on our cluster is a bottleneck. When a query runs slowly, I can watch the mappers and reducers, and can easily see which servers are taking the longest.

I don't know of a good way to do any of those things with Tez. We use the Tez View, but it is buggy. And when it works, it takes many more clicks to find answers.

That's just my experience. Maybe there's a better way to leverage Tez...

Super Collaborator

I guess that whether Tez is faster than MR generally depends on the kind of queries you have. But that is what I have seen in different customers' projects.

Could you tell us which version of HDP you use? I acknowledge that the Hive views are not as intuitive as the MR web UI at first, but they do not seem that buggy to me. And you can still send the logs as a URL to people on your team.

As for diagnosing bottlenecks, I would recommend trying the swimlanes tool for Tez:

https://github.com/apache/tez/tree/master/tez-tools/swimlanes

This is a graphical tool that will help you to understand which container/vertex is the bottleneck in your query.

Super Collaborator

Cool! I'll check it out.

And to answer your question: we are on HDP 2.2.8.

Explorer

I was able to fix a similar issue with:

set hive.vectorized.execution.enabled=false;

set hive.vectorized.execution.reduce.enabled=false;

I'm not sure why we have to disable vectorized execution, but it fixed this for us. I hope this helps.
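
Before changing anything, it can help to record what the session currently uses. The following is a minimal sketch, assuming a Hive CLI or Beeline session: `set` followed by a property name and no value prints the current setting, and `set name=value;` applies to that session only. `hive.execution.engine` is not mentioned in the post above; it is included here on the assumption that the failure should be reproduced on MR.

```sql
-- Print the current values (no "=value" means "show the setting").
set hive.execution.engine;
set hive.vectorized.execution.enabled;
set hive.vectorized.execution.reduce.enabled;

-- Reproduce on MR with vectorization turned off, for this session only.
set hive.execution.engine=mr;
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
```

Noting the original values makes it easier to tell afterwards which change, if any, made the difference.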

avatar

Hi, I tried using the settings below before running the insert command, but I got the same error again.

set hive.vectorized.execution.enabled=false;

set hive.vectorized.execution.reduce.enabled=false;

New Contributor

I got it to work using:

set hive.auto.convert.join=false;

New Contributor

You can try the set parameters below:

set hive.vectorized.execution.reduce.enabled=false;

and

set hive.vectorized.execution.enabled=true;
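
The replies in this thread suggest different, partly conflicting settings, so it may be worth testing them one at a time rather than all together. The following is a sketch under that assumption. `reset;` clears the session-level overrides between attempts, which is why the engine is set again each time. The failing query is not shown in the thread, so it appears only as a placeholder comment.

```sql
-- Attempt 1: vectorization fully off
reset;
set hive.execution.engine=mr;
set hive.vectorized.execution.enabled=false;
set hive.vectorized.execution.reduce.enabled=false;
-- (run the failing query here)

-- Attempt 2: only reduce-side vectorization off
reset;
set hive.execution.engine=mr;
set hive.vectorized.execution.enabled=true;
set hive.vectorized.execution.reduce.enabled=false;
-- (run the failing query here)

-- Attempt 3: automatic map-join conversion off
reset;
set hive.execution.engine=mr;
set hive.auto.convert.join=false;
-- (run the failing query here)
```

Since at least one poster reported that disabling vectorization did not help, none of these should be read as a confirmed fix; the point is only to isolate which setting, if any, changes the outcome for a given query.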