
Hive job is failing after a long run with "Cannot add more than 2147483647 elements to a PTF Partition hive" error


Contributor
INSERT INTO TABLE datamart.fact
SELECT row_number() over(order by visitor_key) as clickstream_key,
visitor_key,
session_key,
page_key,
hit_time,  
visit_page_num,
campaign,
search_keyword,
post_product_list,
payment,
download,
emailing_id
from datamart.cf;

When I run this query, it launches around 169 map tasks and 1,009 reduce tasks. All of the map tasks and 1,008 of the reduce tasks complete in less than 30 minutes; however, the last reduce task runs for a very long time and keeps starting new task attempts rather than failing the job with an error.

I have gone through the logs; there weren't any other warnings or notable errors except for this:

[TezChild] |tez.TezProcessor|: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":0,"reducesinkkey1":1},"value":{"_col0":14858534,"_col1":96293756,"_col2":13528511,"_col3":"2016-03-06 12:51:14","_col4":17,"_col5":"","_col6":"","_col7":";;;;;103=::hash::0|104=::hash::0|111=::hash::0|133=::hash::0","_col8":"","_col9":"","_col10":null}}
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:352)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:237)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:252)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:150)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot add more than 2147483647 elements to a PTFPartition
    at org.apache.hadoop.hive.ql.exec.PTFPartition.append(PTFPartition.java:99)
    at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.processRow(PTFOperator.java:319)
    at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:130)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:838)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
    at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$GroupIterator.next(ReduceRecordSource.java:343)
    ... 17 more

Does this have anything to do with my existing configuration?

The mapper and reducer sizes are 10 GB and 20 GB, respectively.

Is there a workaround for this? When I searched using the contents of the logs, I found that there is an existing bug in Tez (https://issues.apache.org/jira/browse/TEZ-3103). Is this the same bug I am encountering here? I would really appreciate it if someone could help me with this.

Note :

Number of records in cf: 2,968,859,945

Size of cf: 26.5 GB

4 REPLIES

Re: Hive job is failing after a long run with "Cannot add more than 2147483647 elements to a PTF Partition hive" error

Super Guru
@vinay kumar

I am not a hundred percent sure, but I think you need to reduce your reducer size. You are getting this error in the following file, which expects the number of rows for a given reducer to be less than Integer.MAX_VALUE (see line 99). I think you have more than 2147483647 rows being processed by this one reducer. If you reduce the size of the reducers so that no single reducer processes more than 2147483647 records, you should not run into this issue.

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.j... (check line 99)
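
One untested sketch of that idea (the bucket count of 16, the 1000000000 offset, and the assumption that datamart.fact.clickstream_key is a bigint column are all placeholders, not from the original query): salt the rows into buckets with PARTITION BY so that no single PTF partition receives anywhere near 2147483647 rows, then offset the per-bucket row numbers so the keys stay globally unique.

-- Sketch only: split the single global window into 16 bucketed windows so no
-- PTF partition sees more than ~2.1 billion rows. With ~3 billion input rows
-- this averages ~186M rows per bucket; the offset must exceed the largest
-- bucket's row count. The keys come out unique but not consecutive.
INSERT INTO TABLE datamart.fact
SELECT cast(bucket_id as bigint) * 1000000000        -- per-bucket offset (assumed value)
       + row_number() over (partition by bucket_id
                            order by visitor_key) as clickstream_key,
       visitor_key,
       session_key,
       page_key,
       hit_time,
       visit_page_num,
       campaign,
       search_keyword,
       post_product_list,
       payment,
       download,
       emailing_id
FROM (
  SELECT pmod(visitor_key, 16) as bucket_id, c.*     -- salt rows into 16 buckets
  FROM datamart.cf c
) t;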

I hope this helps.

Re: Hive job is failing after a long run with "Cannot add more than 2147483647 elements to a PTF Partition hive" error

New Contributor

Should this be filed in the Hive JIRA?

Re: Hive job is failing after a long run with "Cannot add more than 2147483647 elements to a PTF Partition hive" error

Contributor

It seems the error is due to the return type of row_number(), which is int.

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/PTFPartition.j...

I am trying to generate a unique key for around 3 billion records, which is more than the maximum value of the int type (2147483647).
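
As a quick sanity check, something like this sketch (it just compares the table's row count against the 32-bit signed maximum) confirms the count exceeds the limit; count(*) returns a bigint in Hive, so the comparison itself is safe:

-- Sketch: confirm the row count of datamart.cf exceeds Integer.MAX_VALUE.
SELECT count(*)              as total_rows,       -- 2,968,859,945 per the note above
       2147483647            as int_max,
       count(*) > 2147483647 as exceeds_int_max   -- expected: true
FROM datamart.cf;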

Re: Hive job is failing after a long run with "Cannot add more than 2147483647 elements to a PTF Partition hive" error

@vinay kumar

I think the "order by visitor_key" is taking a huge amount of memory to process.

I would suggest you run it without the "order by".
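
For illustration, an untested sketch of that suggestion (it assumes the key only needs to be unique, not ordered by visitor_key, and that the Hive version in use accepts an empty OVER clause for row_number()). Note that without a PARTITION BY all rows still flow through a single PTF partition, so the 2147483647-element cap can still be hit; combining this with a PARTITION BY, as in the bucketed sketch in the earlier reply, avoids that.

-- Sketch: drop the global sort from the window. This removes the expensive
-- ORDER BY, but every row still lands in one PTF partition, so the
-- 2147483647-element limit still applies on its own.
INSERT INTO TABLE datamart.fact
SELECT row_number() over () as clickstream_key,
       visitor_key,
       session_key,
       page_key,
       hit_time,
       visit_page_num,
       campaign,
       search_keyword,
       post_product_list,
       payment,
       download,
       emailing_id
FROM datamart.cf;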