<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: bucketId out of range: 4147 in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304323#M221982</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38961"&gt;@balajip&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the reply. Excuse my ignorance, as I am still new to the Cloudera platform.&amp;nbsp;&lt;BR /&gt;Is there a config that I can set to override the bucket limit, or should I apply that patch?&lt;/P&gt;</description>
    <pubDate>Wed, 14 Oct 2020 07:23:11 GMT</pubDate>
    <dc:creator>dieden9</dc:creator>
    <dc:date>2020-10-14T07:23:11Z</dc:date>
    <item>
      <title>bucketId out of range: 4147</title>
      <link>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304261#M221956</link>
      <description>&lt;P&gt;Hi!&lt;BR /&gt;&lt;BR /&gt;I am running a scheduled job that consists of an insert-select query in Hive 3.0/HDP 3, as follows:&lt;/P&gt;&lt;P&gt;INSERT INTO TABLE t1 SELECT * FROM t2 WHERE timestamp &amp;gt; "predefined timestamp"&lt;/P&gt;&lt;P&gt;The job was running flawlessly until it suddenly started failing with the following error:&lt;/P&gt;&lt;PRE&gt;Caused by: java.lang.IllegalArgumentException: bucketId out of range: 4147
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.io.BucketCodec$2.encode(BucketCodec.java:94)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.&amp;lt;init&amp;gt;(OrcRecordUpdater.java:271)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:278)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordUpdater(HiveFileFormatUtils.java:350)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getAcidRecordUpdater(HiveFileFormatUtils.java:336)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
[2020-10-13 06:58:12,214] INFO - 	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:741)
[2020-10-13 06:58:12,214] INFO - 	... 45 more
[2020-10-13 06:58:12,214] INFO - ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:3493, Vertex vertex_1602198520469_101787_31_02 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1602198520469_101787_31_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1602198520469_101787_31_03 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 (state=08S01,code=2)

&lt;/PRE&gt;&lt;P&gt;I have no idea what is causing this, especially since the job hasn't changed.&lt;/P&gt;&lt;P&gt;Any idea how I can solve this issue? &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Oct 2020 07:13:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304261#M221956</guid>
      <dc:creator>dieden9</dc:creator>
      <dc:date>2020-10-13T07:13:45Z</dc:date>
    </item>
    <item>
      <title>Re: bucketId out of range: 4147</title>
      <link>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304265#M221957</link>
      <description>&lt;P&gt;It seems you are hitting Hive's limit on the maximum number of buckets.&lt;BR /&gt;For more information, please refer to the Apache JIRA below.&lt;BR /&gt;&lt;A href="https://issues.apache.org/jira/browse/TEZ-4130" target="_blank"&gt;https://issues.apache.org/jira/browse/TEZ-4130&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Oct 2020 07:54:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304265#M221957</guid>
      <dc:creator>balajip</dc:creator>
      <dc:date>2020-10-13T07:54:34Z</dc:date>
    </item>
    <item>
      <title>Re: bucketId out of range: 4147</title>
      <link>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304323#M221982</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/38961"&gt;@balajip&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the reply. Excuse my ignorance, as I am still new to the Cloudera platform.&amp;nbsp;&lt;BR /&gt;Is there a config that I can set to override the bucket limit, or should I apply that patch?&lt;/P&gt;</description>
      <pubDate>Wed, 14 Oct 2020 07:23:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/bucketId-out-of-range-4147/m-p/304323#M221982</guid>
      <dc:creator>dieden9</dc:creator>
      <dc:date>2020-10-14T07:23:11Z</dc:date>
    </item>
  </channel>
</rss>

