<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: select count(*) fails with tez over cassandra in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167437#M41644</link>
    <description>&lt;P&gt;Looks like a bug in either Hive or Tez. Would you mind filing a JIRA in Apache and uploading the logs there?&lt;/P&gt;</description>
    <pubDate>Tue, 27 Sep 2016 07:37:50 GMT</pubDate>
    <dc:creator>zyang</dc:creator>
    <dc:date>2016-09-27T07:37:50Z</dc:date>
    <item>
      <title>select count(*) fails with tez over cassandra</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167436#M41643</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have a table in Cassandra, and I use the hive-cassandra driver to run selects over it. This is the table:&lt;/P&gt;&lt;PRE&gt;CREATE TABLE table1 (
  campaign_id text,
  sid text,
  name text,
  ts timestamp,
  PRIMARY KEY (campaign_id, sid)
) WITH CLUSTERING ORDER BY (sid ASC)&lt;/PRE&gt;&lt;P&gt;And I have only 3 partitions:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="7934-qxqo5.png" style="width: 303px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/20297i716E44B0AE4E45D9/image-size/medium?v=v2&amp;amp;px=400" role="button" title="7934-qxqo5.png" alt="7934-qxqo5.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;When I query the table using Hive like this:&lt;/P&gt;&lt;P&gt;hive -e "select count(*) from table1;"&lt;/P&gt;&lt;P&gt;I get this error:&lt;/P&gt;&lt;PRE&gt;Status: Failed
Vertex failed, vertexName=Map 1, 
vertexId=vertex_1474275943985_0179_1_00, diagnostics=[Task failed, 
taskId=task_1474275943985_0179_1_00_000001, diagnostics=[TaskAttempt 0 
failed, info=[Error: Failure while running 
task:java.lang.RuntimeException: 
org.apache.tez.dag.api.TezUncheckedException: Expected length: 12416 
actual length: 9223372036854775711
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
   at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
   at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
   at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.tez.dag.api.TezUncheckedException: Expected length: 12416 actual length: 9223372036854775711
   at org.apache.hadoop.mapred.split.TezGroupedSplit.readFields(TezGroupedSplit.java:128)
   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
   at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
   at org.apache.tez.mapreduce.hadoop.MRInputHelpers.createOldFormatSplitFromUserPayload(MRInputHelpers.java:177)
   at org.apache.tez.mapreduce.lib.MRInputUtils.getOldSplitDetailsFromEvent(MRInputUtils.java:136)
   at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:643)
   at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:621)
   at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
   at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:390)
   at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:128)
   at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
   ... 14 more


&lt;/PRE&gt;&lt;P&gt;As far as I understand, in readFields we are getting more data than we are expecting. But considering the size of the table, I don't think the data itself is the problem.&lt;/P&gt;&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486" target="_blank"&gt;@Constantin Stanca&lt;/A&gt; has been helping me try to find the problem; I am relaunching the subject &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Another thing to add is that select * works perfectly fine with Tez &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;. With the mr engine, both select count(*) and select * also work fine.&lt;/P&gt;&lt;P&gt;We are using Hortonworks version 2.3.2.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 11:08:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167436#M41643</guid>
      <dc:creator>jean_jeancarl48</dc:creator>
      <dc:date>2019-08-18T11:08:39Z</dc:date>
    </item>
    <item>
      <title>Re: select count(*) fails with tez over cassandra</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167437#M41644</link>
      <description>&lt;P&gt;Looks like a bug in either Hive or Tez. Would you mind filing a JIRA in Apache and uploading the logs there?&lt;/P&gt;</description>
      <pubDate>Tue, 27 Sep 2016 07:37:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167437#M41644</guid>
      <dc:creator>zyang</dc:creator>
      <dc:date>2016-09-27T07:37:50Z</dc:date>
    </item>
    <item>
      <title>Re: select count(*) fails with tez over cassandra</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167438#M41645</link>
      <description>&lt;P style="margin-left: 20px;"&gt;&lt;A rel="user" href="https://community.cloudera.com/users/10597/zyang.html" nodeid="10597"&gt;@zyang&lt;/A&gt; &lt;A rel="user" href="https://community.cloudera.com/users/3486/cstanca.html" nodeid="3486"&gt;@Constantin Stanca&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I created the ticket &lt;A href="https://issues.apache.org/jira/browse/TEZ-3451" target="_blank"&gt;https://issues.apache.org/jira/browse/TEZ-3451&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2016 16:21:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167438#M41645</guid>
      <dc:creator>jean_jeancarl48</dc:creator>
      <dc:date>2016-09-29T16:21:36Z</dc:date>
    </item>
    <item>
      <title>Re: select count(*) fails with tez over cassandra</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167439#M41646</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/12333/jeanjeancarl481.html"&gt;jean rivera&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I think that I finally found the reason: &lt;A href="https://issues.apache.org/jira/browse/HIVE-14857?jql=text%20~%20%22select%20count%22" target="_blank"&gt;https://issues.apache.org/jira/browse/HIVE-14857?jql=text%20~%20%22select%20count%22&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The ticket you filed is probably a duplicate.&lt;/P&gt;&lt;P&gt;I know that this does not fix your issue right now, but if you find the response helpful, please vote/accept best answer.&lt;/P&gt;</description>
      <pubDate>Sat, 08 Oct 2016 04:14:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167439#M41646</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2016-10-08T04:14:08Z</dc:date>
    </item>
    <item>
      <title>Re: select count(*) fails with tez over cassandra</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167440#M41647</link>
      <description>&lt;P&gt;Yes, it was me who created the ticket.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 15:10:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/select-count-fails-with-tez-over-cassandra/m-p/167440#M41647</guid>
      <dc:creator>jean_jeancarl48</dc:creator>
      <dc:date>2016-10-11T15:10:18Z</dc:date>
    </item>
  </channel>
</rss>

