<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question What happens if one of the Spark tasks fails while inserting data into Hive in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136093#M47896</link>
    <description>&lt;P&gt;I ran into a situation while inserting data into a Hive table from another table. The query was processed as two MR jobs: one succeeded and the other failed. I could see that a few records had been inserted into the target table. That made sense to me, since the two MR jobs were processed independently and the insert is not transactional.&lt;/P&gt;&lt;P&gt;I am trying to understand what happens if the same thing occurs while inserting data into Hive using Spark. If one of the executors/tasks fails and reaches its retry limit, will the job terminate completely, or will partial data be inserted into the table?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
    <pubDate>Sun, 04 Dec 2016 01:29:36 GMT</pubDate>
    <dc:creator>antonyshajin</dc:creator>
    <dc:date>2016-12-04T01:29:36Z</dc:date>
    <item>
      <title>What happens if one of the Spark tasks fails while inserting data into Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136093#M47896</link>
      <description>&lt;P&gt;I ran into a situation while inserting data into a Hive table from another table. The query was processed as two MR jobs: one succeeded and the other failed. I could see that a few records had been inserted into the target table. That made sense to me, since the two MR jobs were processed independently and the insert is not transactional.&lt;/P&gt;&lt;P&gt;I am trying to understand what happens if the same thing occurs while inserting data into Hive using Spark. If one of the executors/tasks fails and reaches its retry limit, will the job terminate completely, or will partial data be inserted into the table?&lt;/P&gt;&lt;P&gt;Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Dec 2016 01:29:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136093#M47896</guid>
      <dc:creator>antonyshajin</dc:creator>
      <dc:date>2016-12-04T01:29:36Z</dc:date>
    </item>
    <item>
      <title>Re: What happens if one of the Spark tasks fails while inserting data into Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136094#M47897</link>
      <description>&lt;P&gt;As I understand it, your Hive insert query spawned two stages, processed as two MR jobs, and the last job failed, which left inconsistent data in the destination table. A Spark job also consists of stages, but the stages are linked by lineage, so if one of the stages still fails after the executor has exhausted its retry attempts, your complete job will fail.&lt;/P&gt;</description>
      <pubDate>Sun, 04 Dec 2016 01:35:59 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136094#M47897</guid>
      <dc:creator>rajkumar_singh</dc:creator>
      <dc:date>2016-12-04T01:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: What happens if one of the Spark tasks fails while inserting data into Hive</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136095#M47898</link>
      <description>&lt;P&gt;Thanks @&lt;A href="https://community.hortonworks.com/users/8919/rajkumarsingh.html"&gt;Rajkumar Singh&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 04 Dec 2016 01:40:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/What-happens-if-one-of-the-Spark-task-fails-while-inserting/m-p/136095#M47898</guid>
      <dc:creator>antonyshajin</dc:creator>
      <dc:date>2016-12-04T01:40:05Z</dc:date>
    </item>
  </channel>
</rss>

