<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: 5th attempt at getting an answer to this question in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202299#M83741</link>
    <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/10115/sahmad43.html"&gt;Sami Ahmad&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I doubt this is a conspiracy :). I found one of those 4 previous attempts: &lt;A href="https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html" target="_blank"&gt;https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It would have been good if you had referenced the URLs of the previous 4 attempts so we could get some historical information. It is not clear which version of Hive you use, 1.2.1 or 2.1.0, or whether you created the target Hive table as transactional. Anyway, long story short, the following is the practice that Hortonworks recommends on HDP 2.6.0, assuming that is your HDP version, as your question was tagged: &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 22 Sep 2018 04:09:36 GMT</pubDate>
    <dc:creator>cstanca</dc:creator>
    <dc:date>2018-09-22T04:09:36Z</dc:date>
    <item>
      <title>5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202298#M83740</link>
      <description>&lt;P&gt;I have a strong feeling Hortonworks is purposely not answering it, as I got all my other questions answered, so I'm hoping one brave person will step forward and tell me the truth. After all, that's the purpose of the forum.&lt;/P&gt;&lt;P&gt;My question that is still not answered is: is Sqoop incremental load into a Hive ORC table supported, and has anyone done it?&lt;/P&gt;&lt;P&gt;I am sure many people will benefit from this answer.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 03:17:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202298#M83740</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-22T03:17:46Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202299#M83741</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/10115/sahmad43.html"&gt;Sami Ahmad&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I doubt this is a conspiracy :). I found one of those 4 previous attempts: &lt;A href="https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html" target="_blank"&gt;https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;It would have been good if you had referenced the URLs of the previous 4 attempts so we could get some historical information. It is not clear which version of Hive you use, 1.2.1 or 2.1.0, or whether you created the target Hive table as transactional. Anyway, long story short, the following is the practice that Hortonworks recommends on HDP 2.6.0, assuming that is your HDP version, as your question was tagged: &lt;A href="https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html" target="_blank"&gt;https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_data-access/content/incrementally-updating-hive-table-with-sqoop-and-ext-table.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 04:09:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202299#M83741</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2018-09-22T04:09:36Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202300#M83742</link>
      <description>&lt;P&gt;It must be my lucky day &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Stanca, but this will introduce a lot of delay in the data. We wanted near-real-time data; is that not possible using Sqoop?&lt;/P&gt;&lt;P&gt;What about NiFi?&lt;/P&gt;&lt;P&gt;Also, I use "--hcatalog-storage-stanza 'stored as orc tblproperties ("orc.compress"="SNAPPY")'" for non-incremental loads, and I was told that this would be working soon. Is it still not?&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 04:25:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202300#M83742</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-22T04:25:05Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202301#M83743</link>
      <description>&lt;P&gt;@&lt;A href="https://community.hortonworks.com/users/10115/sahmad43.html"&gt;Sami Ahmad&lt;/A&gt;&lt;/P&gt;&lt;P&gt;1) NiFi is definitely an option.&lt;/P&gt;&lt;P&gt;If CDC is important for you, be aware that the MySQL CDC processor is supported. Unfortunately, CDC processors for other databases are not available due to licensing issues with vendors like Oracle. So, if you use NiFi, you need to write your queries smartly to catch the changes and limit the impact on the source databases.&lt;/P&gt;&lt;P&gt;2) Another good option is Attunity, but that comes at a higher cost.&lt;/P&gt;&lt;P&gt;3) I have seen others using Spark for your use case.&lt;/P&gt;&lt;P&gt;4) I am doing some guesswork here, because I don't have the info from your 4 previous questions. As I recall, incremental data import is supported via sqoop job and not directly via sqoop import. I see that you did it as a job, but could it be a typo in your syntax? I see a space between "dash-dash" and "import" &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Joking, but you may want to check. I have seen strange messages out of Sqoop.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html" target="_blank"&gt;https://community.hortonworks.com/questions/131541/is-sqoop-incremental-load-possible-for-hive-orc-ta.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Could you point to the URLs of your previous questions, or clarify and add more info to the current question?&lt;/P&gt;&lt;P&gt;5) Regarding "Also, I use "--hcatalog-storage-stanza 'stored as orc tblproperties ("orc.compress"="SNAPPY")'" for non-incremental loads, and I was told that this would be working soon. Is it still not?": I need a little more context, or maybe this was already covered in another question you submitted and I can check.&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 04:54:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202301#M83743</guid>
      <dc:creator>cstanca</dc:creator>
      <dc:date>2018-09-22T04:54:55Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202302#M83744</link>
      <description>&lt;P&gt;Here is one of the posts showing that Hive incremental import into ORC is possible using Sqoop, so why is it not working for me? I was using the correct syntax, with no space between the dash-dash and "import".&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/58015/sqoop-hcataloghive-incremental-import-in-orc-forma.html"&gt;https://community.hortonworks.com/questions/58015/sqoop-hcataloghive-incremental-import-in-orc-forma.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Kindly check my other post. I am trying to follow the link you posted earlier but am getting errors.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/223262/malformed-orc-file-format.html"&gt;https://community.hortonworks.com/questions/223262/malformed-orc-file-format.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 08:18:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202302#M83744</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-22T08:18:31Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202303#M83745</link>
      <description>&lt;P&gt;Please see the syntax in the attached screenshot. It's not complaining about the dash-dash, but it doesn't like --append-mode with HCatalog.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="91527-capture1.jpg" style="width: 1674px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/17363iE2583C068F9EA777/image-size/medium?v=v2&amp;amp;px=400" role="button" title="91527-capture1.jpg" alt="91527-capture1.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 05:21:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202303#M83745</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2019-08-18T05:21:56Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202304#M83746</link>
      <description>&lt;P&gt;I read your Data Access document and I think it's for cases where you don't have a date check column, since it does many operations like merging tables, purging, compacting, deleting, etc.&lt;/P&gt;&lt;P&gt;Why would I do all this when I can just:&lt;/P&gt;&lt;P&gt;1- import the whole base table as ORC&lt;/P&gt;&lt;P&gt;2- bring in the incrementals as text into an externally mapped table&lt;/P&gt;&lt;P&gt;3- insert into the base ORC table, selecting everything from the incremental table&lt;/P&gt;&lt;P&gt;4- delete all the files in the external table folder&lt;/P&gt;&lt;P&gt;I tested this method and it's working fine.&lt;/P&gt;&lt;P&gt;Is there any flaw in this method that I am not seeing?&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 11:36:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202304#M83746</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-22T11:36:46Z</dc:date>
    </item>
    <item>
      <title>Re: 5th attempt at getting an answer to this question</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202305#M83747</link>
      <description>&lt;P&gt;I just noticed there is a small note at the top saying:&lt;/P&gt;&lt;P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;IMG alt="[Note&amp;gt;" src="https://ip1.i.lithium.com/43e5f9d7d5a0de68e75338a220a8232d6b536d09/68747470733a2f2f646f63732e686f72746f6e776f726b732e636f6d2f484450446f63756d656e74732f484450322f4844502d322e362e302f626b5f646174612d6163636573732f636f6d6d6f6e2f696d616765732f61646d6f6e2f6e6f74652e706e67" /&gt;&lt;/TD&gt;&lt;TH&gt;Note&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;This procedure requires change data capture from the operational database that has a primary key and modified date field where you pulled the records from since the last update.&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;/P&gt;&lt;P&gt;We don't have CDC on our database, so we can't do incremental imports? It should be possible by looking at the date field, as that's constantly increasing?&lt;/P&gt;</description>
      <pubDate>Sun, 23 Sep 2018 09:49:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/5th-attempt-at-getting-an-answer-to-this-question/m-p/202305#M83747</guid>
      <dc:creator>aliyesami</dc:creator>
      <dc:date>2018-09-23T09:49:06Z</dc:date>
    </item>
  </channel>
</rss>

