<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Sqoop incremental: Output directory already exists in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/88040#M2046</link>
    <description>&lt;P&gt;Is there any solution to this issue?&lt;/P&gt;&lt;P&gt;Using --append in place of --incremental lastmodified is not the correct solution, as it won't update the record but will create a new record in Hive.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;--delete-target-dir defeats the purpose of updating data, as it creates a new directory every time, which is the same as importing the entire source table into HDFS/Hive every time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried using --merge-key, but it gives the following error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;19/03/20 07:07:41 ERROR tool.ImportTool: Import failed: java.io.IOException: Could not load jar /tmp/sqoop-gfctwnsg/compile/c63dd58c7ae7aa383d4fe8e795fd8604/FRESH.EMPLOYEERUSHI.jar into JVM. (Could not find class FRESH.EMPLOYEERUSHI.)&lt;BR /&gt;at org.apache.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:92)&lt;BR /&gt;at com.cloudera.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:36)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.loadJars(ImportTool.java:120)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.lastModifiedMerge(ImportTool.java:456)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:522)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)&lt;BR /&gt;at org.apache.sqoop.Sqoop.run(Sqoop.java:147)&lt;BR /&gt;at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)&lt;BR /&gt;at org.apache.sqoop.Sqoop.main(Sqoop.java:252)&lt;BR /&gt;Caused by: java.lang.ClassNotFoundException: FRESH.EMPLOYEERUSHI&lt;BR /&gt;at java.net.URLClassLoader.findClass(URLClassLoader.java:381)&lt;BR /&gt;at java.lang.ClassLoader.loadClass(ClassLoader.java:424)&lt;BR /&gt;at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)&lt;BR /&gt;at 
java.lang.ClassLoader.loadClass(ClassLoader.java:357)&lt;BR /&gt;at java.lang.Class.forName0(Native Method)&lt;BR /&gt;at java.lang.Class.forName(Class.java:348)&lt;BR /&gt;at org.apache.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:88)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My sqoop command is as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sqoop import \&lt;BR /&gt;--connect "jdbc:oracle:thin:@oraasmwd17-scan.nam.nsroot.net:8889/GENIFRD" \&lt;BR /&gt;--username FRESH \&lt;BR /&gt;--password C1T12016 \&lt;BR /&gt;--table FRESH.EMPLOYEERUSHI \&lt;BR /&gt;--merge-key id \&lt;BR /&gt;--target-dir /data/gfctwnsg/staging/hive/gfctwnsg_staging/rp86813/sqoopimportdir \&lt;BR /&gt;--incremental lastmodified \&lt;BR /&gt;--check-column MODIFIED_DATE \&lt;BR /&gt;--last-value '2019-03-20 06:43:59.0'&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My source Oracle table is as follows:&lt;BR /&gt;&lt;BR /&gt;1 Rushi Pradhan engineer 30000 18-MAR-19&lt;BR /&gt;2 abc xyz doctor 20000 18-MAR-19&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I changed the salary for id = 1 and updated the corresponding date manually.&lt;/P&gt;&lt;P&gt;Now I want this change reflected on the Hive side as well.&lt;/P&gt;&lt;P&gt;But Sqoop only lets me append the record, not update it.&lt;/P&gt;</description>
    <pubDate>Wed, 20 Mar 2019 12:07:36 GMT</pubDate>
    <dc:creator>ruship</dc:creator>
    <dc:date>2019-03-20T12:07:36Z</dc:date>
    <item>
      <title>Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/13992#M2038</link>
      <description>&lt;P&gt;Hi all!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a seemingly simple use case for Sqoop: incrementally import data from a MySQL db into HDFS. At first I tried Sqoop2, but it seems Sqoop2 doesn't support incremental imports yet. Am I correct in this? (Sqoop2 did imports fine btw)&lt;/P&gt;&lt;P&gt;Then I tried to use Sqoop (1) and figured out I need to create a&amp;nbsp;&lt;EM&gt;job&lt;/EM&gt; so Sqoop can automatically update stuff like the&amp;nbsp;&lt;EM&gt;last value&lt;/EM&gt; for me. This is the command I used to create a job:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times"&gt;sqoop job --create import-test -- import --connect jdbc:mysql://10.211.55.1/test_sqoop --username root -P --table test_incr_update --target-dir /user/vagrant/sqooptest --check-column updated --incremental lastmodified&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I run it for the first time, it works great. When I run it for the second time, I get:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT face="andale mono,times"&gt;ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://vm-cluster-node1:8020/user/vagrant/sqooptest already exists&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I could of course remove the target dir before running the import again, but that would defeat the whole purpose of only getting the newer data and merging it with the old data (for which the old data needs to be present, I assume)!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is this a (known) bug? Or am I doing something wrong? Any help would be appreciated &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:00:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/13992#M2038</guid>
      <dc:creator>Daan</dc:creator>
      <dc:date>2022-09-16T09:00:52Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14036#M2039</link>
      <description>Sqoop2 does not support incremental imports just yet (&lt;A target="_blank" href="https://issues.apache.org/jira/browse/SQOOP-1168"&gt;https://issues.apache.org/jira/browse/SQOOP-1168&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;It looks like the command you're running is creating a saved job. Have you tried just executing the saved job (&lt;A target="_blank" href="http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_saved_jobs"&gt;http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_saved_jobs&lt;/A&gt;)? Seems like this is achievable via: sqoop job --exec import-test.&lt;BR /&gt;&lt;BR /&gt;You don't need to create a job to perform incremental imports (&lt;A target="_blank" href="http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_incremental_imports"&gt;http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_incremental_imports&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;-Abe</description>
      <pubDate>Mon, 23 Jun 2014 18:20:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14036#M2039</guid>
      <dc:creator>abe</dc:creator>
      <dc:date>2014-06-23T18:20:19Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14062#M2040</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/360"&gt;@abe&lt;/a&gt; wrote:&lt;BR /&gt;Sqoop2 does not support incremental imports just yet (&lt;A target="_blank" href="https://issues.apache.org/jira/browse/SQOOP-1168"&gt;https://issues.apache.org/jira/browse/SQOOP-1168&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;It looks like the command you're running is creating a saved job. Have you tried just executing the saved job (&lt;A target="_blank" href="http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_saved_jobs"&gt;http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_saved_jobs&lt;/A&gt;)? Seems like this is achievable via: sqoop job --exec import-test.&lt;BR /&gt;&lt;BR /&gt;You don't need to create a job to perform incremental imports (&lt;A target="_blank" href="http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_incremental_imports"&gt;http://sqoop.apache.org/docs/1.4.4/SqoopUserGuide.html#_incremental_imports&lt;/A&gt;).&lt;BR /&gt;&lt;BR /&gt;-Abe&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;FONT face="andale mono,times"&gt;sqoop job --exec import-test&lt;/FONT&gt; is actually how I ran the saved job. As I said, the first run works fine; the second run complains that the output dir already exists.&lt;/P&gt;&lt;P&gt;I used a saved job because of the promise that it will keep track of and autofill the&amp;nbsp;&lt;EM&gt;last value&lt;/EM&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jun 2014 07:39:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14062#M2040</guid>
      <dc:creator>Daan</dc:creator>
      <dc:date>2014-06-24T07:39:19Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14104#M2041</link>
      <description>Hey there,&lt;BR /&gt;&lt;BR /&gt;It seems you shouldn't have to do this, but adding "--append" to your command might help.&lt;BR /&gt;&lt;BR /&gt;-Abe</description>
      <pubDate>Tue, 24 Jun 2014 17:40:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14104#M2041</guid>
      <dc:creator>abe</dc:creator>
      <dc:date>2014-06-24T17:40:38Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14108#M2042</link>
      <description>By the way, what version of CDH are you using?</description>
      <pubDate>Tue, 24 Jun 2014 17:46:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14108#M2042</guid>
      <dc:creator>abe</dc:creator>
      <dc:date>2014-06-24T17:46:11Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14116#M2043</link>
      <description>Also, it looks like &lt;A target="_blank" href="https://issues.apache.org/jira/browse/SQOOP-1138"&gt;https://issues.apache.org/jira/browse/SQOOP-1138&lt;/A&gt; exists to address this concern.</description>
      <pubDate>Tue, 24 Jun 2014 20:36:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14116#M2043</guid>
      <dc:creator>abe</dc:creator>
      <dc:date>2014-06-24T20:36:48Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14144#M2044</link>
      <description>&lt;P&gt;I'm using CDH 5.0.2 together with Cloudera Manager 5.0.2. I think the Sqoop issue you linked is exactly the problem I'm having. I shouldn't have to add&amp;nbsp;&lt;EM&gt;--append&amp;nbsp;&lt;/EM&gt;because I'm already using&amp;nbsp;&lt;EM&gt;lastmodified&lt;/EM&gt;, which is the other&amp;nbsp;&lt;EM&gt;incremental&lt;/EM&gt; mode.&lt;/P&gt;&lt;P&gt;As long as SQOOP-1138 isn't fixed, Sqoop will be rather useless to me &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; The only alternative seems to be to export the whole database each time and replace the old data with the new export.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jun 2014 07:38:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/14144#M2044</guid>
      <dc:creator>Daan</dc:creator>
      <dc:date>2014-06-25T07:38:48Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/85700#M2045</link>
      <description>&lt;P&gt;You can use "--delete-target-dir" when running the sqoop command:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sqoop import --connect jdbc:mysql://1.2.3.4:1234/retailfgh --username 12345 --password 12345 --table departments&lt;FONT size="7" color="#000000"&gt;&lt;STRONG&gt; --delete-target-dir&amp;nbsp;&lt;/STRONG&gt;&lt;/FONT&gt; --target-dir /poc/sqoop_destination --fields-terminated-by "~"&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 31 Jan 2019 06:48:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/85700#M2045</guid>
      <dc:creator>Harkirat</dc:creator>
      <dc:date>2019-01-31T06:48:02Z</dc:date>
    </item>
    <item>
      <title>Re: Sqoop incremental: Output directory already exists</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/88040#M2046</link>
      <description>&lt;P&gt;Is there any solution to this issue?&lt;/P&gt;&lt;P&gt;Using --append in place of --incremental lastmodified is not the correct solution, as it won't update the record but will create a new record in Hive.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;--delete-target-dir defeats the purpose of updating data, as it creates a new directory every time, which is the same as importing the entire source table into HDFS/Hive every time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried using --merge-key, but it gives the following error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;19/03/20 07:07:41 ERROR tool.ImportTool: Import failed: java.io.IOException: Could not load jar /tmp/sqoop-gfctwnsg/compile/c63dd58c7ae7aa383d4fe8e795fd8604/FRESH.EMPLOYEERUSHI.jar into JVM. (Could not find class FRESH.EMPLOYEERUSHI.)&lt;BR /&gt;at org.apache.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:92)&lt;BR /&gt;at com.cloudera.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:36)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.loadJars(ImportTool.java:120)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.lastModifiedMerge(ImportTool.java:456)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:522)&lt;BR /&gt;at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)&lt;BR /&gt;at org.apache.sqoop.Sqoop.run(Sqoop.java:147)&lt;BR /&gt;at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)&lt;BR /&gt;at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)&lt;BR /&gt;at org.apache.sqoop.Sqoop.main(Sqoop.java:252)&lt;BR /&gt;Caused by: java.lang.ClassNotFoundException: FRESH.EMPLOYEERUSHI&lt;BR /&gt;at java.net.URLClassLoader.findClass(URLClassLoader.java:381)&lt;BR /&gt;at java.lang.ClassLoader.loadClass(ClassLoader.java:424)&lt;BR /&gt;at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:814)&lt;BR /&gt;at 
java.lang.ClassLoader.loadClass(ClassLoader.java:357)&lt;BR /&gt;at java.lang.Class.forName0(Native Method)&lt;BR /&gt;at java.lang.Class.forName(Class.java:348)&lt;BR /&gt;at org.apache.sqoop.util.ClassLoaderStack.addJarFile(ClassLoaderStack.java:88)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My sqoop command is as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sqoop import \&lt;BR /&gt;--connect "jdbc:oracle:thin:@oraasmwd17-scan.nam.nsroot.net:8889/GENIFRD" \&lt;BR /&gt;--username FRESH \&lt;BR /&gt;--password C1T12016 \&lt;BR /&gt;--table FRESH.EMPLOYEERUSHI \&lt;BR /&gt;--merge-key id \&lt;BR /&gt;--target-dir /data/gfctwnsg/staging/hive/gfctwnsg_staging/rp86813/sqoopimportdir \&lt;BR /&gt;--incremental lastmodified \&lt;BR /&gt;--check-column MODIFIED_DATE \&lt;BR /&gt;--last-value '2019-03-20 06:43:59.0'&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My source Oracle table is as follows:&lt;BR /&gt;&lt;BR /&gt;1 Rushi Pradhan engineer 30000 18-MAR-19&lt;BR /&gt;2 abc xyz doctor 20000 18-MAR-19&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I changed the salary for id = 1 and updated the corresponding date manually.&lt;/P&gt;&lt;P&gt;Now I want this change reflected on the Hive side as well.&lt;/P&gt;&lt;P&gt;But Sqoop only lets me append the record, not update it.&lt;/P&gt;</description>
      <pubDate>Wed, 20 Mar 2019 12:07:36 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sqoop-incremental-Output-directory-already-exists/m-p/88040#M2046</guid>
      <dc:creator>ruship</dc:creator>
      <dc:date>2019-03-20T12:07:36Z</dc:date>
    </item>
  </channel>
</rss>

