<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Creating a partitioned table in hive with sqoop in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Creating-a-partitioned-table-in-hive-with-sqoop/m-p/295550#M217806</link>
    <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;I want the Hive table created by sqoop to have two partitions:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One for 2019 and another for 2020.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But for some reason it is putting "2019,2020" in the year column.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Command:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sqoop import \&lt;BR /&gt;-Dhadoop.security.credential.provider.path=jceks://hdfs/user/lrm0613/mydb2.jceks \&lt;BR /&gt;--connection-manager org.apache.sqoop.manager.SQLServerManager \&lt;BR /&gt;--driver net.sourceforge.jtds.jdbc.Driver \&lt;BR /&gt;--connect 'jdbc:jtds:sqlserver://SQLQP002:1433;useNTLMv2=true;domain=JNL_NT;databaseName=TC31Reporting' \&lt;BR /&gt;--username 'lrm0613' \&lt;BR /&gt;--password-alias sqlserver2.password \&lt;BR /&gt;--query 'select jobhistoryid, requeueid, jobid,name,event,eventtime,submittime,starttime,service,year(EventTime) as year from JobHistory where year(EventTime)=2019 or year(EventTime)=2020 and $CONDITIONS' \&lt;BR /&gt;--hcatalog-database dataengsandbox \&lt;BR /&gt;--hcatalog-table JobHistoryPartitioned \&lt;BR /&gt;--hive-partition-key year \&lt;BR /&gt;--hive-partition-value 2019,2020 \&lt;BR /&gt;--compress \&lt;BR /&gt;--compression-codec snappy \&lt;BR /&gt;--map-column-hive jobid=int \&lt;BR /&gt;-m 1 \&lt;BR /&gt;--create-hcatalog-table \&lt;BR /&gt;--hcatalog-storage-stanza 'stored as parquet'&lt;/P&gt;</description>
    <pubDate>Wed, 06 May 2020 21:15:50 GMT</pubDate>
    <dc:creator>Heri</dc:creator>
    <dc:date>2020-05-06T21:15:50Z</dc:date>
    <item>
      <title>Creating a partitioned table in hive with sqoop</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-a-partitioned-table-in-hive-with-sqoop/m-p/295550#M217806</link>
      <description>&lt;P&gt;Hi guys,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;I want the Hive table created by sqoop to have two partitions:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One for 2019 and another for 2020.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But for some reason it is putting "2019,2020" in the year column.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Command:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;sqoop import \&lt;BR /&gt;-Dhadoop.security.credential.provider.path=jceks://hdfs/user/lrm0613/mydb2.jceks \&lt;BR /&gt;--connection-manager org.apache.sqoop.manager.SQLServerManager \&lt;BR /&gt;--driver net.sourceforge.jtds.jdbc.Driver \&lt;BR /&gt;--connect 'jdbc:jtds:sqlserver://SQLQP002:1433;useNTLMv2=true;domain=JNL_NT;databaseName=TC31Reporting' \&lt;BR /&gt;--username 'lrm0613' \&lt;BR /&gt;--password-alias sqlserver2.password \&lt;BR /&gt;--query 'select jobhistoryid, requeueid, jobid,name,event,eventtime,submittime,starttime,service,year(EventTime) as year from JobHistory where year(EventTime)=2019 or year(EventTime)=2020 and $CONDITIONS' \&lt;BR /&gt;--hcatalog-database dataengsandbox \&lt;BR /&gt;--hcatalog-table JobHistoryPartitioned \&lt;BR /&gt;--hive-partition-key year \&lt;BR /&gt;--hive-partition-value 2019,2020 \&lt;BR /&gt;--compress \&lt;BR /&gt;--compression-codec snappy \&lt;BR /&gt;--map-column-hive jobid=int \&lt;BR /&gt;-m 1 \&lt;BR /&gt;--create-hcatalog-table \&lt;BR /&gt;--hcatalog-storage-stanza 'stored as parquet'&lt;/P&gt;</description>
      <pubDate>Wed, 06 May 2020 21:15:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-a-partitioned-table-in-hive-with-sqoop/m-p/295550#M217806</guid>
      <dc:creator>Heri</dc:creator>
      <dc:date>2020-05-06T21:15:50Z</dc:date>
    </item>
    <item>
      <title>Re: Creating a partitioned table in hive with sqoop</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Creating-a-partitioned-table-in-hive-with-sqoop/m-p/296488#M218249</link>
      <description>&lt;P&gt;Sqoop can only insert into a single Hive partition at one time. To accomplish what you are trying to do, you can have two separate sqoop commands:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;sqoop with --query ... where&amp;nbsp;&lt;SPAN&gt;year(EventTime)=2019 &lt;STRONG&gt;(remove&amp;nbsp;year(EventTime)=2020)&amp;nbsp;&lt;/STRONG&gt;and set&amp;nbsp;--hive-partition-value 2019 &lt;STRONG&gt;(not 2020)&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;sqoop with --query ... where&amp;nbsp;&lt;SPAN&gt;year(EventTime)=2020&amp;nbsp;&lt;STRONG&gt;(remove&amp;nbsp;year(EventTime)=2019)&amp;nbsp;&lt;/STRONG&gt;and set&amp;nbsp;--hive-partition-value 2020&amp;nbsp;&lt;STRONG&gt;(not 2019)&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;This way sqoop will write into the one partition you want. Since this is one-time import, the solution should work just fine. Let me know if this works and accept the answer if it makes sense.&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 19:51:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Creating-a-partitioned-table-in-hive-with-sqoop/m-p/296488#M218249</guid>
      <dc:creator>aakulov</dc:creator>
      <dc:date>2020-05-22T19:51:00Z</dc:date>
    </item>
  </channel>
</rss>

