
sqoop insert into partitioned table

Explorer

Hi, I am looking to run two sqoop commands.

1. Pull the entire contents of a table and insert them into a partitioned Hive table using sqoop:

sqoop import \
  --connect jdbc:oracle:thin:@//jfadboc1.jfa.unibet.com:1521/xxx \
  --username xxx --password xxx \
  --table DW_FACT_PUNTER_TEMP \
  --split-by TIME_KEY \
  --target-dir unica/data/DW_FACT_PUNTER_TEST \
  --hive-import --hive-overwrite --hive-drop-import-delims \
  --null-non-string '\\N' --null-string '\\N' \
  --hive-table unica.DW_FACT_PUNTER_TEST_TEMP \
  --hive-partition-key "TIME_YEAR","TIME_MONTH","TIME_DAY"

2. Do an incremental load in sqoop and insert into the partitioned table:

sqoop import \
  --connect jdbc:oracle:thin:@//jfadboc1.jfa.unibet.com:1521/xxx \
  --username xxx --password xxx \
  --query "SELECT TIME_KEY,PUNTER_KEY,PRODUCT_KEY,INDICATOR_KEY,INDICATOR_VALUE,INSERT_TIME FROM DW_FACT_PUNTER_TEST WHERE \$CONDITIONS AND (TIME_KEY >= 20010101)" \
  --split-by TIME_KEY \
  --target-dir unica/data/FACT_PUNTER_IUD_UNICA_INCR \
  --hive-import --hive-overwrite --hive-drop-import-delims \
  --null-non-string '\\N' --null-string '\\N' \
  --hive-table unica.DW_FACT_PUNTER_TEST_TEMP \
  --hive-partition-key "TIME_YEAR"

My table structure is below:

CREATE TABLE IF NOT EXISTS DW_FACT_PUNTER_TEST_TEMP
(PUNTER_KEY DECIMAL(36,2), 
PRODUCT_KEY DECIMAL(36,2), 
INDICATOR_KEY DECIMAL(36,2), 
INDICATOR_VALUE DECIMAL(36,2), 
INSERT_TIME STRING) 
PARTITIONED BY(TIME_YEAR STRING, TIME_MONTH STRING,TIME_DAY STRING);

Neither of these sqoop commands works.

Can you please help me understand how to use --hive-partition-key for a bulk import when more than one column is used to partition the table?

1 ACCEPTED SOLUTION

Rising Star

@Parinita Kothari To import into a multi-key partitioned Hive table, you can use --hcatalog-table:

--table <tblname> --hcatalog-table <tblname> --hcatalog-partition-keys k1,k2 --hcatalog-partition-values 'v1,v2'

e.g.: --table DW_FACT_PUNTER_TEST_TEMP --hcatalog-table DW_FACT_PUNTER_TEST_TEMP --hcatalog-partition-keys TIME_YEAR,TIME_MONTH --hcatalog-partition-values '2014,APR'
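Putting the accepted answer together with the connection details from the question, a full invocation might look like the sketch below. This is untested against a live cluster; the host, credentials, database name, and the partition values are placeholders carried over from the thread, not verified values.

```sh
# Sketch: static multi-key partition import via HCatalog.
# Note: --target-dir and --hive-import are not used with --hcatalog-table;
# HCatalog writes directly into the table's partition location.
sqoop import \
  --connect jdbc:oracle:thin:@//jfadboc1.jfa.unibet.com:1521/xxx \
  --username xxx --password xxx \
  --table DW_FACT_PUNTER_TEST_TEMP \
  --split-by TIME_KEY \
  --hcatalog-database unica \
  --hcatalog-table DW_FACT_PUNTER_TEST_TEMP \
  --hcatalog-partition-keys TIME_YEAR,TIME_MONTH,TIME_DAY \
  --hcatalog-partition-values '2014,APR,11'
```

With static partition values like these, every imported row lands in the single partition named by --hcatalog-partition-values, so this form suits one-partition-at-a-time loads rather than a full historical import.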


4 REPLIES

Explorer

Thank you @Dileep Kumar Chiguruvada for your quick response.

Is it necessary to hardcode the value for --hcatalog-partition-values?

The table I am pulling data from (in Oracle) has data going back to 2001-01-01, so it would be nice to be able to use dynamic partitions here.

Rising Star

@Parinita Kothari Yes, you can do that too.

Create a partitioned Hive table and set hive.exec.dynamic.partition.mode=nonstrict for the dynamic-partition case, then import the table from the DB into the partitioned table.

Here you need not specify partition keys and values; just --table <tblname> --hcatalog-table <tblname> is enough.
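As a sketch of this dynamic-partition variant (untested; connection details carried over from the question): the partition columns TIME_YEAR, TIME_MONTH, and TIME_DAY must exist as columns in the source table or query so HCatalog can route each row to its partition.

```sh
# Sketch: dynamic-partition import -- no --hcatalog-partition-keys/-values.
# Assumes hive.exec.dynamic.partition.mode=nonstrict is set on the Hive side
# and that the source rows carry TIME_YEAR/TIME_MONTH/TIME_DAY values.
sqoop import \
  --connect jdbc:oracle:thin:@//jfadboc1.jfa.unibet.com:1521/xxx \
  --username xxx --password xxx \
  --table DW_FACT_PUNTER_TEMP \
  --split-by TIME_KEY \
  --hcatalog-database unica \
  --hcatalog-table DW_FACT_PUNTER_TEST_TEMP
```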

Explorer

Thank you @Dileep Kumar Chiguruvada, it works now. I had to use sqoop to import the contents into a temp table (which wasn't partitioned) and then use this temp table to insert into the actual partitioned table. I couldn't find a direct way to use sqoop to ingest data straight into a table partitioned on more than one column.
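For reference, the second step of the staging approach described above might look like this in HiveQL. This is a sketch only; the staging table name DW_FACT_PUNTER_STAGE is a hypothetical placeholder, not a name from the thread, and it is assumed to hold all source columns including the three partition columns.

```sql
-- Sketch: insert from a non-partitioned staging table into the
-- multi-key partitioned target using dynamic partitioning.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE unica.DW_FACT_PUNTER_TEST_TEMP
PARTITION (TIME_YEAR, TIME_MONTH, TIME_DAY)
SELECT PUNTER_KEY, PRODUCT_KEY, INDICATOR_KEY, INDICATOR_VALUE, INSERT_TIME,
       TIME_YEAR, TIME_MONTH, TIME_DAY   -- partition columns must come last
FROM unica.DW_FACT_PUNTER_STAGE;         -- hypothetical staging table
```

Note that in a dynamic-partition insert the partition columns must be the final columns of the SELECT list, in the same order as the PARTITION clause.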
