Created 01-17-2018 07:20 AM
I read the following in the Sqoop documentation:
http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_hcatalog_background
The Sqoop HCatalog feature supports the following table types:
- Unpartitioned tables
- Partitioned tables with a static partitioning key specified
- Partitioned tables with dynamic partition keys from the database result set
- Partitioned tables with a combination of a static key and additional dynamic partitioning keys
But I am not clear on how to do a dynamic-partition import with Sqoop.
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --username root \
  --password-file file:///home/hdfs/.password \
  --split-by id \
  --table test \
  --delete-target-dir \
  --fields-terminated-by "," \
  --hcatalog-table test \
  --hcatalog-storage-stanza "stored as orcfile" \
  --create-hcatalog-table \
  --hcatalog-partition-keys created_at \
  --map-column-hive created_at=date,updated_at=date
I get this error:
Either both --hcatalog-partition-keys and --hcatalog-partition-values should be provided or both of these options should be omitted.
And if I do supply a partition value, it is applied as a static partition:
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --username root \
  --password-file file:///home/hdfs/.password \
  --split-by id \
  --table test \
  --delete-target-dir \
  --fields-terminated-by "," \
  --hcatalog-table test \
  --hcatalog-storage-stanza "stored as orcfile" \
  --create-hcatalog-table \
  --hcatalog-partition-keys created_at \
  --hcatalog-partition-values created_at \
  --map-column-hive created_at=date,updated_at=date
After passing created_at as the --hcatalog-partition-values value, Sqoop created the directory /apps/hive/warehouse/test/created_at=created_at, i.e. the literal string became a static partition value.
It should instead have used the created_at column to create partitions dynamically from the data.
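For what it's worth, this is roughly what I expected to have to run for a dynamic-partition import against a pre-created partitioned table (just my reading of the docs, not something I have working):

# assuming the table was pre-created in Hive along these lines:
#   create table test(id int, updated_at date) partitioned by (created_at string) stored as orc;
sqoop import \
  --connect jdbc:mysql://localhost:3306/test \
  --username root \
  --password-file file:///home/hdfs/.password \
  --split-by id \
  --table test \
  --hcatalog-table test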
Please help me fix this issue.
Created 09-30-2019 07:00 PM
Hi,
You can try the example below.
Data in source
###########
[root@host-10-17-103-77 ~]# sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "create table test_partition(id int,name varchar(30),par varchar(20))"
[root@host-10-17-103-77 ~]# sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into test_partition values(1,'nitish','par1')"
[root@host-10-17-103-77 ~]# sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into test_partition values(2,'mohit','par2')"
[root@host-10-17-103-77 ~]# sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "insert into test_partition values(3,'mohit123','par3')"
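To sanity-check the inserted rows, you can run a quick eval with the same connection variables:

sqoop eval --connect $MYCONN --username $MYUSER --password $MYPSWD --query "select * from test_partition"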
Table in Hive:-
############
0: jdbc:hive2://host-10-17-103-79.coe.clouder> create external table test_sqoop_par(id int,name string) partitioned by (par string) row format delimited fields terminated by '\t' location '/user/systest/test_sqoop_par';
NOTE:- Only string is supported as the partition column type.
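If the column you want to partition by is not a string in the source (for example a DATE like created_at in the original question), one possible workaround, sketched here and untested, is to cast it to a string in a free-form query so the Hive partition column can stay string (par_date below is a hypothetical DATE column):

sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD \
  --query "select id, name, cast(par_date as char) as par from test_partition where \$CONDITIONS" \
  --split-by id \
  --hcatalog-database default --hcatalog-table test_sqoop_par

Remember that with --query, Sqoop requires the \$CONDITIONS token and either --split-by or -m 1.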
Sqoop command:-
##############
[root@host-10-17-103-77 ~]# sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD --hcatalog-database default --hcatalog-table test_sqoop_par --table TEST_PARTITION -m1 --hive-partition-value par
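Note that the import above passes neither --hcatalog-partition-keys nor --hcatalog-partition-values: because the source table has a par column with the same name as the Hive table's partition column, HCatalog routes each row to its partition dynamically (the --hive-partition-value flag appears to be ignored here, as it belongs to the plain --hive-import path).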
Data in Hive:-
##########
0: jdbc:hive2://host-10-17-103-79.coe.clouder> show partitions test_sqoop_par;
+------------+
| partition  |
+------------+
| par=par1   |
| par=par2   |
| par=par3   |
+------------+
[root@host-10-17-103-77 ~]# hadoop fs -ls /user/systest/test_sqoop_par
Found 3 items
drwxr-xr-x   - systest supergroup          0 2019-09-30 18:49 /user/systest/test_sqoop_par/par=par1
drwxr-xr-x   - systest supergroup          0 2019-09-30 18:49 /user/systest/test_sqoop_par/par=par2
drwxr-xr-x   - systest supergroup          0 2019-09-30 18:49 /user/systest/test_sqoop_par/par=par3
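Since the original question was about ORC, the same pattern should carry over if the partitioned table is pre-created with ORC storage; a sketch along the same lines (test_sqoop_par_orc is a made-up table name, not verified here):

-- in beeline:
create table test_sqoop_par_orc(id int, name string) partitioned by (par string) stored as orc;

# then import without any static partition key/value options:
sqoop import --connect $MYCONN --username $MYUSER --password $MYPSWD \
  --hcatalog-database default --hcatalog-table test_sqoop_par_orc \
  --table TEST_PARTITION -m 1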
Hope the above helps.
Regards
Nitish
Created 06-12-2020 10:24 AM
Hi all,
I'm looking for a way to import data into ORC Hive tables and partition it dynamically.
I've seen that Sqoop can import and partition data dynamically by column values, but I've also seen that this only works on the first run: on subsequent job launches I get an 'org.apache.hive.hcatalog.common.HCatException : 2002 : Partition already present with given partition key values' error, since Sqoop attempts to append data to existing partitions.
Any idea how to resolve this issue, which makes it impossible to use dynamic partitioning in Sqoop for recurring imports?
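So far the only workaround I have found is to drop the partitions that are about to be re-imported before launching the job, e.g. from beeline (the partition value is just an example):

alter table test_sqoop_par drop if exists partition (par='par1');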