New Contributor
Posts: 1
Registered: 06-09-2015

sqoop and --hive-import how to define destination folder for hive/warehouse

When using sqoop import with the --hive-import switch as well as the --warehouse-dir switch, I am expecting the files to end up in subdirectories under the --warehouse-dir location. However, they do not; instead they land under /user/hive/database/...

 

Here's an example sqoop command:

 

sqoop import --connect "jdbc:sqlserver://10.0.0.10:1433;database=ENP_DATAWAREHOUSE_PDB" --username foo --password bar --table Job --compression-codec snappy --num-mappers 20 --hive-import --hive-database staging --create-hive-table --verbose --hive-overwrite --hive-drop-import-delims  --warehouse-dir /enp/svc/data/staging

 

And in the output, I even see this line:

 

15/06/09 10:43:27 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://nameservice1/enp/svc/data/staging/Job' OVERWRITE INTO TABLE `staging`.`Job`


In the end the table works; it's just not where I want it. Am I misunderstanding how to do this?

 

Thanks.

 

Michael

 

abe
Cloudera Employee
Posts: 109
Registered: 08-08-2013

Re: sqoop and --hive-import how to define destination folder for hive/warehouse

I believe --warehouse-dir only controls where Sqoop stages the data in HDFS before it is loaded into Hive. When Hive runs the LOAD DATA statement (the one you see in your output), the files are moved again, into the table's own location under the Hive warehouse directory.
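
To confirm where the files actually ended up after that move, one quick check (not from this thread, just a sketch using the database and table names from your command) is to ask Hive for the table's location and then list that path:

hive -e "DESCRIBE FORMATTED staging.Job" | grep -i location
hadoop fs -ls <the Location value printed above>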

For details on where the data is staged in HDFS before the Hive load, see http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_connecting_to_a_database_server
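
If the goal is for the final table data to live under /enp/svc/data/staging rather than the default warehouse path, one approach (a sketch, not something Sqoop does for you; the /tmp path below is just a hypothetical staging area) is to create the Hive database with an explicit LOCATION before importing, so the LOAD DATA step moves the files there:

hive -e "CREATE DATABASE IF NOT EXISTS staging LOCATION '/enp/svc/data/staging'"

sqoop import --connect "jdbc:sqlserver://10.0.0.10:1433;database=ENP_DATAWAREHOUSE_PDB" --username foo --password bar --table Job --num-mappers 20 --hive-import --hive-database staging --create-hive-table --hive-overwrite --hive-drop-import-delims --warehouse-dir /tmp/sqoop-staging

With the database created that way, its tables default to subdirectories of /enp/svc/data/staging, and --warehouse-dir only needs to point at a temporary staging area. Note that if the staging database already exists at the default location, it would have to be recreated (or the table itself created with its own LOCATION) for this to take effect.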