Created 07-01-2016 06:04 AM
Whenever I use the --hive-import argument, I also specify a --warehouse-dir in my Sqoop jobs.
Now, when I check my Hive tables, the data is indeed there, but my question is: why do I not see any files in the warehouse dir using the hadoop fs -ls command?
Sure, I do see them when I replace --hive-import and --warehouse-dir with --target-dir.
How does it work? What are the advantages of one over the other?
Created 07-01-2016 06:31 AM
@Simran Kaur With Sqoop's --hive-import, the --warehouse-dir directory is a temporary HDFS location that collects the imported data. Sqoop then moves the data (the metadata of the files) to hive.warehouse.dir (generally /apps/hive/warehouse, as specified in hive-site.xml).
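The staging behaviour described above can be sketched as a short shell session. This is only an illustration under assumptions: the JDBC URL, user name, password file, table name (orders), and staging path (/tmp/sqoop_staging) are hypothetical, and the table is assumed to land in Hive's default database. The script only prints the commands unless DRY_RUN=0 is set on a node where sqoop and hadoop are installed.

```shell
#!/bin/sh
# Sketch only: connection details, table name, and paths below are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
STAGING_DIR="/tmp/sqoop_staging"        # value passed to --warehouse-dir
HIVE_WAREHOUSE="/apps/hive/warehouse"   # hive.warehouse.dir from hive-site.xml
TABLE="orders"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 1. Sqoop first writes the rows under $STAGING_DIR/$TABLE, then loads them into Hive.
run sqoop import \
  --connect "jdbc:mysql://dbhost/magentodb" \
  --username sqoop_user \
  --password-file /user/sqoop/.db-password \
  --table "$TABLE" \
  --hive-import \
  --warehouse-dir "$STAGING_DIR"

# 2. After the load, the staging path is expected to be empty or gone...
run hadoop fs -ls "$STAGING_DIR/$TABLE"

# 3. ...because the files now sit under the Hive warehouse directory.
run hadoop fs -ls "$HIVE_WAREHOUSE/$TABLE"
```

Comparing the two listings after a real run should show where the files ended up, which is why hadoop fs -ls on the --warehouse-dir path comes back empty.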
Created 07-01-2016 06:42 AM
Awesome. Thank you!
Created 07-01-2016 07:00 AM
So if I replace --warehouse-dir with --target-dir, it would permanently store the files in the target-dir location, and then I can have my tables mapped to this location as external tables? @Dileep Kumar Chiguruvada
Created 07-01-2016 06:49 AM
Can you please run describe formatted <hivetablename> on the Hive table that was created and check the location of the Hive data?
It seems the data is being written to a different directory and --warehouse-dir is not taking effect.
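A minimal sketch of that check from the shell, assuming the Hive CLI is available on the node. The table name magentodb.orders is a hypothetical placeholder, and the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the database and table names are hypothetical placeholders.
DRY_RUN="${DRY_RUN:-1}"
HIVE_TABLE="magentodb.orders"
QUERY="DESCRIBE FORMATTED ${HIVE_TABLE};"

if [ "$DRY_RUN" = "1" ]; then
  # Show the command that would be run on a node with the Hive CLI.
  echo "hive -e \"$QUERY\""
else
  # Keep only the row(s) mentioning the table's HDFS location.
  hive -e "$QUERY" | grep -i 'Location'
fi
```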
Thanks and Regards,
Sindhu
Created 07-01-2016 07:15 AM
You are right. It shows the table location as
hdfs://FQDN:8020/user/hive/warehouse/magentodb.db/TABLENAME
Created 07-01-2016 07:16 AM
Why would it ignore the argument? I tried it with --target-dir as well, and that did not work either. @Sindhu
Created 07-01-2016 07:19 AM
I believe it is because of the --hive-import argument? I could remove that, but I have to use the --hive-overwrite argument, and I can't use it unless I use --hive-import. @Sindhu So, how do I use --hive-overwrite while using --warehouse-dir/--target-dir?
Created 07-01-2016 07:38 AM
--target-dir is the option used when importing table data into HDFS with the Sqoop import tool, and it might not work with --hive-import.
As @Dileep Kumar Chiguruvada explained earlier, the value of the Hive warehouse directory will be picked up from hive-site.xml.
Thanks and Regards,
Sindhu
Created 07-01-2016 07:41 AM
@Sindhu: Got it. But I do not want the data moved out of the warehouse-dir/target-dir. Is there a solution for that, or do I need to do it separately, without the Hive import option, to keep it in HDFS? Also, this link suggests using HCatalog: http://grokbase.com/t/sqoop/user/143waxddrr/jira-commented-sqoop-1293-hive-import-causes-target-dir-... Is it really a solution to the problem?
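One way the separate, non-Hive import asked about above could look is sketched below: a plain HDFS import into --target-dir, followed by an external Hive table pointing at that directory. This is an untested sketch under assumptions, not a confirmed answer from the thread. The connection details, target path, table name, and column list are hypothetical, and --delete-target-dir is used here as a stand-in for the overwrite behaviour of --hive-overwrite. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: connection details, table, columns, and paths are hypothetical.
DRY_RUN="${DRY_RUN:-1}"
TARGET_DIR="/data/magentodb/orders"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Plain HDFS import: no --hive-import, so the files stay in $TARGET_DIR.
# --delete-target-dir removes the previous run's files before importing again.
run sqoop import \
  --connect "jdbc:mysql://dbhost/magentodb" \
  --username sqoop_user \
  --password-file /user/sqoop/.db-password \
  --table orders \
  --target-dir "$TARGET_DIR" \
  --delete-target-dir \
  --fields-terminated-by ','

# External table mapped onto the same directory; the delimiter matches the import.
DDL="CREATE EXTERNAL TABLE IF NOT EXISTS magentodb.orders (
  order_id INT,
  status STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '$TARGET_DIR';"

run hive -e "$DDL"
```

Because the table is external, dropping it in Hive would leave the files under the target directory untouched.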