Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How does --hive-import work?

Expert Contributor

Whenever I use the --hive-import argument in my Sqoop jobs, I also specify --warehouse-dir.

When I check my Hive tables, the data is indeed there. My question is: why do I not see any files in the warehouse dir when I run hadoop fs -ls?

I do see them when I replace --hive-import and --warehouse-dir with --target-dir.

How does it work? What are the advantages of one over the other?

1 ACCEPTED SOLUTION

Expert Contributor

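@Simran Kaur With sqoop --hive-import, the --warehouse-dir directory is only a temporary HDFS staging location that collects the imported data. Once the import finishes, Sqoop moves the files (an HDFS move, so only file metadata changes) into the Hive warehouse directory, which is set by hive.metastore.warehouse.dir in hive-site.xml (generally /apps/hive/warehouse). That is why the files no longer show up under --warehouse-dir afterwards.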
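
To make the two steps concrete, here is a minimal dry-run sketch. The JDBC URL, user, password file, table name (orders) and staging path (/tmp/sqoop_staging) are placeholders I made up, not values from this thread. The script only prints the commands; set DRY_RUN=0 to actually run them on a cluster.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

TABLE="orders"                            # hypothetical source table
STAGING_PARENT="/tmp/sqoop_staging"       # hypothetical value for --warehouse-dir
STAGING_DIR="${STAGING_PARENT}/${TABLE}"  # Sqoop writes to <warehouse-dir>/<table>

# One Sqoop job does both steps: import into STAGING_DIR,
# then load (move) the files into the Hive table's location.
run sqoop import \
  --connect "jdbc:mysql://dbhost/magentodb" \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table "$TABLE" \
  --hive-import \
  --warehouse-dir "$STAGING_PARENT"

# After the job, the data files should no longer be in the staging directory.
run hadoop fs -ls "$STAGING_DIR"
```

If the listing comes back empty (or the directory is gone) while the Hive table has rows, that matches the behaviour described above.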


10 REPLIES

Expert Contributor

Awesome. Thank you!

Expert Contributor

So if I replace --warehouse-dir with --target-dir, would it permanently store the files in the target-dir location, so that I can then map my tables to this location as external tables? @Dileep Kumar Chiguruvada
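
For reference, the approach I am asking about could look roughly like the dry-run sketch below: a plain import (no --hive-import) into a fixed directory, then an external table pointing at it. The JDBC URL, user, password file, paths, table name and the two columns are all made up for illustration. I am assuming Sqoop's default comma-delimited text output and a Sqoop version that supports --delete-target-dir for re-runs. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

TABLE="orders"                        # hypothetical source table
TARGET_DIR="/data/magentodb/orders"   # hypothetical permanent HDFS location

# Plain HDFS import: without --hive-import the files stay in TARGET_DIR.
# --delete-target-dir clears the directory first, so a re-run replaces the data.
run sqoop import \
  --connect "jdbc:mysql://dbhost/magentodb" \
  --username sqoop_user \
  --password-file /user/sqoop/.pw \
  --table "$TABLE" \
  --target-dir "$TARGET_DIR" \
  --delete-target-dir

# External table over the same directory; the column list is hypothetical
# and must match the real source table.
DDL="CREATE EXTERNAL TABLE IF NOT EXISTS magentodb.${TABLE} (id INT, status STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '${TARGET_DIR}';"

run hive -e "$DDL"
```

Since the table is external, dropping it in Hive leaves the files in the target directory untouched.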

@Simran Kaur

Can you please run describe formatted <hivetablename> on the Hive table that was created and check the location of the Hive data?

It seems the data is being written to a different directory and --warehouse-dir is not taking effect.
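
A minimal dry-run sketch of that check, assuming a database named magentodb and a table named orders (both placeholders; substitute your own). It only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

DB="magentodb"   # placeholder database name
TABLE="orders"   # placeholder table name
QUERY="DESCRIBE FORMATTED ${DB}.${TABLE};"

# Look for the "Location:" row in the output; it shows where Hive keeps the data.
run hive -e "$QUERY"
```

The path shown in the Location row is the one to pass to hadoop fs -ls when looking for the files.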

Thanks and Regards,

Sindhu

Expert Contributor
@Sindhu

You are right. It shows the table location as:

hdfs://FQDN:8020/user/hive/warehouse/magentodb.db/TABLENAME

Expert Contributor

Why would it ignore the argument? I tried it with --target-dir as well, and that did not work either. @Sindhu

Expert Contributor

I believe it is because of the --hive-import argument. I could remove it, but I have to use the --hive-overwrite argument, and I can't use that without --hive-import. @Sindhu So how do I use --hive-overwrite together with --warehouse-dir / --target-dir?

@Simran Kaur

--target-dir is used when importing table data into HDFS with the Sqoop import tool, and it might not work with --hive-import.

As @Dileep Kumar Chiguruvada explained earlier, the Hive warehouse directory is picked up from hive-site.xml.

Thanks and Regards,

Sindhu

Expert Contributor

@Sindhu: Got it. But I do not want the data moved out of the warehouse-dir / target-dir. Is there a solution for that, or do I need to run the import separately, without the Hive import option, to keep it in HDFS? Also, this link suggests using HCatalog: http://grokbase.com/t/sqoop/user/143waxddrr/jira-commented-sqoop-1293-hive-import-causes-target-dir-... . Is that really a solution to the problem?