How to use Sqoop to export data from Hive to a flat file to be picked up by Hive again?

Contributor

Can anyone let me know how to use Sqoop to export data from Hive to a flat file to be picked up by Hive again?

I understand that Sqoop can export data to tables; however, I have a scenario where I want to export the data to flat files.

Master Mentor

Unless you're using ORC, the files Hive uses are stored in their raw form. You can just browse to the /apps/hive/warehouse directory and look at them. There is no need to use Sqoop.

Contributor

Thanks for the reply. I'm using the ORC format.

Master Mentor

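@vijaya inturi You can also read ORC and write it out as a flat file using Pig:

A = LOAD 'student.orc' USING OrcStorage();

STORE A INTO 'file' USING PigStorage([options]);

Here [options] is a placeholder for the PigStorage arguments, such as the field delimiter (for example ',').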

Master Mentor
@vijaya inturi

You can use this to write the contents of an existing table to a directory in HDFS:

INSERT OVERWRITE DIRECTORY '/path/to/output/dir' SELECT * FROM <tablename>;
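
As a rough sketch of how the statement above could be turned into a flat-file export: without a ROW FORMAT clause, Hive writes the output as text with Ctrl-A as the column separator, so adding a delimiter clause (available in Hive 0.11 and later) gives a more conventional delimited file. The function names, the comma delimiter, and the example paths below are assumptions for illustration, not something from this thread.

```shell
# Sketch: export a Hive table (ORC or otherwise) as comma-delimited text.
# Function names and the comma delimiter are illustrative assumptions.

# Print the HiveQL statement for exporting one table to an HDFS directory.
build_export_query() {
  local table="$1"
  local outdir="$2"
  printf "INSERT OVERWRITE DIRECTORY '%s'\n" "$outdir"
  printf "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
  printf "STORED AS TEXTFILE\n"
  printf "SELECT * FROM %s;\n" "$table"
}

# Run the export through the Hive CLI.
export_table_as_text() {
  hive -e "$(build_export_query "$1" "$2")"
}

# Usage (this overwrites everything under the target directory):
#   export_table_as_text student /tmp/export/student
```

Keeping the query builder separate from the call to hive makes it easy to print and inspect the statement before running it. Writing INSERT OVERWRITE LOCAL DIRECTORY instead would place the files on the local filesystem rather than in HDFS.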

Contributor

Thanks Neeraj. Is there a command to export all the data from a Hive database into flat files?
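
I'm not aware of a single built-in command for this, but one possible approach is to loop over the output of SHOW TABLES and export each table in turn. The sketch below assumes a comma delimiter and one output directory per table; the function names, the database name, and the paths are hypothetical.

```shell
# Sketch: export every table in one Hive database as comma-delimited text.
# Function names, delimiter, and directory layout are illustrative assumptions.

# Print one table name per line for the given database.
list_tables() {
  local db="$1"
  hive -S -e "USE ${db}; SHOW TABLES;"
}

# Export each table to its own directory under base_dir (in HDFS).
export_database() {
  local db="$1"
  local base_dir="$2"
  local table
  list_tables "$db" | while IFS= read -r table; do
    [ -n "$table" ] || continue
    # Redirect stdin so hive cannot consume the remaining table names.
    hive -e "INSERT OVERWRITE DIRECTORY '${base_dir}/${table}'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM ${db}.${table};" </dev/null
  done
}

# Usage (overwrites each per-table directory):
#   export_database mydb /tmp/export
```

Each table costs a separate Hive invocation here, which is slow for large databases; writing all statements to one script file and running it with hive -f would be a reasonable alternative.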

Contributor (accepted solution)

Thanks Artem and Neeraj. I just copied the data from /apps/hive/warehouse to a local directory.

Note: Hive data is not necessarily located in the folder mentioned above; for example, an external table can be defined anywhere within HDFS. To find out where a table is stored, run:

hive -e "DESCRIBE FORMATTED <tablename>" > <tablename_descformat>.txt

This command collects detailed information about a table, including its columns, partitions, database, owner, creation time, last access time, protect mode, location, table type, and storage information.
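
For anyone who wants to script this, here is a minimal sketch that pulls the Location row out of the DESCRIBE FORMATTED output and copies that directory to the local filesystem. The function names are hypothetical, and it assumes the output contains a row whose first field is "Location:" followed by the HDFS URI.

```shell
# Sketch: locate a Hive table's files and copy them out of HDFS.
# Function names are illustrative assumptions.

# Print the value of the "Location:" row from DESCRIBE FORMATTED.
table_location() {
  local table="$1"
  hive -S -e "DESCRIBE FORMATTED ${table}" \
    | awk '$1 == "Location:" { print $2; exit }'
}

# Copy the table's directory from HDFS into a local directory.
copy_table_to_local() {
  local table="$1"
  local dest="$2"
  local location
  location="$(table_location "$table")"
  if [ -z "$location" ]; then
    echo "could not determine location of ${table}" >&2
    return 1
  fi
  mkdir -p "$dest"
  hdfs dfs -get "$location" "$dest"
}

# Usage:
#   copy_table_to_local student /tmp/hive_copy
```

Keep in mind that this copies the files in whatever format the table uses, so for an ORC table the local copies are still ORC files rather than flat text.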

Master Mentor

Yes, correct. But if the table is already external, I don't see the point of using Sqoop to get data out of Hive, as you can just work with those files directly.

Master Mentor

Also, as a good practice, since you achieved the intended result, please accept whichever answer you see fit to close out the thread.