Support Questions
Find answers, ask questions, and share your expertise

How to use Sqoop to export data from Hive to a flat file to be picked up by Hive again?

Can anyone let me know how to use Sqoop to export data from Hive to a flat file to be picked up by Hive again?

I understand that Sqoop can export data to tables; however, I have a scenario where I want to export the data into flat files.

1 ACCEPTED SOLUTION

Accepted Solutions

Thanks Artem and Neeraj. I just copied the data from /apps/hive/warehouse to a local directory.

Note: Hive data is not necessarily located in the folder mentioned above; for example, an external table can be defined anywhere within HDFS. You can find a table's actual location with:

hive -e "DESCRIBE FORMATTED <tablename>" > <tablename_descformat>.txt;

This command collects detailed information about a table, including its columns, partitions, the database it belongs to, owner, create time, last access time, protect mode, location, table type, storage information, etc.
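The copy step above can be sketched as a short shell session; `sales_db.orders` and the local target path are hypothetical names used for illustration, and the commands assume a running Hadoop cluster with the `hive` and `hdfs` clients on the PATH:

```shell
# Find the table's HDFS location (look for the "Location:" field)
hive -e "DESCRIBE FORMATTED sales_db.orders" > orders_descformat.txt
grep "Location:" orders_descformat.txt

# Copy the table's files from that HDFS location to a local directory
hdfs dfs -get /apps/hive/warehouse/sales_db.db/orders /tmp/orders_export
```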


8 REPLIES 8

Mentor

Unless you're using ORC, the files Hive uses are already in their raw form. You can just browse the /apps/hive/warehouse directory and look at them; no need to use Sqoop.

Thanks for the reply. I'm using the ORC format.

Mentor

@vijaya inturi you can also read orc and output to flat file using pig

A = LOAD 'student.orc' USING OrcStorage();

STORE A INTO 'file' USING PigStorage([options]);
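Filled out slightly, the Pig approach might look like the script below; the warehouse path, output path, and delimiter are assumptions for illustration:

```pig
-- Read the ORC files backing the Hive table
A = LOAD '/apps/hive/warehouse/student' USING OrcStorage();

-- Write the rows out as comma-delimited flat files
STORE A INTO '/tmp/student_flat' USING PigStorage(',');
```

Save it as a script (e.g. `orc_to_flat.pig`) and run it on the cluster with `pig orc_to_flat.pig`.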

@vijaya inturi

You can use this to write into HDFS from existing table

INSERT OVERWRITE DIRECTORY '/path/to/output/dir' SELECT * FROM table;
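Note that by default `INSERT OVERWRITE DIRECTORY` writes fields separated by the ^A (\001) control character. On Hive 0.11 and later you can request a friendlier delimiter; a sketch with hypothetical paths and table names:

```sql
INSERT OVERWRITE DIRECTORY '/tmp/orders_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
SELECT * FROM sales_db.orders;
```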

Thanks Neeraj. Is there any command to export all the data from the Hive database into a flat file?
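There is no single built-in command for a whole database, but a small shell loop over `SHOW TABLES` gets close; the database name `mydb`, the output directory, and the `hive` CLI being on the PATH are all assumptions in this sketch:

```shell
#!/bin/sh
DB=mydb
OUT=/tmp/${DB}_export
mkdir -p "$OUT"

# Dump each table in the database as tab-separated text
# (the hive CLI prints query results tab-delimited by default;
#  -S suppresses the log noise)
for T in $(hive -S -e "SHOW TABLES IN $DB"); do
  hive -S -e "SELECT * FROM $DB.$T" > "$OUT/$T.tsv"
done
```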


Mentor

Yes, correct. But if the table is already external, then I don't see the point in using Sqoop to get the data out of Hive, since you can just work with those files directly.

Mentor

Also, as a good practice, since you achieved the intended result, please accept one of the answers you see fit to close out the thread.