Support Questions

Find answers, ask questions, and share your expertise

how to download hive data into csv format

avatar
Explorer
I have created tables in hive, now i would like to download those tables in csv format,  i have searched online, so i got these below solutions, but i dont understand how to use these commands on cloudera.
1.
hive -e 'select books from table' | sed 's/[[:space:]]\+/,/g' > /home/lvermeer/temp.csv
 
2.
set hive.io.output.fileformat = CSVTextFile;

INSERT OVERWRITE LOCAL DIRECTORY 'dir_path' SELECT FIELD1, FIELD2, FIELD3 FROM TABLE1;

If anyone knows, kindly help me out.

Thanks in advance.
1 ACCEPTED SOLUTION

avatar
Explorer

I have finally found a way to do that

 

insert overwrite local directory 'Desktop/svk1.csv'
    > row format delimited
    > fields terminated by ','
    > select * from emp;

 

then open the directory, just rename the file with .csv extension.

View solution in original post

4 REPLIES 4

avatar
Champion

if you have hive shell or beeline - you can excute the same code nothing different 

or you can try hue web ui to export the hive results as .csv 

althought hue is not that good in downloading big tables . 

avatar
Contributor

You can try this as well in Unix. 

 

beeline - u 'your jdbc connection string' - - outputformat=csv2 -e "your query here" > output.csv

 

Let me know if it helps.

 

Thanks,

Manu

avatar
Explorer

I have finally found a way to do that

 

insert overwrite local directory 'Desktop/svk1.csv'
    > row format delimited
    > fields terminated by ','
    > select * from emp;

 

then open the directory, just rename the file with .csv extension.

avatar
New Contributor

In case you are doing it from Windows you can use Python script hivehoney to extract table data to local CSV file.

It will:

  • Login to bastion host.
  • pbrun.
  • kinit.
  • beeline (with your query).
  • Save echo from beeline to a file on Windows.

Execute it like this:

set PROXY_HOST=your_bastion_host

set SERVICE_USER=you_func_user

set LINUX_USER=your_SOID

set LINUX_PWD=your_pwd

python hh.py --query_file=query.sql