Created 04-04-2016 11:54 AM
Hi,
I am trying to dump the output of a beeline query (below) to a file but it prints all the logs along with the output in the file.
beeline -u jdbc:hive2://somehost_ip/ -f hive.hql >op.txt
Here is the output
0: jdbc:hive2://10.211.1.5:10000/> use db;
0: jdbc:hive2://10.211.1.5:10000/> select count(*) from sample_table;
+------+--+
| _c0  |
+------+--+
| 131  |
+------+--+
0: jdbc:hive2://10.211.1.5:10000/>
0: jdbc:hive2://10.211.1.5:10000/>
0: jdbc:hive2://10.211.1.5:10000/>
Can someone let me know how to get only the query output in the file, as we get when using Hive?
Created 04-04-2016 01:51 PM
You have a few options.
1) You can control, up to a point, how beeline formats its output, and then just redirect it to a file in Linux. For example:
beeline --outputformat=csv2 xxx > output.csv (see the relevant parameters from the beeline help below)
2) For more control and better performance I wrote a little Java tool once. It's really only a couple of lines of JDBC code.
3) And finally, as Ana wrote, you can just write the query result into an external table in HDFS and specify the output format you want.
Like
create external table test ROW FORMAT delimited fields terminated by '|' location "/tmp/myfolder" as select * from mytable;
You can then get that output on the local file system with:
hadoop fs -getmerge /tmp/myfolder myoutput.csv
--showHeader=[true/false]        show column names in query results
--headerInterval=ROWS;           the interval between which headers are displayed
--fastConnect=[true/false]       skip building table/column list for tab-completion
--autoCommit=[true/false]        enable/disable automatic transaction commit
--verbose=[true/false]           show verbose error messages and debug info
--showWarnings=[true/false]      display connection warnings
--showNestedErrs=[true/false]    display nested errors
--numberFormat=[pattern]         format numbers using DecimalFormat pattern
--force=[true/false]             continue running script even after errors
--maxWidth=MAXWIDTH              the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH     the maximum width to use when displaying columns
--silent=[true/false]            be more silent
--autosave=[true/false]          automatically save preferences
--outputformat=[table/vertical/csv2/tsv2/dsv/csv/tsv]  format mode for result display
                                 Note that csv and tsv are deprecated - use csv2, tsv2 instead
Created 05-08-2017 07:42 AM
Hi 苏 斌,
I have not tried INSERT OVERWRITE DIRECTORY since I switched to beeline. Try this command:
beeline -u jdbc:hive2://somehost_ip/ --silent=true --outputformat=csv2 -f hive.hql >op.csv
Created 10-20-2018 01:51 AM
Hi Team,
I am trying to print the query output along with the "number of rows selected" / "number of rows affected" messages in the output file, but I am unable to capture them.
I have tried with --verbose=true as well, but it does not help.
beeline -u "jdbc:hive2://abc.nsroot.net:10000/;principal=hive/abc.nsroot.net@ABC.NSROOT.NET;ssl=true " -f /home/test1.hql > testresult1.txt
The .hql file has only one query: "select * from mytable;"
Can anyone suggest something for this?
Thanks & regards,
Kamleshkumar Gujarathi
Created 04-13-2019 11:55 AM
@Benjamin Leonhardi I am facing a problem while fetching output in csv2 format. Since my data itself contains commas, that specific column gets encapsulated in the special character "^@". Can you please help me avoid this, or get the output in a pipe-delimited form?
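For reference, something like the dsv output format (listed in the beeline help above) might give the pipe-delimited output I am after, assuming the --delimiterForDSV option is available in this beeline version; the connection string below is just the placeholder from earlier in the thread:
beeline -u jdbc:hive2://somehost_ip/ --silent=true --outputformat=dsv --delimiterForDSV='|' -f hive.hql > op.txt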
Created 10-22-2019 01:21 PM
Also, just wanted to point out that, depending on the version of Hive being used, beeline may default to buffering the entire result set within the client. Be sure to enable 'incremental' fetches of data from the Hive server when dealing with large result sets.
--incremental=[true/false]
Defaults to true from Hive 2.3 onwards, before it defaulted to false. When set to false, the entire result set is fetched and buffered before being displayed, yielding optimal display column sizing. When set to true, result rows are displayed immediately as they are fetched, yielding lower latency and memory usage at the price of extra display column padding. Setting --incremental=true is recommended if you encounter an OutOfMemoryException.
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients
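A hedged example of combining incremental fetches with the redirection discussed earlier in the thread (the connection string and file names are just placeholders):
beeline -u jdbc:hive2://somehost_ip/ --incremental=true --silent=true --outputformat=csv2 -f hive.hql > op.csv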
Created 10-22-2019 01:42 PM
Another option for large data sets, if ordering doesn't matter, is to create an EXTERNAL table with the necessary delimiters and issue an INSERT statement into the table instead of a SELECT statement in beeline. To copy the data locally, issue a:
hdfs dfs -cat /my/table/*
"Order doesn't matter' because the cat application will not necessarily read the files in proper order. If an ORDER BY is included in the query, the contents of each file will be in order, but the files may be read out of order by the 'cat' application.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/FileSystemShell.html#cat
Created 04-28-2020 07:22 AM
Please try this; it will write only the standard output to the file, not the errors or logs:
beeline -u jdbc:hive2://somehost_ip/ -f hive.hql 2> /dev/null >op.txt
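If you would rather keep the logs for troubleshooting than discard them, the same idea works with the errors redirected to a separate file instead of /dev/null:
beeline -u jdbc:hive2://somehost_ip/ -f hive.hql >op.txt 2>beeline.log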
Created 04-28-2020 07:28 AM
Please check the command below; here 2> /dev/null will consume all the logs and errors, and only the standard output will be written to the file:
beeline -u jdbc:hive2://somehost_ip/ -f hive.hql 2> /dev/null >op.txt
If you like this, please give me kudos. Thanks!