Support Questions
Find answers, ask questions, and share your expertise
Announcements
Check out our newest addition to the community, the Cloudera Innovation Accelerator group hub.

How to run all of hive queries in a file in one go

Expert Contributor

I have around 15 queries saved in a HQL file. The queries include CREATE TABLE/DROP TABLE and SELECT statement at the end of the file.

It seems like I can not execute all of these in one go.

How do I syntactically format or write hive queries so that hive can understand that we have the next query starting after end of one query .It should then finally should give results of select query.

1 ACCEPTED SOLUTION

Contributor

Hi @Simran Kaur,

You can run the following command to execute a HQL file in hive:

hive -f filename.hql

Be sure each of your queries in your HQL are terminated with a semi-colon ;

Here's the help output of hive too:

$ hive -H
WARNING: Use "yarn jar" to launch YARN applications.
hivusage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

And here is great documentation for all hive cli options: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli

View solution in original post

5 REPLIES 5

Contributor

Hi @Simran Kaur,

You can run the following command to execute a HQL file in hive:

hive -f filename.hql

Be sure each of your queries in your HQL are terminated with a semi-colon ;

Here's the help output of hive too:

$ hive -H
WARNING: Use "yarn jar" to launch YARN applications.
hivusage: hive
 -d,--define <key=value>          Variable subsitution to apply to hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

And here is great documentation for all hive cli options: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli

Hi @Simran Kaur. I do not know of issues where hive will run a stream of queries out-of-order. If you paste in text or run a file it should execute in-order.

If you have 10 queries in a row and the 7th one fails - the Hive execution ends immediately.

If you are looking to add more flow control, error trapping, etc. to your 15 queries, you may want to look into HPL/SQL. It is an add-on to HDP 2.4, but will be included in the upcoming HDP 2.5 release. It lets you use if/then/else, stored procedures, local variables, etc. in your HQL file.

Check it out at http://www.hplsql.org/start

Syntax Reference: http://www.hplsql.org/doc

Rising Star

Alternatively, you can use beeline instead of the Hive CLI. Here is an example using beeline running a file with a parameter:

beeline -u "jdbc:hive2://master01:2181,master02:2181,master03:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -f file.hql --hivevar HDFSDIR=/tmp/folder

Contents of file.hql:

USE myhivedb;
-- a comment
LOAD DATA INPATH '${HDFSDIR}/browser.tsv' OVERWRITE INTO TABLE browser;
-- other queries

@Kit Menke - Hey Kit , I have requirement where the users need to execute a query using beeline from hdfs.I tried your approach however i have tried several versions of it and the outcome unfortunately contradicts your posts.Can beeline access hdfs uri?? It would be great help if you could share your thoughts on this.

beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx  -p ******* -f "City.sql" --verbose true --hivevar HDFSDIR="hdfs://namenode1.dc.corp.astro.com:8020/user/xxx"

############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################
Option 2: 
beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx  -p ******* -f "hdfs://namenode1.dc.corp.astro.com:8020/user/xxx/City.sql" --verbose true  


############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################



Option 3:

beeline -u "jdbc:hive2://namenode2.dc.corp.astro.com:2181,namenode1.dc.corp.astro.com:2181,namenode3.dc.corp.astro.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNameSpace=hiveserver2sockettimeout=600000;tcpKeepAlive=true" -n xxx  -p ******* -f "hdfs://user/xxx/City.sql" --verbose true  



############ OUTPUT ##########################
Connected to: Apache Hive (version 1.2.1000.2.6.0.3-8)
Driver: Hive JDBC (version 1.2.1000.2.6.0.3-8)
Transaction isolation: TRANSACTION_REPEATABLE_READ
City.sql (No such file or directory)
##############################################

New Contributor

Hi Team,

By Following command you can execute the hql file using beeline command and take the execution output to text file

beeline -u "jdbc:hive2://master:10000/;principal=hive/master@DOMAIN.NET" 
        --incremental=true 
        --showHeader=true  
        --verbose =false  
        --outputformat=dsv  
        --delimiterForDSV="~"  
        -f /home/userfoldername/testquery.hql>file_correspond99.txt

Thanks & Regards,

Kamleshkumar Gujarathi