Created 11-01-2016 04:43 PM
Our cluster is secured using Kerberos. Now I need to run hive queries in a shell script which would be scheduled to run periodically.
In my shell script, I was thinking to use the below commands in sequence
>beeline -u"jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal" >beeline -e"SHOW DATABASES"
But then I realised that once I run the beeline -u command, it would take me to the beeline shell instead of being in the bash shell. So wondering how to get this sorted out. I need to use beeline -e command, but need to connect to the cluster first using kerberos principal. Any ideas whats the best way to handle this?
FYI, we are not using Oozie, but shell script with crontab scheduling.
Created 11-01-2016 05:30 PM
[hdfs@ip-172-31-40-160 ~]$ cat b.sh
beeline << EOF
!connect jdbc:hive2://localhost:10000 hive hive
[hdfs@ip-172-31-40-160 ~]$ sh -x b.sh
Beeline version 1.2.1000.2.5.0.0-1245 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000 hive hive
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
Driver: Hive JDBC (version 1.2.1000.2.5.0.0-1245)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000> show tables;
| tab_name |
| edl_good_data |
| td_edl_good_data |
| test |
3 rows selected (0.152 seconds)
0: jdbc:hive2://localhost:10000> Closing: 0: jdbc:hive2://localhost:10000
Created 11-01-2016 11:37 PM
Thanks @Neeraj Sabharwal Just wondering if there is any standard approach to this. Without connecting how does someone use hive -e command. i.e. it looks like hive -u and hive -e commands are mutually exclusive to me.
Created 07-10-2018 11:41 AM
cool ! very helpful
Created 01-17-2019 10:18 AM
Although this is an old thread just thought it will useful for someone, we can use both in same line not sure if the older version of beeline didn't support this considering this is a old thread. Anyways the below syntax works for me.
beeline -u"jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal -e"SHOW DATABASES"
Created 01-17-2019 07:44 PM
I needed to generate a file show all of the partitions for several tables. I ended up using the following with hive.
rm -f tables.txt
rm -f partitions.txt
" >> tables.txt
cat tables.txt |while read LINE
echo -e "$LINE\n" >> partitions.txt
hive -e "show partitions $LINE" >>partitions.txt
echo -e "\n" >> partitions.txt
rm -f tables.txt
echo "Show Partitions generated"
It seemed like hive didn't like something cause it threw a couple of errors at the beginning and end. But, ended up generating the desired output.
I'd prefer to use beeline but couldn't figure out how to integrate the beeline -u connection with the looping through the list of tables.
Can anyone suggest a similar approach using beeline?
Created 01-21-2019 10:37 AM
Created 01-18-2019 02:17 PM
Hive shell has security issues & is deprecated in higher versions of hdp please avoid. Opening a hive/beeline shell for every table it will be slow as it has to spawn a jvm for every table so avoid looping approach. Prepare a file with the table list like below.
show partitions table1;
show partitions table2;
show partitions table3;
Use a -f flag of beeline to pass the above file eg below.
beeline --silent=true --showHeader=false --outputformat=csv2 -u "jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal" -n '<username>' -p '<password>' -f show_partitions_tables.hql > show_partitions_tables.txt
Note: Please accept the answer if it solved your issue.
Created 02-04-2019 03:45 PM
Your suggestion worked. At first, I wasn't sure about how to create the hql file. I ended up just using vi to create it with the "show partitions" statements.
Also, in my environment, I don't need to provide the credentials. I'm not sure why. I just provided the jdbc string for the zk cluster like the following:
beeline --silent=true --showHeader=false --outputformat=csv2 -u "jdbc:hive2://zk1:zkport,zk2:zkport,zk3:zkport/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -f show_partitions_tables.hql > show_partitions_tables.txt
Thanks for your help!
Created 02-06-2019 10:30 AM
@Ken Herring I'm glad it worked, may be your cluster isn't secured. May you please accept the answer if it worked as it will help others.
Created 05-09-2020 01:20 AM
I need a beeline shell script which should connect hive url and print the no. of tables in the db and print the record count of each table in the output file.
Your help will greatly appreciated. I belongs to different technology.