Created 11-01-2016 04:43 PM
Hi,
Our cluster is secured using Kerberos. Now I need to run hive queries in a shell script which would be scheduled to run periodically.
In my shell script, I was thinking to use the below commands in sequence
>beeline -u"jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal" >beeline -e"SHOW DATABASES"
But then I realised that once I run the beeline -u command, it would take me to the beeline shell instead of being in the bash shell. So wondering how to get this sorted out. I need to use beeline -e command, but need to connect to the cluster first using kerberos principal. Any ideas whats the best way to handle this?
FYI, we are not using Oozie, but shell script with crontab scheduling.
Thanks
Created 11-01-2016 05:30 PM
Try this
[hdfs@ip-172-31-40-160 ~]$ cat b.sh
beeline << EOF
!connect jdbc:hive2://localhost:10000 hive hive
show tables
EOF
[hdfs@ip-172-31-40-160 ~]$ sh -x b.sh
+ beeline
Beeline version 1.2.1000.2.5.0.0-1245 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000 hive hive
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 1.2.1000.2.5.0.0-1245)
Driver: Hive JDBC (version 1.2.1000.2.5.0.0-1245)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000> show tables;
+-------------------+--+
| tab_name |
+-------------------+--+
| edl_good_data |
| td_edl_good_data |
| test |
+-------------------+--+
3 rows selected (0.152 seconds)
0: jdbc:hive2://localhost:10000> Closing: 0: jdbc:hive2://localhost:10000
[hdfs@ip-172-31-40-160 ~]$
Created 11-01-2016 11:37 PM
Thanks @Neeraj Sabharwal Just wondering if there is any standard approach to this. Without connecting how does someone use hive -e command. i.e. it looks like hive -u and hive -e commands are mutually exclusive to me.
Created 07-10-2018 11:41 AM
cool ! very helpful
Created 01-17-2019 10:18 AM
Although this is an old thread just thought it will useful for someone, we can use both in same line not sure if the older version of beeline didn't support this considering this is a old thread. Anyways the below syntax works for me.
beeline -u"jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal -e"SHOW DATABASES"
Created 01-17-2019 07:44 PM
Thanks @Naveen
I needed to generate a file show all of the partitions for several tables. I ended up using the following with hive.
#!/home/user/
rm -f tables.txt
rm -f partitions.txt
echo "
dbname.tablename1
dbname.tablename2
dbname.tablename3
dbname.tablename4
" >> tables.txt
wait
cat tables.txt |while read LINE
do
echo -e "$LINE\n" >> partitions.txt
hive -e "show partitions $LINE" >>partitions.txt
echo -e "\n" >> partitions.txt
done
rm -f tables.txt
echo "Show Partitions generated"
It seemed like hive didn't like something cause it threw a couple of errors at the beginning and end. But, ended up generating the desired output.
I'd prefer to use beeline but couldn't figure out how to integrate the beeline -u connection with the looping through the list of tables.
Can anyone suggest a similar approach using beeline?
Created 01-21-2019 10:37 AM
@Ken Herring Did you try the above suggested approach, let me know if you have issues.
Created 01-18-2019 02:17 PM
Hive shell has security issues & is deprecated in higher versions of hdp please avoid. Opening a hive/beeline shell for every table it will be slow as it has to spawn a jvm for every table so avoid looping approach. Prepare a file with the table list like below.
cat show_partitions_tables.hql
show partitions table1;
show partitions table2;
show partitions table3;
Use a -f flag of beeline to pass the above file eg below.
beeline --silent=true --showHeader=false --outputformat=csv2 -u "jdbc:hive2://$hive_server2:10000/$hive_db;principal=$user_principal" -n '<username>' -p '<password>' -f show_partitions_tables.hql > show_partitions_tables.txt
Note: Please accept the answer if it solved your issue.
Created 02-04-2019 03:45 PM
Thanks @Naveen
Your suggestion worked. At first, I wasn't sure about how to create the hql file. I ended up just using vi to create it with the "show partitions" statements.
Also, in my environment, I don't need to provide the credentials. I'm not sure why. I just provided the jdbc string for the zk cluster like the following:
beeline --silent=true --showHeader=false --outputformat=csv2 -u "jdbc:hive2://zk1:zkport,zk2:zkport,zk3:zkport/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" -f show_partitions_tables.hql > show_partitions_tables.txt
Thanks for your help!
Created 02-06-2019 10:30 AM
@Ken Herring I'm glad it worked, may be your cluster isn't secured. May you please accept the answer if it worked as it will help others.