Unable to successfully launch beeline script from linux shell script with nohup

Contributor

I am trying to run a beeline script that is called from a Linux shell script launched with nohup. When I do this, the script "hangs" until I issue a Linux kill command for the process associated with the script. Can someone help me get this to work so that I can launch the script in the background and have it finish without having to issue the kill command?

#Code for beeline script simple_query.hql
----------------------------------------------
use hdp_ground;
select scan_create_dt_part from offload_scan_detail_orc where scan_create_dt_part = '20171001' limit 5; 
----------------------------------------------
#Code for Linux shell script run_beeline_hql.sh
----------------------------------------------
#run environment script
THISFILE='run_beeline_hql'
EXT1=$(date +%y%m%d)
EXT2=$(date +%H%M%S)
. $(dirname $0)/srv.env
exec > $output_dir/${THISFILE}_$EXT1.$EXT2.log 2>&1
chmod 666 $output_dir/${THISFILE}_$EXT1.$EXT2.log
beeline -f simple_query.hql
exit
----------------------------------------------
What I type to launch the Linux script:
----------------------------------------------
nohup ./run_beeline_hql.sh &
----------------------------------------------
What I see on my screen:
nohup ./run_beeline_hql.sh &
[1]     58486
[xxxx/home/xxxx/xxx/xxx]$ nohup: ignoring input and appending output to `nohup.out'
[1] + Stopped (SIGTTOU)        nohup ./run_beeline_hql.sh &
--------------------------
when I do ps -ef | grep 58486 it appears the script is still running and never finishes:
--------------------------------------------------------
[xxx/home/xxx]$ ps -ef | grep 58486
xxx 45124 31863  0 17:35 pts/0    00:00:00 grep 58486
xxx 58486 31863  0 17:13 pts/0    00:00:00 /bin/sh ./run_beeline_hql.sh
xxx 58493 58486  0 17:13 pts/0    00:00:00 /opt/java/hotspot/7/64_bit/jdk1.7.0_79/bin/java -Xmx12288m -Dhdp.version=2.4.2.0-258 -Djava.net.preferIPv4Stack=true -Dhdp.version=2.4.2.0-258 -Dhadoop.log.dir=/var/hadoop/log/hadoop/f5057708 -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.4.2.0-258/hadoop -Dhadoop.id.str=f5057708 -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/hdp/2.4.2.0-258/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.4.2.0-258/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx12288m -XX:MaxPermSize=512m -Dlog4j.configuration=beeline-log4j.properties -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/hdp/2.4.2.0-258/hive/lib/hive-beeline-1.2.1000.2.4.2.0-258.jar org.apache.hive.beeline.BeeLine -u jdbc:hive2://xxx;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 -f simple_query.hql
-------------------------------
The output of the script looks like this:
-----------------------------------------
WARNING: Use "yarn jar" to launch YARN applications.
-----------------------------------------
when I issue the kill command for the process:
kill 58486
the script finishes and the output gets written
----------------------------------------------
Connecting to jdbc:hive2://xxx;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Connected to: Apache Hive (version 1.2.1000.2.4.2.0-258)
Driver: Hive JDBC (version 1.2.1000.2.4.2.0-258)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://xxx,> use hdp_ground;
No rows affected (0.062 seconds)
0: jdbc:hive2://xxx,> select scan_create_dt_part from offload_scan_detail_orc where scan_create_dt_part = '20171001' limit 5;
+----------------------+--+
| scan_create_dt_part  |
+----------------------+--+
| 20171001             |
| 20171001             |
| 20171001             |
| 20171001             |
| 20171001             |
+----------------------+--+
5 rows selected (0.127 seconds)
0: jdbc:hive2://xxx,> 
Closing: 0: jdbc:hive2://xxx/xxx;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2

8 REPLIES

Super Collaborator

I also tried that once and it didn't seem to work for some reason.

Please try using the 'screen' utility.

https://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/

Press Ctrl+a and then c to create a new window, run your script there without nohup (that is, ./run_beeline_hql.sh), and detach from the session by pressing Ctrl+a and then d.

The process will keep running in the background which you can check by ps.
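
If the keystrokes need to be avoided, for example in an automated job, screen can also start a command in an already detached session with its -d -m options. The following is only a sketch: the function name start_detached and the session name beeline_job are made up for illustration, and the session closes as soon as the command exits.

```shell
#!/bin/sh
# Sketch: start a command inside a detached screen session, no keystrokes needed.
# start_detached is a hypothetical helper, not part of screen itself.
start_detached() {
    session=$1
    shift
    if ! command -v screen >/dev/null 2>&1; then
        echo "screen is not installed" >&2
        return 1
    fi
    # -d -m: create the session detached; -S: give it a name
    screen -dmS "$session" "$@"
}

# Usage (not executed here):
#   start_detached beeline_job ./run_beeline_hql.sh
#   screen -ls               # list running sessions
#   screen -r beeline_job    # reattach while the job is still running
```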

Expert Contributor

Great tool. Thanks for sharing.

If you want to automate the script, though, I think nohup is the only way to go.

Expert Contributor

The answer to your problem is simple. Before coming to the solution, let me break my answer into steps.

I think you want the output of your code in one file, and you change the permissions of that log file because another script or some automation needs to read it. So I assume that is why you are using the exec command.

Let's try to understand why your script appears to misbehave. In fact it is not misbehaving; it is doing exactly what the commands in your script tell it to do.

exec runs a command in place of the current shell, reusing the shell's PID instead of spawning a new one. When it is given only redirections and no command, as below, it applies those redirections to the current shell itself. If you run this in an interactive Linux shell, the shell seems to stop responding because its output no longer reaches the terminal, but it is not actually hung.

 exec >1.log 2>&1

We have closed the door ourselves by pointing standard output and standard error at a log file. When this command runs inside your script, the script and every child process it starts, including beeline, write to that log file in the same way.

Now, let's understand how nohup and background processes work. When you execute your script using
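
A small, self-contained way to see this effect is sketched below. It uses a temporary file in place of the real log (the variable name log is only for the demo) and runs the exec inside a subshell so the rest of the session keeps its terminal.

```shell
#!/bin/sh
# Sketch: exec with only redirections rewires the current shell's own
# stdout and stderr; every later command inherits them.
log=$(mktemp)                # temporary stand-in for the real log file

(
    exec > "$log" 2>&1       # from here on, output goes to $log
    echo "to stdout"
    echo "to stderr" >&2
)

cat "$log"                   # both lines ended up in the file
```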

nohup ./x.sh &

nohup checks whether standard output and standard error are attached to a terminal and, if they are, points your script's output to "nohup.out", which is expected (see the nohup man page for details). That is why you get the lines below on your terminal:

nohup ./run_beeline_hql.sh &
[1]     58486
[xxxx/home/xxxx/xxx/xxx]$ nohup: ignoring input and appending output to `nohup.out'
[1] + Stopped (SIGTTOU)        nohup ./run_beeline_hql.sh &
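
SIGTTOU is the signal the kernel sends to a background job that tries to write to, or change the settings of, its controlling terminal, and its default action is to stop the job. A stopped process shows state T in ps. The sketch below reproduces only the stopped state, using SIGSTOP on a sleep process as a stand-in for what SIGTTOU did to the job above; it assumes a Linux ps that accepts -o stat= and -p.

```shell
#!/bin/sh
# Sketch: show that a stopped process reports state "T" in ps.
# SIGSTOP is used here only to simulate the effect of SIGTTOU.
sleep 30 &
pid=$!

kill -STOP "$pid"            # stop the background process
sleep 1                      # give the state a moment to settle

state=$(ps -o stat= -p "$pid" | tr -d ' ')
echo "state of $pid: $state" # a leading T means the process is stopped

kill -CONT "$pid"            # resume it ...
kill "$pid"                  # ... and clean up
wait "$pid" 2>/dev/null || true
```

Running the same ps command against the PID of the stuck script should tell you whether it is really running or just stopped.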

Now we know that when your script starts running it goes into that stopped state and does nothing more until you send a kill command, at which point the process completes its task and exits. To get around this, use the code below, which achieves the same result with very few changes:

#!/bin/sh

# Declare variables
THISFILE='run_beeline_hql'
EXT1=$(date +%y%m%d)
EXT2=$(date +%H%M%S)
. "$(dirname "$0")/srv.env"
# Full path of the log file ($output_dir comes from srv.env)
LOGFILE="$output_dir/${THISFILE}_$EXT1.$EXT2.log"

# Close STDOUT file descriptor
exec 1<&-
# Close STDERR file descriptor
exec 2<&-
# Open STDOUT on the log file for reading and writing
exec 1<>"$LOGFILE"

# Redirect STDERR to STDOUT
exec 2>&1

chmod 666 "$LOGFILE"
beeline -f simple_query.hql
exit


If you also want to avoid the nohup message, running the script like this would be a better idea:

nohup ./run_beeline_hql.sh 1>Temp.log 2>&1 &

Do let me know whether this worked.

Thanks,

SKS

Contributor

I should also add:

In order to get the script to end I have to issue the Linux kill command.

Once I kill the pid, the script completes and finishes writing its output.

Before issuing the kill command this is what I see as output:

WARNING: Use "yarn jar" to launch YARN applications.

After issuing the kill command the process completes and writes its output:

Connecting to jdbc:hive2://xxx;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
Connected to: Apache Hive (version 1.2.1000.2.4.2.0-258)
Driver: Hive JDBC (version 1.2.1000.2.4.2.0-258)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://xxx,> use hdp_ground;
No rows affected (0.071 seconds)
0: jdbc:hive2://xxx,> select scan_create_dt_part from offload_scan_detail_orc where scan_create_dt_part = '20171001' limit 5;
+----------------------+--+
| scan_create_dt_part  |
+----------------------+--+
| 20171001             |
| 20171001             |
| 20171001             |
| 20171001             |
| 20171001             |
+----------------------+--+
5 rows selected (0.135 seconds)
0: jdbc:hive2://xxx,> 
Closing: 0: jdbc:hive2://xxx;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2



Expert Contributor

@Carol Elliott Did you try running the script with the modification I have mentioned?

New Contributor

Is it possible to run shell commands from the Beeline shell, the way we can from the Hive shell with !ls or !cat? When I try !ls in Beeline it does not work and reports an undefined or unnamed syntax, but it works fine in Hive. Could you please help me with this? @Carol Elliott @Rahul Pathak

Rising Star

@carol elliott

You need to set the client option for an unsupported terminal before launching via nohup:

export HADOOP_CLIENT_OPTS="-Djline.terminal=jline.UnsupportedTerminal"

nohup beeline -f foo.sql -u ${jdbcurl} >> nohup_beelineoutput.out &
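
Applied to the original script, that suggestion might look like the sketch below. The HADOOP_CLIENT_OPTS value is taken from the lines above; redirecting stdin from /dev/null and the log file name are extra assumptions of mine, meant to leave beeline no reason to touch the terminal. The beeline call is guarded so the sketch does nothing on a machine without beeline.

```shell
#!/bin/sh
# Sketch: launch beeline in the background without letting it use the terminal.
# Option value taken from the reply above: make jline treat the terminal as unsupported.
HADOOP_CLIENT_OPTS="-Djline.terminal=jline.UnsupportedTerminal"
export HADOOP_CLIENT_OPTS

# Hypothetical log file name, modelled on the original script's naming.
logfile="run_beeline_hql_$(date +%y%m%d.%H%M%S).log"

if command -v beeline >/dev/null 2>&1; then
    # stdin from /dev/null (assumption), stdout and stderr to the log file
    nohup beeline -f simple_query.hql </dev/null >"$logfile" 2>&1 &
    echo "started beeline, pid $!, log $logfile"
else
    echo "beeline not found on PATH; nothing started" >&2
fi
```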