Running a Sqoop script in the background appears to 'hang' but works when running in the foreground
Labels: Apache Sqoop
Created ‎11-03-2015 09:12 PM
Consider the following shell script for Sqoop:
#!/bin/sh
connection_string="jdbc:sqlserver://remoteserver.somehwere.location-asp.com:1433;database=idistrict"
user_name="OPSUser"
db_password="OPSReader"
sqoop_cmd="list-databases"
sqoop "$sqoop_cmd" --connect "$connection_string" --username "$user_name" --password "$db_password"
We can run this just fine in the foreground, i.e.:
./sqoop_test.sh
But running it in the background like so:
./sqoop_test.sh &
The script appears to 'hang' when kicking off the actual sqoop command, i.e. nothing happens at all.
Tracing with -x on the #!/bin/sh line shows that execution reaches the last line of the script (the sqoop command) and then nothing further happens.
We have tried several variations, such as:
nohup bash sqoop.sh > results.txt 2>&1 &
./sqoop.sh &> /dev/null &
We also switched the shebang line to #!/bin/bash.
Any ideas? The odd thing is that the exact same script works fine both foregrounded and backgrounded on a different cluster. /etc/profile and .bash_profile don't appear to differ in any major way between the two.
Created ‎11-09-2015 04:29 PM
@Alex Miller I was able to reproduce this. I personally had luck with screen, and also with placing the "&" inside my test script at the end of the sqoop command rather than backgrounding the script at invocation time (i.e. ./sqoop.sh &).
Redirecting the output to /dev/null (./sqoop.sh &> /dev/null &) also worked for me with Accumulo in place.
The customer had apparently gone ahead and removed the Accumulo bits before they had a chance to test my suggestions, since they weren't using it any further anyway.
So I really think there isn't a bug and we are hitting some bash-isms here more than anything else.
Thanks, all, for the tips.
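The two workarounds reported above (backgrounding the sqoop command inside the script, and keeping its output away from the terminal) can be combined in a small helper. This is only a sketch under stated assumptions: run_detached, LOG_FILE and LAST_PID are hypothetical names, and redirecting stdin from /dev/null is an extra precaution that goes beyond the stdout/stderr redirect actually tested in this thread.

```shell
#!/bin/sh
# Sketch only: run_detached, LOG_FILE and LAST_PID are illustrative names,
# not part of Sqoop or of the original script.
LOG_FILE="${LOG_FILE:-sqoop_test.log}"

run_detached() {
    # Background the command itself (the "&" lives inside the script).
    # stdin comes from /dev/null and stdout/stderr go to a log file,
    # so the job's standard streams are not attached to the terminal.
    "$@" < /dev/null > "$LOG_FILE" 2>&1 &
    LAST_PID=$!
}

# Example call, reusing the variables from the original script:
# run_detached sqoop "$sqoop_cmd" --connect "$connection_string" \
#     --username "$user_name" --password "$db_password"
# wait "$LAST_PID"
```

With this shape the script can be started normally (./sqoop_test.sh) and returns right away, while the Sqoop output ends up in the log file.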
Created ‎11-03-2015 09:25 PM
Is this cluster set up with queues?
Can you check whether sqoop is waiting on other jobs to finish?
Created ‎11-03-2015 09:41 PM
@Neeraj - No other jobs are waiting to finish, and we can run this pretty much at will in the foreground without it seemingly getting stuck.
Created ‎11-03-2015 09:54 PM
Add "`whoami`" and "`hostname`" to the script and see what prints out. I'd also add "2>&1 | tee -a log", i.e. redirect the console output to a file, so you can compare the output in the foreground and in the background. It should give you some insight into what's happening. What specifically is the reason for running it in the background, @Kent Baxley?
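As a rough sketch of the suggestion above, the diagnostics and the tee redirect could be wrapped like this. The names run_logged and DEBUG_LOG are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sketch only: run_logged and DEBUG_LOG are illustrative names.
DEBUG_LOG="${DEBUG_LOG:-sqoop_debug.log}"

run_logged() {
    {
        # Record who is running the job and on which host.
        echo "whoami: $(whoami)"
        echo "hostname: $(hostname)"
        # Run the real command passed as arguments.
        "$@"
    } 2>&1 | tee -a "$DEBUG_LOG"   # merge stderr into stdout, show it and append it to the log
}

# Example: run_logged sqoop list-databases --connect "$connection_string" ...
```

Note that the exit status of the pipeline is that of tee, not of the wrapped command, so this is meant for debugging rather than for error handling.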
Created ‎11-03-2015 09:56 PM
Check the limits for the user in the background and in the foreground. There may be an OS limit on background processes.
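One way to do that comparison, as a minimal sketch: capture the output of ulimit -a once in the foreground and once from a background subshell, then compare the two. The function name compare_limits is hypothetical. Since limits are normally inherited from the parent shell, the two outputs may well turn out identical, which would rule this cause out.

```shell
#!/bin/sh
# Sketch only: compare_limits is an illustrative helper name.
compare_limits() {
    fg_file="$(mktemp)"
    bg_file="$(mktemp)"
    ulimit -a > "$fg_file"          # limits as seen in the foreground
    ( ulimit -a > "$bg_file" ) &    # limits as seen by a background subshell
    wait $!
    if cmp -s "$fg_file" "$bg_file"; then
        echo "limits identical"
    else
        diff "$fg_file" "$bg_file"  # show which limits differ
    fi
    rm -f "$fg_file" "$bg_file"
}

# Example: run compare_limits as the user that launches the Sqoop script.
```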
Created ‎11-03-2015 10:34 PM
I would recommend using screen. It gives you all of the benefits of a background job with all of the benefits of a job running in the foreground as well: https://wiki.archlinux.org/index.php/GNU_Screen
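A minimal sketch of what that could look like, assuming GNU Screen is installed on the node. The helper name start_in_screen and the session name sqoop_job are hypothetical.

```shell
#!/bin/sh
# Sketch only: start_in_screen and the session name are illustrative.
start_in_screen() {
    session="$1"
    shift
    if ! command -v screen > /dev/null 2>&1; then
        echo "screen is not installed" >&2
        return 127
    fi
    # -d -m: start the session detached; -S: name it so it can be reattached.
    screen -dmS "$session" "$@"
}

# Example: start_in_screen sqoop_job ./sqoop_test.sh
# Reattach later with: screen -r sqoop_job
```

Because the job runs inside its own screen session, it keeps a terminal of its own even though the launching shell gets its prompt back immediately.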
Created ‎11-06-2015 12:58 AM
@Artem Ervits The reason for the backgrounding is that there are quite a few tables with 2+ million records, and they would like to start a sqoop job in the background after hours (there has to be a better way to do this, in my opinion).
Foregrounded:
-bash-4.1$ ./sample_sqoop.sh
whoami sboddu
hostname node2.example.com
2015-11-05 19:15:03,027 INFO - [main:] ~ Running Sqoop version: 1.4.6.2.3.0.0-2557 (Sqoop:92)
2015-11-05 19:15:03,043 WARN - [main:] ~ Setting your password on the command-line is insecure. Consider using -P instead. (BaseSqoopTool:1021)
2015-11-05 19:15:03,248 INFO - [main:] ~ Using default fetchSize of 1000 (SqlManager:98)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/zookeeper/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hdp/2.3.0.0-2557/accumulo/lib/slf4j-log4j12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
master
tempdb
model
msdb
ReportServer
Idistrict_Distributer
idistrict
iDistrict_Audit
iDistrict_SlimDB
iDistrict_Reports
ReportServerTempDB
Idistrict_Replication
iDistrict_Attachment
FTL
Backgrounded:
-bash-4.1$ ./sample_sqoop.sh&
[2] 33320
-bash-4.1$
whoami sboddu
hostname node2.example.com
[2]+ Stopped ./sample_sqoop.sh
Created ‎11-06-2015 01:25 AM
You're sqooping master, model, and reportdb; I'm not sure you need to do that. I would limit the tables to just the ones you need. Other than that, please check ulimit for the user executing the job in the foreground and in the background: http://www.commandlinefu.com/commands/view/9893/find-ulimit-values-of-currently-running-process.
Created ‎11-06-2015 10:12 PM
@Artem Ervits It turns out that the deciding factor was having the Accumulo client installed on the machine alongside sqoop.
With Accumulo client in the mix, the sqoop script, if invoked to run in the background, would go into a Stopped state and could only resume if the script were foregrounded using the "fg" command.
Uninstalling the Accumulo client was what ultimately worked around / fixed the issue.
Not sure if this is a bug or due to the fact that sqoop itself is a bash script that calls another script that sources the configure-sqoop script.
Thanks for your help.
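For anyone hitting the same Stopped state, here is a small sketch for confirming it from a script rather than from the jobs listing. The helper name is_stopped is hypothetical. In general, a background job is stopped this way when it tries to read from or reconfigure the controlling terminal (the kernel sends it SIGTTIN or SIGTTOU); whether that is what the Accumulo client triggered here was not established in this thread.

```shell
#!/bin/sh
# Sketch only: is_stopped is an illustrative helper name.
is_stopped() {
    # Read the process state from /proc on Linux, else fall back to ps.
    if [ -r "/proc/$1/status" ]; then
        state="$(awk '/^State:/ {print $2}' "/proc/$1/status")"
    else
        state="$(ps -o stat= -p "$1" 2>/dev/null | tr -d ' ')"
    fi
    # A state beginning with "T" means the process is stopped.
    case "$state" in
        T*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example: ./sample_sqoop.sh & is_stopped $! && echo "job is stopped"
# A stopped job can be resumed with "fg", as described above.
```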
Created ‎11-09-2015 04:19 PM
@Kent Baxley did you have a chance to try using screen before uninstalling Accumulo client? Based on your discovery that redirecting output (sqoop.sh &> /dev/null &) was successful, I would think using screen would also work.
