08-19-2016 10:35 AM
Thanks for your reply. The reason for using a shell script rather than a parallel job design is to make it dynamic, so it can handle any new schemas added in the future (for example, I am using VPDs here, but these will be schemas in prod). The logs I posted are from the stdout tab. Please advise.
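For context, the driver file the script reads is a plain comma-separated list of VPD key and schema name, one pair per line, so picking up a new schema only means adding a line. A hypothetical sample (the real values differ):

vpd1,schema1
vpd2,schema2
vpd3,schema3

The standard stdout logs: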
HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs:
HADOOP_CLIENT_OPTS=:
PREVLEVEL=N:
CONTAINER_ID=container_1471289760180_0031_01_000002:
HOME=/home/:
LANG=en_US.UTF-8:
YARN_NICENESS=0:
YARN_IDENT_STRING=yarn:
HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce:
=================================================================
>>> Invoking Shell command line now >>
Stdoutput vpd1
Stdoutput ------------------------------------------------------------------------------------------------
Stdoutput vpd2
Stdoutput ------------------------------------------------------------------------------------------------
Stdoutput vpd3
Stdoutput ------------------------------------------------------------------------------------------------
Stdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Stdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Exit code of the Shell command 0
<<< Invocation of Shell command completed <<<
<<< Invocation of Main class completed <<<
Oozie Launcher ends

stderr: -- empty --
Note: Recompile with -Xlint:deprecation for details.

syslog:
2016-08-19 08:38:26,713 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2016-08-19 08:38:26,714 INFO [main] org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system started
2016-08-19 08:38:26,727 INFO [main] org.apache.hadoop.mapred.YarnChild: Executing with tokens:
2016-08-19 08:38:26,727 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: mapreduce.job, Service: job_1471289760180_0031, Ident: (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@441357d7)
2016-08-19 08:38:26,780 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: RM_DELEGATION_TOKEN, Service: 127.0.0.1:8032, Ident: (owner=abhishek, renewer=oozie mr token, realUser=oozie, issueDate=1471621096236, maxDate=1472225896236, sequenceNumber=92, masterKeyId=5)
2016-08-19 08:38:26,898 INFO [main] org.apache.hadoop.mapred.YarnChild: Sleeping for 0ms before retrying again. Got null now.
2016-08-19 08:38:27,278 INFO [main] org.apache.hadoop.mapred.YarnChild: mapreduce.cluster.local.dir for child: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/abhishek/appcache/application_1471289760180_0031
2016-08-19 08:38:27,861 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
2016-08-19 08:38:28,482 INFO [main] org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: File Output Committer Algorithm version is 1
2016-08-19 08:38:28,502 INFO [main] org.apache.hadoop.mapred.Task: Using ResourceCalculatorProcessTree : [ ]
2016-08-19 08:38:28,830 INFO [main] org.apache.hadoop.mapred.MapTask: Processing split: org.apache.oozie.action.hadoop.OozieLauncherInputFormat$EmptySplit@181838a7
2016-08-19 08:38:28,841 INFO [main] org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2016-08-19 08:38:28,880 INFO [main] org.apache.hadoop.conf.Configuration.deprecation: mapred.job.id is deprecated. Instead, use mapreduce.job.id
2016-08-19 08:38:43,002 INFO [main] org.apache.hadoop.mapred.Task: Task:attempt_1471289760180_0031_m_000000_0 is done. And is in the process of committing

Logs from YARN:

Log Type: stderr
Log Upload Time: Fri Aug 19 08:38:51 -0700 2016
Log Length: 2275
Aug 19, 2016 8:38:25 AM com.google.inject.servlet.InternalServletModule$BackwardsCompatibleServletContextProvider get
WARNING: You are attempting to use a deprecated API (specifically, attempting to @Inject ServletContext inside an eagerly created singleton. While we allow this for backwards compatibility, be warned that this MAY have unexpected behavior if you have more than one injector (with ServletModule) running in the same JVM. Please consult the Guice documentation at http://code.google.com/p/google-guice/wiki/Servlets for more information.
Aug 19, 2016 8:38:25 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Aug 19, 2016 8:38:25 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Aug 19, 2016 8:38:25 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Aug 19, 2016 8:38:25 AM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Aug 19, 2016 8:38:25 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Aug 19, 2016 8:38:26 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Aug 19, 2016 8:38:26 AM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Log Type: stdout
Log Upload Time: Fri Aug 19 08:38:51 -0700 2016
Log Length: 0
Log Type: syslog
Log Upload Time: Fri Aug 19 08:38:51 -0700 2016
Log Length: 26243
Showing 4096 bytes of 26243 total (log truncated):
ra:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031_conf.xml_tmp
2016-08-19 08:38:44,022 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Copied to done location: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031_conf.xml_tmp
2016-08-19 08:38:44,033 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031.summary_tmp to hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031.summary
2016-08-19 08:38:44,036 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031_conf.xml_tmp to hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031_conf.xml
2016-08-19 08:38:44,043 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Moved tmp to done: hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031-1471621096430-abhishek-oozie%3Alauncher%3AT%3Dshell%3AW%3Dtestmultiplescoop%3AA%3Dshell-1471621123619-1-0-SUCCEEDED-root.abhishek-1471621102427.jhist_tmp to hdfs://quickstart.cloudera:8020/tmp/hadoop-yarn/staging/history/done_intermediate/abhishek/job_1471289760180_0031-1471621096430-abhishek-oozie%3Alauncher%3AT%3Dshell%3AW%3Dtestmultiplescoop%3AA%3Dshell-1471621123619-1-0-SUCCEEDED-root.abhishek-1471621102427.jhist
2016-08-19 08:38:44,045 INFO [Thread-71] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Stopped JobHistoryEventHandler. super.stop()
2016-08-19 08:38:44,046 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl: KILLING attempt_1471289760180_0031_m_000000_0
2016-08-19 08:38:44,047 INFO [Thread-71] org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy: Opening proxy : quickstart.cloudera:39926
2016-08-19 08:38:44,121 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1471289760180_0031_m_000000_0 TaskAttempt Transitioned from SUCCESS_FINISHING_CONTAINER to SUCCEEDED
2016-08-19 08:38:44,125 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to
2016-08-19 08:38:44,127 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: History url is http://quickstart.cloudera:19888/jobhistory/job/job_1471289760180_0031
2016-08-19 08:38:44,140 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Waiting for application to be successfully unregistered.
2016-08-19 08:38:45,142 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Final Stats: PendingReds:0 ScheduledMaps:0 ScheduledReds:0 AssignedMaps:1 AssignedReds:0 CompletedMaps:1 CompletedReds:0 ContAlloc:1 ContRel:0 HostLocal:0 RackLocal:0
2016-08-19 08:38:45,145 INFO [Thread-71] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://quickstart.cloudera:8020 /tmp/hadoop-yarn/staging/abhishek/.staging/job_1471289760180_0031
2016-08-19 08:38:45,156 INFO [Thread-71] org.apache.hadoop.ipc.Server: Stopping server on 55341
2016-08-19 08:38:45,158 INFO [IPC Server listener on 55341] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 55341
2016-08-19 08:38:45,163 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2016-08-19 08:38:45,164 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2016-08-19 08:38:45,165 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
08-17-2016 10:50 AM
Hello all, I am trying to run a shell script that kicks off sqoop jobs in parallel. The idea is that each sqoop job collects data from one schema, all the sqoop jobs run in parallel, and together they add the data to a Hive table. The script works just fine from the command line, but stops abruptly when invoked through Oozie. The logs show it kicking off multiple sqoop jobs, but after that it just stops. Below are the shell script and logs. Please advise.

#!/bin/bash
filename="$1"
#Function to parse comma separated input
parse() {
IFS=',' read -a array <<< "$1"
vpdKey="${array[0]}"
schemaName="${array[1]}"
echo "$vpdKey"
}
while read -r line
do
vpd="$line"
parse "$vpd"
# Invoke sqoop job
vpd=$vpdKey
schema=$schemaName
sqoop import --connect jdbc:oracle:thin:@//connection string --username blah --password blah --table "$schema".mytable -m 1 --where "vpd_key='"$vpd"' " --compression-codec=snappy --as-parquetfile --warehouse-dir=/user/hive/blah --hive-import --hive-table newhivetable &
echo "------------------------------------------------------------------------------------------------"
done < "$filename" Log tdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Stdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Stdoutput Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Stdoutput Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Stdoutput 2016-08-17 08:24:04,145 INFO [main] sqoop.Sqoop (Sqoop.java:&lt;init&gt;(92)) - Running Sqoop version: 1.4.6-cdh5.7.0
Stdoutput 2016-08-17 08:24:04,255 WARN [main] tool.BaseSqoopTool (BaseSqoopTool.java:applyCredentialsOptions(1023)) - Setting your password on the command-line is insecure. Consider using -P instead.
Stdoutput 2016-08-17 08:24:04,278 INFO [main] tool.BaseSqoopTool (BaseSqoopTool.java:validateOutputFormatOptions(1355)) - Using Hive-specific delimiters for output. You can override
Stdoutput 2016-08-17 08:24:04,279 INFO [main] tool.BaseSqoopTool (BaseSqoopTool.java:validateOutputFormatOptions(1356)) - delimiters with --fields-terminated-by, etc.
Stdoutput 2016-08-17 08:24:04,291 INFO [main] sqoop.Sqoop (Sqoop.java:&lt;init&gt;(92)) - Running Sqoop version: 1.4.6-cdh5.7.0
Exit code of the Shell command 0
<<< Invocation of Shell command completed <<<
<<< Invocation of Main class completed <<<
Oozie Launcher ends