Created 04-10-2018 02:44 PM
Hi all,
I'm trying to run Sqoop via Oozie as a first test workflow (from the Workflow Manager view in HDP). I've set up my own installation of the current HDP version. However, the workflow always fails to execute; Oozie only tells me:
Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1], External status: FAILED/KILLED
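(Side note in case it helps others: the Oozie CLI can also show the job status and the Oozie-side log directly. A minimal sketch, assuming the Oozie server on its default port 11000 and a placeholder workflow ID:

oozie job -oozie http://localhost:11000/oozie -info 0000001-180410144400000-oozie-oozi-W   # status of the workflow and its actions
oozie job -oozie http://localhost:11000/oozie -log 0000001-180410144400000-oozie-oozi-W    # Oozie-side log of the workflow

This only shows Oozie's view of the failure, though; the real error usually sits in the YARN container logs, as below.)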
I've looked at the instructions at https://community.hortonworks.com/articles/9148/troubleshooting-an-oozie-flow.html to find the real error message, but the stdout log is empty:
Log Type: stdout
Log Upload Time: Tue Apr 10 16:29:39 +0200 2018
Log Length: 0
and stderr does not give me any useful information either, as far as I can see:
Apr 10, 2018 4:30:04 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
Apr 10, 2018 4:30:04 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
Apr 10, 2018 4:30:04 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
Apr 10, 2018 4:30:04 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
INFO: Initiating Jersey application, version 'Jersey: 1.9 09/02/2011 11:17 AM'
Apr 10, 2018 4:30:04 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
Apr 10, 2018 4:30:06 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
Apr 10, 2018 4:30:09 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Does anyone have an idea what else I could try to get Sqoop running via Oozie? I've looked at quite a few posts here in the community, but found nothing promising so far. Executing Sqoop locally in the shell works fine (after fixing the JDBC drivers).
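For reference, the kind of command that works for me locally looks roughly like the following; the connection string, credentials, and table name are placeholders, assuming a MySQL source:

# placeholder import that runs fine directly in the shell
sqoop import \
  --connect jdbc:mysql://dbhost:3306/mydb \
  --username myuser \
  --password-file /user/me/db.password \
  --table mytable \
  --target-dir /user/me/mytable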
Created 04-10-2018 03:09 PM
Actually, I think I've found the stdout logs manually now: I copied the application ID from the Hadoop management web UI and ran
yarn logs -applicationId [applicationId]
on the command line, and there I found what I was looking for (a better log :). Right after the part I was looking for, the log read:
End of LogType:stdout
***********************************************************************
Container: container_e03_1523366722691_0006_01_000002 on ubuntu_45454
LogAggregationType: AGGREGATED
So is this type of log simply not available in the web interface? Or do I need to check a different web interface?
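(For reference, the aggregated log of one specific container can also be pulled via the CLI; a sketch using the IDs from the excerpt above:

yarn logs -applicationId application_1523366722691_0006 -containerId container_e03_1523366722691_0006_01_000002

On older Hadoop releases, -nodeAddress must be passed along with -containerId; judging by the "ubuntu_45454" above, that would presumably be -nodeAddress ubuntu:45454.)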
Edit: It looks like the Hadoop web interface only knew about attempt no. 1 in the job history for the MapReduce job (note the 000002 in the excerpt above). When I manually change the container ID in the URL to 000002 (e.g. ... container_e03_1523366722691_0006_01_000002/job_1523366722691_0006/admin), I can also see the log in the web interface. Confusing 🙂 Anyway, it looks like I somehow used the wrong arguments for Sqoop, even though I'm quite sure they worked this way in the console. Problem solved (for now).
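In case someone hits the same thing: a common pitfall (I'm not certain it was my exact mistake) is that the Oozie Sqoop action does not run the command through a shell, so the leading "sqoop" is dropped and the command string is split on whitespace, meaning shell-style quoting is not honored. A sketch with placeholder arguments:

# works in a local shell (the shell handles the quoting):
sqoop import --connect jdbc:mysql://dbhost:3306/mydb --table mytable --where "id > 100"

# in the Oozie Sqoop action's command element, the leading "sqoop" is dropped
# and the string is split on whitespace, so the quotes around "id > 100" break:
#   import --connect jdbc:mysql://dbhost:3306/mydb --table mytable --where "id > 100"
# arguments containing spaces have to go into separate <arg> elements in workflow.xml instead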