Created on 03-29-2016 01:38 PM - edited 09-16-2022 03:11 AM
Hello everyone,
I'm running into an error from one of our jobs that is not very explicit...
I found another topic about the same error, but it does not seem to have the same origin.
To reproduce the problem, I start this Oozie job from my VM (we have a standalone lab on a remote Cloudera 5.5.2 server).
The command to start the job:
oozie job -oozie http://host.domain.com:11000/oozie -config config-default.xml -run
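To follow the job afterwards I check it with the standard Oozie CLI (the job id below is just an example taken from the logs):
oozie job -oozie http://host.domain.com:11000/oozie -info 0000004-160325161246127-oozie-oozi-W
oozie job -oozie http://host.domain.com:11000/oozie -log 0000004-160325161246127-oozie-oozi-W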
The content of the config-default.xml file:
<configuration>
  <property><name>job_tracker</name><value>host.domain.com:8032</value></property>
  <property><name>job_xml</name><value>/path/to/file/hive-site.xml</value></property>
  <property><name>name_node</name><value>hdfs://host.domain.com:8020</value></property>
  <property><name>oozie.libpath</name><value>${name_node}/user/oozie/share/lib/lib_20160216173849</value></property>
  <property><name>oozie.use.system.libpath</name><value>true</value></property>
  <property><name>oozie.wf.application.path</name><value>${name_node}/path/to/file/simple-etl-wf.xml</value></property>
  <property><name>db_user</name><value>user</value></property>
  <property><name>db_pass</name><value>password</value></property>
  <property><name>target_dir</name><value>/path/to/destination</value></property>
  <property><name>hive_db_schema</name><value>default</value></property>
  <property><name>table_suffix</name><value>specific_suffix</value></property>
</configuration>
I also tried setting "job_tracker" with an http:// prefix, but we get the same error.
The content of the simple-etl-wf.xml file:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="simple-etl-wf">
  <global>
    <job-tracker>${name_node}</job-tracker>
    <name-node>${job_tracker}</name-node>
    <job-xml>${job_xml}</job-xml>
  </global>
  <start to="extract"/>
  <fork name="extract">
    <path start="table" />
  </fork>
  <action name="table">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:mysql://db.domain.com/database</arg>
      <arg>username</arg>
      <arg>${db_user}</arg>
      <arg>password</arg>
      <arg>${db_pass}</arg>
      <arg>--table</arg>
      <arg>table</arg>
      <arg>--target-dir</arg>
      <arg>${target_dir}/table</arg>
      <arg>--split-by</arg>
      <arg>column</arg>
      <arg>--hive-import</arg>
      <arg>--hive-overwrite</arg>
      <arg>--hive-table</arg>
      <arg>${hive_db_schema}.table_${table_suffix}</arg>
    </sqoop>
    <ok to="join"/>
    <error to="fail"/>
  </action>
  <join name="join" to="transform" />
  <action name="transform">
    <hive xmlns="uri:oozie:hive-action:0.4">
      <script>script.hql</script>
      <param>hive_db_schema=${hive_db_schema}</param>
      <param>table_suffix=${table_suffix}</param>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
The job starts, but it gets stuck at about 20%, and we get this error:
JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
2016-03-29 15:45:17,149 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[host.domain.com] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000004-160325161246127-oozie-oozi-W] ACTION[0000004-160325161246127-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
    at org.apache.oozie.command.XCommand.call(XCommand.java:286)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)
    ... 10 more
Yet the job_tracker and name_node have the correct URL and path, and the mysql-connector-java.jar is present in the sharelib folder.
I put Oozie in debug mode, but it gives no more information about this.
The "mapreduce.framework.name" property is set to "yarn" in each XML configuration file of the cluster.
Do you have any idea about this error?
Created 03-31-2016 03:20 PM
Just to check, shouldn't the username and password have double-hyphens in the Sqoop args or does it not matter?
Just want to eliminate any confounding variables 🙂
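In other words, I would expect those two flags to look roughly like this (only these lines change, just a sketch):
<arg>--username</arg>
<arg>${db_user}</arg>
<arg>--password</arg>
<arg>${db_pass}</arg>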
Created on 03-31-2016 04:23 PM - edited 03-31-2016 04:25 PM
Hi tseader! Thanks for your help!
Good eyes! Yes, I think that could be an error and stop the process. I modified these two settings, but the problem is still present.
The failure seems to occur before that point.
It seems to read this file, but never starts the MySQL connection process.
I have something else in the Oozie logs.
When I launch the command on my VM, the workflow appears in Hue.
But the log starts with these two lines:
2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist
Yet that is not the libpath I set in my job file...
The complete log, from job start to the end:
2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist
2016-03-31 19:04:18,805 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] Start action [0000001-160331185825562-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action status=DONE
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action updated in DB!
2016-03-31 19:04:18,898 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] Start action [0000001-160331185825562-oozie-oozi-W@extract] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action status=DONE
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action updated in DB!
2016-03-31 19:04:19,077 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Start action [0000001-160331185825562-oozie-oozi-W@session] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:22,804 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: http
2016-03-31 19:04:22,805 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
2016-03-31 19:04:22,805 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
    at org.apache.oozie.command.XCommand.call(XCommand.java:286)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)
Maybe it can give you an idea!
I checked the hostnames in the configuration files, but they seem to be OK.
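For example, the ResourceManager address that job_tracker is supposed to match is declared like this in yarn-site.xml (generic host, default port):
<property>
  <name>yarn.resourcemanager.address</name>
  <value>host.domain.com:8032</value>
</property>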
This error message is not very clear...
Created 03-31-2016 05:03 PM
The config-default.xml has "host.domain.com" because you wanted to generalize it, right? I'm assuming you've tried localhost with the proper port in your job_tracker and name_node values?
Created on 04-01-2016 04:47 PM - edited 04-01-2016 04:47 PM
Yes, we always use the real FQDN to start the job.
And we found the mistake: the "job_tracker" value was on the "name_node" element and vice versa...
So we swapped them back to valid values, and the process starts, but it no longer completes.
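The corrected <global> block now looks like this (same variables as before, just put back on the right elements):
<global>
  <job-tracker>${job_tracker}</job-tracker>
  <name-node>${name_node}</name-node>
  <job-xml>${job_xml}</job-xml>
</global>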
We see the workflow in "running" status, and a job "oozie:launcher" in running state. It creates an "oozie:action" task, but that one stays in "accepted" status, and I can't find out why.
I tried some YARN memory configuration settings without success.
In the ResourceManager, I can find this log about the job:
>>> Invoking Sqoop command line now >>>
4624 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4654 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.5.2
4671 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - Using Hive-specific delimiters for output. You can override
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - delimiters with --fields-terminated-by, etc.
4690 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4816 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
4820 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
5360 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5521 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5616 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
7274 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/f695dd68db2ed1ecf703a5405d308df5/table.jar
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
7282 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
7284 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of game_session
7398 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8187 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8211 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(`session_id`), MAX(`session_id`) FROM `table`
8237 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 9811415567004; Num splits: 4 from: 14556292800030657 to: 14595538462298675
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
The log keeps looping on "Heart beat" messages.
I don't know if it comes from the memory configuration or something else...
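For reference, the kind of YARN properties I have been adjusting look like this (the values are only illustrative, not a recommendation):
<!-- total memory YARN may allocate on each NodeManager (illustrative value) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<!-- maximum memory a single container may request (illustrative value) -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>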
Do you have an idea about that?
Created 04-03-2016 04:53 PM
It's not clear to me what is going on. If no one else has a solution, I recommend simplifying the scenario and eliminating variables one by one. Some additional questions that may help: does the Sqoop import work on its own, outside of Oozie? Is there a reason you use <arg> elements rather than a single <command>? Have you tried building the workflow from Hue instead of by hand? And have you tried importing from a different database?
Created 04-06-2016 03:29 PM
Hi tseader,
Sorry, I wasn't available!
As an update: it works! The problem was the "Dynamic Resource Pool" configuration.
I created a resource pool for my username, and now the job starts and runs.
It works differently from our Cloudera 4 cluster...
So now the job runs, performs the Sqoop and Hive steps, and finishes successfully! Great news!
But it is very slow for a small table import. I think there is something to tune in the Dynamic Resource Pool or YARN settings to use more resources, because during the job the CPU/memory usage of my two DataNodes was very low...
Maybe you can give me some information on how to calculate the maximum number of containers possible?
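From what I have read so far, a rough rule of thumb seems to be (illustrative numbers, not our real hardware):
containers per node ~= min( yarn.nodemanager.resource.memory-mb / memory per container,
                            yarn.nodemanager.resource.cpu-vcores / vcores per container )
For example, 16384 MB per node with 2048 MB containers gives 8, and 8 vcores with 1 vcore per container also gives 8, so about 8 containers per node, or 16 across our 2 DataNodes. Is that the right way to think about it?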
To answer your questions:
- Yes, Sqoop was working on its own.
- Yes, our analytics jobs use <arg> because, with <command> in CDH4, there were sometimes errors with special characters.
- Yes, Sqoop/Oozie/Hive all work now. We will try Impala next.
- No, we haven't tried to create a workflow from Hue. I will check with our developers about that.
- No, we didn't try with another database.
As you suspected, the problem didn't come from the workflow but from the configuration.
I'm new to Cloudera/Hadoop, so I'm learning! I'm discovering the configuration as I go!
Now I have to find the best configuration to make better use of our DataNodes...
Thanks again, tseader!