Created on 03-29-2016 01:38 PM - edited 09-16-2022 03:11 AM
Hello everyone,
I'm running into an error from one of our jobs that is not very explicit...
I found another topic about the same error, but it does not seem to have the same origin.
To reproduce the problem, I start this Oozie job from my VM (we have a standalone lab on a remote Cloudera 5.5.2 server).
The command to start the job:
oozie job -oozie http://host.domain.com:11000/oozie -config config-default.xml -run
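To follow the job afterwards I check it with the standard Oozie CLI (the job id below is just an example taken from the logs):
oozie job -oozie http://host.domain.com:11000/oozie -info 0000004-160325161246127-oozie-oozi-W
oozie job -oozie http://host.domain.com:11000/oozie -log 0000004-160325161246127-oozie-oozi-W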
The content of the config-default.xml file:
<configuration>
  <property><name>job_tracker</name><value>host.domain.com:8032</value></property>
  <property><name>job_xml</name><value>/path/to/file/hive-site.xml</value></property>
  <property><name>name_node</name><value>hdfs://host.domain.com:8020</value></property>
  <property><name>oozie.libpath</name><value>${name_node}/user/oozie/share/lib/lib_20160216173849</value></property>
  <property><name>oozie.use.system.libpath</name><value>true</value></property>
  <property><name>oozie.wf.application.path</name><value>${name_node}/path/to/file/simple-etl-wf.xml</value></property>
  <property><name>db_user</name><value>user</value></property>
  <property><name>db_pass</name><value>password</value></property>
  <property><name>target_dir</name><value>/path/to/destination</value></property>
  <property><name>hive_db_schema</name><value>default</value></property>
  <property><name>table_suffix</name><value>specific_suffix</value></property>
</configuration>
I also tried setting "job_tracker" with an http:// prefix, but we get the same error.
The content of the simple-etl-wf.xml file:
<workflow-app xmlns="uri:oozie:workflow:0.5" name="simple-etl-wf">
  <global>
    <job-tracker>${name_node}</job-tracker>
    <name-node>${job_tracker}</name-node>
    <job-xml>${job_xml}</job-xml>
  </global>
  <start to="extract"/>
  <fork name="extract">
    <path start="table" />
  </fork>
  <action name="table">
    <sqoop xmlns="uri:oozie:sqoop-action:0.4">
      <arg>import</arg>
      <arg>--connect</arg>
      <arg>jdbc:mysql://db.domain.com/database</arg>
      <arg>username</arg>
      <arg>${db_user}</arg>
      <arg>password</arg>
      <arg>${db_pass}</arg>
      <arg>--table</arg>
      <arg>table</arg>
      <arg>--target-dir</arg>
      <arg>${target_dir}/table</arg>
      <arg>--split-by</arg>
      <arg>column</arg>
      <arg>--hive-import</arg>
      <arg>--hive-overwrite</arg>
      <arg>--hive-table</arg>
      <arg>${hive_db_schema}.table_${table_suffix}</arg>
    </sqoop>
    <ok to="join"/>
    <error to="fail"/>
  </action>
  <join name="join" to="transform" />
  <action name="transform">
    <hive xmlns="uri:oozie:hive-action:0.4">
      <script>script.hql</script>
      <param>hive_db_schema=${hive_db_schema}</param>
      <param>table_suffix=${table_suffix}</param>
    </hive>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
The job starts, but it gets stuck at about 20%, and we get this error:
JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
2016-03-29 15:45:17,149 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[host.domain.com] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000004-160325161246127-oozie-oozi-W] ACTION[0000004-160325161246127-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
    at org.apache.oozie.command.XCommand.call(XCommand.java:286)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)
    ... 10 more
Yet the job_tracker and name_node have the correct URL and path, and the mysql-connector-java.jar is present in the sharelib folder.
I put Oozie in debug mode, but it gives no more information about this.
The "mapreduce.framework.name" property is set to "yarn" in each XML configuration file of the cluster.
Do you have any idea about this error?
Created 03-31-2016 03:20 PM
Just to check, shouldn't the username and password have double-hyphens in the Sqoop args or does it not matter?
Just want to eliminate any confounding variables 🙂
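In other words, I would expect those two flags to look roughly like this (only these lines change, just a sketch):
<arg>--username</arg>
<arg>${db_user}</arg>
<arg>--password</arg>
<arg>${db_pass}</arg>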
Created on 03-31-2016 04:23 PM - edited 03-31-2016 04:25 PM
Hi tseader! Thanks for your help!
Good eyes! Yes, I think that could be an error and stop the process. I modified these two settings, but the problem is still present.
The failure seems to occur before that point.
It seems to read this file, but never starts the MySQL connection process.
I have something else in the Oozie logs.
When I launch the command on my VM, the workflow appears in Hue.
But the log starts with these two lines:
2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist
Yet that is not the libpath I set in my job file...
The complete log, from job start to the end:
2016-03-31 19:04:18,709 WARN org.apache.oozie.util.ParameterVerifier: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] The application does not define formal parameters in its XML definition
2016-03-31 19:04:18,744 WARN org.apache.oozie.service.LiteWorkflowAppService: SERVER[hostname] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] libpath [hdfs://hostname.domain.com:8020/path/to/oozie/lib] does not exist
2016-03-31 19:04:18,805 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] Start action [0000001-160331185825562-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action status=DONE
2016-03-31 19:04:18,809 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@:start:] [***0000001-160331185825562-oozie-oozi-W@:start:***]Action updated in DB!
2016-03-31 19:04:18,898 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] Start action [0000001-160331185825562-oozie-oozi-W@extract] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action status=DONE
2016-03-31 19:04:18,907 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@extract] [***0000001-160331185825562-oozie-oozi-W@extract***]Action updated in DB!
2016-03-31 19:04:19,077 INFO org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Start action [0000001-160331185825562-oozie-oozi-W@session] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2016-03-31 19:04:22,804 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: http
2016-03-31 19:04:22,805 WARN org.apache.hadoop.security.UserGroupInformation: SERVER[hostname] PriviledgedActionException as:username (auth:PROXY) via oozie (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
2016-03-31 19:04:22,805 WARN org.apache.oozie.command.wf.ActionStartXCommand: SERVER[hostname] USER[username] GROUP[-] TOKEN[] APP[simple-etl-wf] JOB[0000001-160331185825562-oozie-oozi-W] ACTION[0000001-160331185825562-oozie-oozi-W@session] Error starting action [session]. ErrorType [TRANSIENT], ErrorCode [JA009], Message [JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.]
org.apache.oozie.action.ActionExecutorException: JA009: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.oozie.action.ActionExecutor.convertExceptionHelper(ActionExecutor.java:454)
    at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:1032)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.start(JavaActionExecutor.java:1203)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:250)
    at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:64)
    at org.apache.oozie.command.XCommand.call(XCommand.java:286)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:321)
    at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:250)
    at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
    at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
    at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
    at org.apache.hadoop.mapred.JobClient.init(JobClient.java:472)
    at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:450)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:436)
    at org.apache.oozie.service.HadoopAccessorService$3.run(HadoopAccessorService.java:434)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.oozie.service.HadoopAccessorService.createJobClient(HadoopAccessorService.java:434)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.createJobClient(JavaActionExecutor.java:1246)
    at org.apache.oozie.action.hadoop.JavaActionExecutor.submitLauncher(JavaActionExecutor.java:980)
Maybe it can give you an idea!
I checked the hostnames in the configuration files, but they seem to be OK.
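For example, the ResourceManager address that job_tracker is supposed to match is declared like this in yarn-site.xml (generic host, default port):
<property>
  <name>yarn.resourcemanager.address</name>
  <value>host.domain.com:8032</value>
</property>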
This error message is not very clear...
Created 03-31-2016 05:03 PM
The config-default.xml has "host.domain.com" because you wanted to generalize it, right? I'm assuming you've tried localhost with the proper port in your job_tracker and name_node values?
Created on 04-01-2016 04:47 PM - edited 04-01-2016 04:47 PM
Yes, we always use the real FQDN to start the job.
And we found the mistake: the "job_tracker" value was on the "name_node" element and vice versa...
So we swapped them back to valid values, and the process starts, but it no longer completes.
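The corrected <global> block now looks like this (same variables as before, just put back on the right elements):
<global>
  <job-tracker>${job_tracker}</job-tracker>
  <name-node>${name_node}</name-node>
  <job-xml>${job_xml}</job-xml>
</global>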
We see the workflow in "running" status, and a job "oozie:launcher" in running state. It creates an "oozie:action" task, but that one stays in "accepted" status, and I can't find out why.
I tried some YARN memory configuration settings without success.
In the ResourceManager, I can find this log about the job:
>>> Invoking Sqoop command line now >>>
4624 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.SqoopTool - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4654 [uber-SubtaskRunner] INFO org.apache.sqoop.Sqoop - Running Sqoop version: 1.4.6-cdh5.5.2
4671 [uber-SubtaskRunner] WARN org.apache.sqoop.tool.BaseSqoopTool - Setting your password on the command-line is insecure. Consider using -P instead.
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - Using Hive-specific delimiters for output. You can override
4672 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.BaseSqoopTool - delimiters with --fields-terminated-by, etc.
4690 [uber-SubtaskRunner] WARN org.apache.sqoop.ConnFactory - $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
4816 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Preparing to use a MySQL streaming resultset.
4820 [uber-SubtaskRunner] INFO org.apache.sqoop.tool.CodeGenTool - Beginning code generation
5360 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5521 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.SqlManager - Executing SQL statement: SELECT t.* FROM `table` AS t LIMIT 1
5616 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
7274 [uber-SubtaskRunner] INFO org.apache.sqoop.orm.CompilationManager - Writing jar file: /tmp/sqoop-yarn/compile/f695dd68db2ed1ecf703a5405d308df5/table.jar
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - It looks like you are importing from mysql.
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - This transfer can be faster! Use the --direct
7282 [uber-SubtaskRunner] WARN org.apache.sqoop.manager.MySQLManager - option to exercise a MySQL-specific fast path.
7282 [uber-SubtaskRunner] INFO org.apache.sqoop.manager.MySQLManager - Setting zero DATETIME behavior to convertToNull (mysql)
7284 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.ImportJobBase - Beginning import of game_session
7398 [uber-SubtaskRunner] WARN org.apache.sqoop.mapreduce.JobBase - SQOOP_HOME is unset. May not be able to find all job dependencies.
8187 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DBInputFormat - Using read commited transaction isolation
8211 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.DataDrivenDBInputFormat - BoundingValsQuery: SELECT MIN(`session_id`), MAX(`session_id`) FROM `table`
8237 [uber-SubtaskRunner] INFO org.apache.sqoop.mapreduce.db.IntegerSplitter - Split size: 9811415567004; Num splits: 4 from: 14556292800030657 to: 14595538462298675
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
Heart beat
The log keeps looping on "Heart beat" messages.
I don't know if it comes from the memory configuration or something else...
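For reference, the kind of YARN properties I have been adjusting look like this (the values are only illustrative, not a recommendation):
<!-- total memory YARN may allocate on each NodeManager (illustrative value) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>8192</value>
</property>
<!-- maximum memory a single container may request (illustrative value) -->
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>4096</value>
</property>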
Do you have an idea about that?
Created 04-03-2016 04:53 PM
It's not clear to me what is going on. If no one else has a solution, I recommend simplifying the scenario and eliminating variables one by one. Some additional questions that may help: does the Sqoop import work on its own, outside of Oozie? Is there a reason you use <arg> elements rather than a single <command>? Have you tried building the workflow from Hue instead of by hand? And have you tried importing from a different database?
Created 04-06-2016 03:29 PM
Hi tseader,
Sorry, I wasn't available!
As an update: it works! The problem was the "Dynamic Resource Pool" configuration.
I created a resource pool for my username, and now the job starts and runs.
It works differently from our Cloudera 4 cluster...
So now the job runs, performs the Sqoop and Hive steps, and finishes successfully! Great news!
But it is very slow for a small table import. I think there is something to tune in the Dynamic Resource Pool or YARN settings to use more resources, because during the job the CPU/memory usage of my two DataNodes was very low...
Maybe you can give me some information on how to calculate the maximum number of containers possible?
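From what I have read so far, a rough rule of thumb seems to be (illustrative numbers, not our real hardware):
containers per node ~= min( yarn.nodemanager.resource.memory-mb / memory per container,
                            yarn.nodemanager.resource.cpu-vcores / vcores per container )
For example, 16384 MB per node with 2048 MB containers gives 8, and 8 vcores with 1 vcore per container also gives 8, so about 8 containers per node, or 16 across our 2 DataNodes. Is that the right way to think about it?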
To answer your questions:
- Yes, Sqoop was working on its own.
- Yes, our analytics jobs use <arg> because, with <command> in CDH4, there were sometimes errors with special characters.
- Yes, Sqoop/Oozie/Hive all work now. We will try Impala next.
- No, we haven't tried to create a workflow from Hue. I will check with our developers about that.
- No, we didn't try with another database.
As you suspected, the problem didn't come from the workflow but from the configuration.
I'm new to Cloudera/Hadoop, so I'm learning! I'm discovering the configuration as I go!
Now I have to find the best configuration to make better use of our DataNodes...
Thanks again, tseader!