Created on 07-13-2015 02:43 PM - edited 09-16-2022 02:33 AM
Hi - I am composing some ETL workflows with the following pattern:
This seems like a basic requirement. However, after much research, I learned that the Hive action does not support capturing output in Oozie: output is only available via standard Hive writes to a table or file, which then requires another action to read it from disk back into the workflow. That seems rather high-overhead. So I tried the alternative of calling the Hive CLI from a shell action, capturing its stdout and echoing it in "<key>=<value>" format for wf:actionData. Unfortunately, doing so ran me into the known MR2 bug ["java.io.FileNotFoundException: File does not exist: hdfs://<namenode>/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar"]:
https://issues.apache.org/jira/browse/HIVE-5016 -> https://issues.apache.org/jira/browse/MAPREDUCE-6238
The bug fix targets Hadoop 2.7.1, which is out of reach for us on CDH 5.2.0. I tried the suggested workarounds, such as manually setting hive.exec.mode.local.auto, hive.aux.jars.path, and $HIVE_AUX_JARS_PATH, but none worked. The correct -libjars value [file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,... ] appears to be generated and passed to hadoop for the MR2 job, but it gets mangled along the way, per the JIRA.
So, are our options reduced to either passing the value out through HDFS or querying Hive via a Java action? Your guidance is appreciated.
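For reference, here is a minimal sketch of the sort of wrapper the shell action runs (hypothetical; my actual queryHive4Oozie.sh differs in details, and the "output" key name and Hive binary path are illustrative):

```shell
#!/bin/bash
# Minimal sketch of a shell-action wrapper: run the query passed in via the
# HQUERY <env-var>, then print the single-value result as key=value on
# stdout, which Oozie's <capture-output/> exposes through wf:actionData.
set -euo pipefail

HIVEPATH="${HIVEPATH:-hive}"    # overridable, e.g. /usr/bin/hive

run_query() {
    # -S (silent) keeps INFO chatter off stdout so only result rows remain
    "$HIVEPATH" -S -e "$1"
}

main() {
    local result
    result="$(run_query "$HQUERY" | tail -n 1)"
    # Oozie parses captured stdout as java.util.Properties (key=value lines)
    echo "output=${result}"
}

# Run only when HQUERY is supplied via <env-var>, so the file can be sourced
if [[ -n "${HQUERY:-}" ]]; then
    main
fi
```

The captured key is then readable downstream as ${wf:actionData('CheckExistingND')['output']}.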
Miles Yao
Created 07-17-2015 03:11 PM
Thanks ... here's what happened:
To summarize, here is the minimal manual configuration that got the shell action running normally as the workflow owner for us:
I appreciate your advice. Please feel free to comment and share your experiences!
Created 07-14-2015 10:12 PM
Created 07-15-2015 06:02 PM
Hi Harsh:
I tried your suggestion. It led to a different problem, though: the shell action is run in YARN as a different user ('yarn') than the workflow owner:
I've already seen this issue and tried the workaround, unsuccessfully (see my follow-up there); Hadoop 2.6.0 seems to be required.
I'm not sure whether the original path issue is solved, but the execution flow appears to have changed:
(/tmp/yarn/hive.log)
2015-07-13 14:58:37,158 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(9285)) - Completed plan generation
2015-07-13 14:58:37,159 INFO [main]: ql.Driver (Driver.java:compile(443)) - Semantic Analysis Completed
2015-07-13 14:58:37,176 INFO [main]: ql.Driver (Driver.java:getSchema(242)) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
2015-07-13 14:58:37,336 INFO [main]: ql.Driver (Driver.java:execute(1205)) - Starting command: SELECT count(1) FROM kestrel.NodeDetailsStg
2015-07-13 14:58:37,338 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Total jobs = 1
2015-07-13 14:58:37,354 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Launching Job 1 out of 1
2015-07-13 14:58:37,361 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - Number of reduce tasks determined at compile time: 1
...
2015-07-13 14:58:37,362 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - In order to set a constant number of reducers:
2015-07-13 14:58:37,362 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - set mapreduce.job.reduces=<number>
2015-07-13 14:58:37,439 WARN [main]: conf.HiveConf (HiveConf.java:initialize(1488)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
2015-07-13 14:58:37,530 INFO [main]: mr.ExecDriver (MapRedTask.java:execute(175)) - Generating plan file file:/tmp/yarn/hive_2015-07-13_14-58-35_934_4534563874187696548-1/-local-10004/plan.xml
2015-07-13 14:58:37,620 INFO [main]: mr.ExecDriver (MapRedTask.java:execute(207)) - Executing: /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop jar /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/jars/hive-exec-0.13.1-cdh5.2.0.jar org.apache.hadoop.hive.ql.exec.mr.ExecDriver -libjars file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar -plan file:/tmp/yarn/hive_2015-07-13_14-58-35_934_4534563874187696548-1/-local-10004/plan.xml -jobconffile file:/tmp/yarn/hive_2015-07-13_14-58-35_934_4534563874187696548-1/-local-10003/jobconf.xml
2015-07-13 14:58:41,771 ERROR [main]: exec.Task (SessionState.java:printError(545)) - Execution failed with exit status: 1

(/tmp/yarn/yarn_20150713145858_101b44b0-86fd-4612-afdb-1d6efd945640.log)
2015-07-13 14:58:40,394 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
2015-07-13 14:58:40,396 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(2993)) - Processing alias nodedetailsstg
2015-07-13 14:58:40,397 INFO [main]: exec.Utilities (Utilities.java:getInputPaths(3010)) - Adding input file hdfs://<namenode>/user/hive/data/ND
2015-07-13 14:58:40,397 INFO [main]: exec.Utilities (Utilities.java:isEmptyPath(2339)) - Content Summary not cached for hdfs://<namenode>/user/hive/data/ND
...
2015-07-13 14:58:41,127 INFO [main]: jvm.JvmMetrics (JvmMetrics.java:init(76)) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2015-07-13 14:58:41,144 INFO [main]: jvm.JvmMetrics (JvmMetrics.java:init(71)) - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2015-07-13 14:58:41,156 WARN [main]: mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2015-07-13 14:58:41,299 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(446)) - Cleaning up the staging area file:/tmp/hadoop-yarn/mapred/staging/yarn315976574/.staging/job_local315976574_0001
2015-07-13 14:58:41,301 ERROR [main]: mr.ExecDriver (SessionState.java:printError(545)) - Job Submission failed with exception 'java.io.FileNotFoundException(File does not exist: hdfs://<namenode>/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar)'
java.io.FileNotFoundException: File does not exist: hdfs://<namenode>/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar

(/tmp/yarn/hive.log)
2015-07-15 13:59:50,066 INFO [main]: parse.SemanticAnalyzer (SemanticAnalyzer.java:analyzeInternal(9285)) - Completed plan generation
2015-07-15 13:59:50,067 INFO [main]: ql.Driver (Driver.java:compile(443)) - Semantic Analysis Completed
2015-07-15 13:59:50,083 INFO [main]: ql.Driver (Driver.java:getSchema(242)) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
2015-07-15 13:59:50,219 INFO [main]: ql.Driver (Driver.java:execute(1205)) - Starting command: SELECT count(1) FROM kestrel.NodeDetailsStg
2015-07-15 13:59:50,220 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Total jobs = 1
2015-07-15 13:59:50,237 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Launching Job 1 out of 1
2015-07-15 13:59:50,243 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - Number of reduce tasks determined at compile time: 1
...
2015-07-15 13:59:50,244 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - In order to set a constant number of reducers:
2015-07-15 13:59:50,244 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - set mapreduce.job.reduces=<number>
2015-07-15 13:59:50,263 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(285)) - Using org.apache.hadoop.hive.ql.io.CombineHiveInputFormat
2015-07-15 13:59:50,266 INFO [main]: mr.ExecDriver (ExecDriver.java:execute(307)) - adding libjars: file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,file:///opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,file:///usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
2015-07-15 13:59:51,122 WARN [main]: mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(150)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2015-07-15 13:59:55,957 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - Starting Job = job_1436812963995_0042, Tracking URL = http://<resource mgr>/proxy/application_1436812963995_0042/
2015-07-15 13:59:55,958 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - Kill Command = /opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hadoop/bin/hadoop job -kill job_1436812963995_0042
2015-07-15 14:00:08,173 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2015-07-15 14:00:08,255 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-07-15 14:00:08,255 INFO [main]: exec.Task (SessionState.java:printInfo(536)) - 2015-07-15 14:00:08,253 Stage-1 map = 0%, reduce = 0%
2015-07-15 14:00:08,261 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2015-07-15 14:00:08,447 ERROR [main]: exec.Task (SessionState.java:printError(545)) - Ended Job = job_1436812963995_0042 with errors
2015-07-15 14:00:08,448 ERROR [Thread-64]: exec.Task (SessionState.java:printError(545)) - Error during job, obtaining debugging information...
2015-07-15 14:00:08,587 ERROR [main]: ql.Driver (SessionState.java:printError(545)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2015-07-15 14:00:08,587 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - MapReduce Jobs Launched:
2015-07-15 14:00:08,590 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2015-07-15 14:00:08,591 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
2015-07-15 14:00:08,591 INFO [main]: ql.Driver (SessionState.java:printInfo(536)) - Total MapReduce CPU Time Spent: 0 msec

(http://<job hist server>/jobhistory/job/job_1436812963995_0042/)
2015-07-15 14:00:03,817 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Kind: RM_DELEGATION_TOKEN, Service: 10.28.74.72:8032, Ident: (owner=myao, renewer=oozie mr token, realUser=oozie, issueDate=1436986766618, maxDate=1437591566618, sequenceNumber=64, masterKeyId=4)
...
2015-07-15 14:00:06,043 INFO [eventHandlingThread] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Event Writer setup for JobId: job_1436812963995_0042, File: hdfs://<namenode>/user/myao/.staging/job_1436812963995_0042/job_1436812963995_0042_1.jhist
...
2015-07-15 14:00:06,863 INFO [Thread-54] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Setting job diagnostics to Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://<namenode>/user/myao/.staging/job_1436812963995_0042/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1566)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1430)
...
2015-07-15 14:00:07,873 INFO [Thread-54] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://<namenode>/user/yarn/.staging/job_1436812963995_0042
Note that in the old run, the Hive execution plan was generated and the final "hadoop jar ..." command line was logged, but in the new run these were replaced by "adding libjars:" / "Starting Job" instead. The last log message the two runs have in common is "set mapreduce.job.reduces=<number>".
The new run also did not generate /tmp/yarn/yarn_*.log.
Ours is a non-Kerberized environment.
Before, YARN_CONF_DIR was /run/cloudera-scm-agent/process/416-yarn-NODEMANAGER, which points to a different yarn-site.xml (with yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=true). Overriding it to /etc/hive/conf doesn't seem to make a difference.
Here is the workflow.xml:
<workflow-app name="Test_ND_ETL_Wf1" xmlns="uri:oozie:workflow:0.4">
  <global>
    <job-xml>/user/hue/oozie/workspaces/kestrel/hive-site.xml</job-xml>
    <configuration>
      <property>
        <name>nameNode</name>
        <value>...</value>
      </property>
      <property>
        <name>jobTracker</name>
        <value>...</value>
      </property>
      <property>
        <name>queueName</name>
        <value>default</value>
      </property>
    </configuration>
  </global>
  <start to="CheckExistingND"/>
  <action name="CheckExistingND">
    <shell xmlns="uri:oozie:shell-action:0.1">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <job-xml>/user/hue/oozie/workspaces/kestrel/hive-site.xml</job-xml>
      <configuration>
        <property>
          <name>hive.exec.mode.local.auto</name>
          <value>false</value>
        </property>
        <property>
          <name>hive.aux.jars.path</name>
          <value>;</value>
        </property>
        <property>
          <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
          <value>true</value>
        </property>
        <property>
          <name>mapreduce.jobtracker.staging.root.dir</name>
          <value>/user</value>
        </property>
      </configuration>
      <exec>queryHive4Oozie.sh</exec>
      <env-var>HQUERY=SELECT count(1) FROM schema.table</env-var>
      <env-var>HIVE_AUX_JARS_PATH=/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar</env-var>
      <env-var>HIVE_CONF_DIR=/etc/hive/conf</env-var>
      <env-var>HADOOP_CONF_DIR=/etc/hive/conf</env-var>
      <env-var>YARN_CONF_DIR=/etc/hive/conf</env-var>
      <file>queryHive4Oozie.sh#queryHive4Oozie.sh</file>
      <file>hive_env.sh#hive_env.sh</file>
      <capture-output/>
    </shell>
    <ok to="NotifyEmail"/>
    <error to="kill"/>
  </action>
  <action name="NotifyEmail">
    <email xmlns="uri:oozie:email-action:0.1">
      <to>...</to>
      <subject>Workflow ${wf:name()} aborted</subject>
      <body>WARNING - Workflow #${wf:id()} (${wf:name()}) submitted by ${wf:user()} aborted. . Reason: ${wf:actionData('CheckExistingND')['output']} existing records found that will be overwritten. . Query: ${wf:actionData('CheckExistingND')['HQUERY']}</body>
    </email>
    <ok to="end"/>
    <error to="kill"/>
  </action>
  <kill name="kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
Thanks for helping. Please let me know if you need any other details.
Miles
Created 07-15-2015 07:11 PM
Also, YARN_CONF_DIR defaulted to /run/cloudera-scm-agent/process/416-yarn-NODEMANAGER, which pointed to a different yarn-site.xml (with my manual setting yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=true). Overriding it to /etc/hive/conf doesn't seem to help.
And here is the job output:
>>> Invoking Shell command line now >>
Stdoutput host=hou711071
Stdoutput login=yarn
Stdoutput shell="/bin/bash"
Stdoutput PATH=".:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
Stdoutput HQUERY="SELECT count(1) FROM schema.table1"
Stdoutput HIVEPATH=/usr/bin/hive
Stdoutput HIVE_CONF_DIR=/etc/hive/conf
Stdoutput HADOOP_CONF_DIR=/etc/hive/conf
Stdoutput YARN_CONF_DIR=/etc/hive/conf
Stdoutput HIVE_AUX_JARS_PATH=/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hive/lib/hive-hbase-handler-0.13.1-cdh5.2.0.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop2-compat.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-server.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-common.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-protocol.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-client.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/lib/htrace-core-2.04.jar,/opt/cloudera/parcels/CDH-5.2.0-1.cdh5.2.0.p0.36/lib/hbase/hbase-hadoop-compat.jar,/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
Stdoutput HADOOP_CLASSPATH=/usr/share/cmf/lib/plugins/event-publish-5.3.2-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-5.3.2.jar:/usr/share/cmf/lib/plugins/cdh5/audit-plugin-cdh5-2.2.2-shaded.jar
Stdoutput ERROR - Hive query [SELECT count(1) FROM schema.table1] failed!
Please let me know if you need any other info. Thanks again!
Created 07-15-2015 08:52 PM
Created 07-17-2015 05:56 PM
Created 07-20-2015 08:44 AM
Harsh:
I tried setting yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users=false in the yarn-site.xml Safety Valve, under both "NodeManager Default Group" and "Service-Wide"; neither worked.
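For the record, the Safety Valve snippet I tried was of this form (the property name is the standard YARN one; whether it takes effect appears to depend on the Hadoop build carrying YARN-2424):

```xml
<property>
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.limit-users</name>
  <value>false</value>
</property>
```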
Would you be able to confirm my point about the Hadoop release version for YARN-2424?
Thanks,
Miles
Created 07-20-2015 12:42 PM
Created 08-27-2015 09:17 AM
I retried the LCE configs after upgrading to CDH 5.4.4; it still doesn't work. Both Hive-on-Hue and Oozie failed in MR. One of the nodes has its cache directories reconfigured:
drwxr-s--- 4 myao yarn 37 Mar 20 11:30 /s0/yarn/nm/usercache/myao
drwxr-s--- 4 myao yarn 37 Mar 20 11:30 /s1/yarn/nm/usercache/myao
drwxr-s--- 4 myao yarn 37 Mar 20 11:30 /s2/yarn/nm/usercache/myao
So, I will stick with the $HADOOP_USER_NAME workaround for now.
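Concretely, that workaround just sets the standard Hadoop client variable in the shell action's environment, for example via an extra <env-var> in the workflow (a sketch; only effective on our non-Kerberized cluster, and the wf:user() EL expression is one way to avoid hard-coding the user):

```xml
<env-var>HADOOP_USER_NAME=${wf:user()}</env-var>
```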
Cheers,
Miles