
MapReduceIndexerTool not indexing inside oozie shell action


Hi everyone,

I am trying to index some data from HDFS into Solr, working in the Quickstart Docker container.

If I launch the following script from the container command line, everything works as expected and the data is correctly ingested:

#!/bin/bash

echo "start"

hadoop jar /usr/lib/solr/contrib/mr/search-mr-1.0.0-cdh5.7.0-job.jar  \
org.apache.solr.hadoop.MapReduceIndexerTool -D 'mapred.child.java.opts=-Xmx500m' \
--morphline-file morphline1.conf \
--output-dir hdfs://quickstart:8020/user/cloudera/solr_out \
--verbose \
--go-live \
--zk-host quickstart:2181/solr \
--collection my_collection \
--log4j log4j.properties \
hdfs://quickstart:8020/user/hive/warehouse/my_table_rep

echo "stop"

 

If I run the same script from an Oozie shell action, the job fails, and the only logs I can see on stdout are the following (not very specific) ones:

2019-02-11 09:13:32,907 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1356)) - Running job: job_1549647336050_0011
2019-02-11 09:13:46,386 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1377)) - Job job_1549647336050_0011 running in uber mode : false
2019-02-11 09:13:46,388 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1384)) -  map 0% reduce 0%
2019-02-11 09:13:46,537 INFO  [main] mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(276)) - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2019-02-11 09:13:46,607 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1397)) - Job job_1549647336050_0011 failed with state FAILED due to: 
2019-02-11 09:13:46,626 ERROR [main] hadoop.MapReduceIndexerTool (MapReduceIndexerTool.java:waitForCompletion(1436)) - Job failed! jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1549647336050_0011
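
The client output above ends at "failed with state FAILED due to:" without giving a reason, so the actual error is most likely in the YARN container logs rather than on stdout. Here is a minimal sketch for pulling them, assuming that YARN log aggregation is enabled, that the `yarn` CLI is on the path, and that the application ID is the job ID with the `job_` prefix replaced by `application_` (the usual Hadoop convention):

```shell
#!/bin/bash
# Job ID copied from the log output above.
job_id="job_1549647336050_0011"

# Assumption: the YARN application ID shares the numeric suffix of the MapReduce job ID.
app_id="application_${job_id#job_}"
echo "$app_id"

# Fetch the aggregated container logs, but only if the yarn CLI is available on this host.
if command -v yarn >/dev/null 2>&1; then
    yarn logs -applicationId "$app_id"
fi
```

Inside an Oozie shell action the script runs in a YARN container, possibly as a different user and with a different environment than an interactive shell, so the output from the failed map tasks is the first place I would look.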

Here is the workflow XML:

<workflow-app name="solr_try" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-b9b9"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell-b9b9">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.action.max.output.data</name>
                    <value>1000000</value>
                </property>
            </configuration>
            <exec>/user/cloudera/my_scripts/try.sh</exec>
            <argument>try</argument>
            <file>/user/cloudera/my_scripts/try.sh#try.sh</file>
            <file>/user/cloudera/my_conf/morphline1.conf#morphline1.conf</file>
            <file>/user/cloudera/my_conf/log4j.properties#log4j.properties</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

 

I would like to better understand what is going on. Can someone help me?

Thanks in advance,

Andrea