MapReduceIndexerTool not indexing inside oozie shell action

Hi everyone,

I am trying to index data from HDFS into Solr, working in the Quickstart Docker container.

If I launch the following script from the container's command line, everything works as expected and the data is ingested correctly:

#!/bin/bash

echo "start"

hadoop jar /usr/lib/solr/contrib/mr/search-mr-1.0.0-cdh5.7.0-job.jar  \
org.apache.solr.hadoop.MapReduceIndexerTool -D 'mapred.child.java.opts=-Xmx500m' \
--morphline-file morphline1.conf \
--output-dir hdfs://quickstart:8020/user/cloudera/solr_out \
--verbose \
--go-live \
--zk-host quickstart:2181/solr \
--collection my_collection \
--log4j log4j.properties \
hdfs://quickstart:8020/user/hive/warehouse/my_table_rep

echo "stop"

If I put this script into an Oozie shell action, the job fails, and the only logs I can see on stdout are the following (not very specific) ones:

2019-02-11 09:13:32,907 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1356)) - Running job: job_1549647336050_0011
2019-02-11 09:13:46,386 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1377)) - Job job_1549647336050_0011 running in uber mode : false
2019-02-11 09:13:46,388 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1384)) -  map 0% reduce 0%
2019-02-11 09:13:46,537 INFO  [main] mapred.ClientServiceDelegate (ClientServiceDelegate.java:getProxy(276)) - Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2019-02-11 09:13:46,607 INFO  [main] mapreduce.Job (Job.java:monitorAndPrintJob(1397)) - Job job_1549647336050_0011 failed with state FAILED due to: 
2019-02-11 09:13:46,626 ERROR [main] hadoop.MapReduceIndexerTool (MapReduceIndexerTool.java:waitForCompletion(1436)) - Job failed! jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1549647336050_0011
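
The output above is only the client-side view of the job. A possible way to get at the task-level errors is to pull the YARN container logs for the failed job. The sketch below assumes that log aggregation is enabled in the container and that the `yarn` CLI is on the PATH (neither is verified here). The helper name `app_id_from_job_id` is made up for illustration; it relies only on the fact that a MapReduce job id and its YARN application id share the same numeric suffix.

```shell
#!/bin/bash
# Hypothetical helper: map a MapReduce job id (job_<cluster-ts>_<seq>)
# to the matching YARN application id (application_<cluster-ts>_<seq>).
app_id_from_job_id() {
  local job_id="$1"
  echo "application_${job_id#job_}"
}

# Job id taken from the log output above.
app_id="$(app_id_from_job_id job_1549647336050_0011)"
echo "$app_id"

# Fetch the aggregated container logs, but only if the yarn CLI is available.
# Assumption: log aggregation is enabled; otherwise this prints nothing useful.
if command -v yarn >/dev/null 2>&1; then
  yarn logs -applicationId "$app_id" || echo "could not fetch logs for $app_id" >&2
fi
```

The stderr and syslog sections of the failed map task in that output are usually where the actual exception shows up.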

Here's the workflow XML:

<workflow-app name="solr_try" xmlns="uri:oozie:workflow:0.5">
    <start to="shell-b9b9"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="shell-b9b9">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.action.max.output.data</name>
                    <value>1000000</value>
                </property>
            </configuration>
            <exec>/user/cloudera/my_scripts/try.sh</exec>
            <argument>try</argument>
            <file>/user/cloudera/my_scripts/try.sh#try.sh</file>
            <file>/user/cloudera/my_conf/morphline1.conf#morphline1.conf</file>
            <file>/user/cloudera/my_conf/log4j.properties#log4j.properties</file>
            <capture-output/>
        </shell>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>

 

I would like to better understand what is going on. Can someone help me?

Thanks in advance,

Andrea