[FALCON] Pipeline is failing

New Contributor

Hi guys,

I am having trouble with this Falcon pipeline.

The feeds are running well, but the process never finishes, and while it runs it consumes too much Yarn memory. That is odd, because the process only runs a simple Pig script.

Feeds xml:

<feed xmlns='uri:falcon:feed:0.1' name='input' description='input'>
  <tags>input=input</tags>
  <groups>input</groups>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name='primaryCluster' type='source'>
      <validity start='2016-06-07T17:00Z' end='2016-06-08T16:35Z'/>
      <retention limit='months(1)' action='delete'/>
      <locations>
        <location type='data'>
        </location>
        <location type='stats'>
        </location>
        <location type='meta'>
        </location>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type='data' path='/user/ambari-qa/falcon/input/${YEAR}-${MONTH}-${DAY}-${HOUR}'>
    </location>
    <location type='stats' path='/'>
    </location>
    <location type='meta' path='/'>
    </location>
  </locations>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
  <schema location='/none' provider='/none'/>
  <properties>
    <property name='jobPriority' value='NORMAL'>
    </property>
    <property name='timeout' value='minutes(3)'>
    </property>
  </properties>
</feed>
<feed xmlns='uri:falcon:feed:0.1' name='output' description='output'>
  <tags>output=output</tags>
  <groups>output</groups>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name='primaryCluster' type='source'>
      <validity start='2016-06-07T17:03Z' end='2016-06-08T16:39Z'/>
      <retention limit='months(1)' action='delete'/>
      <locations>
        <location type='data'>
        </location>
        <location type='stats'>
        </location>
        <location type='meta'>
        </location>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type='data' path='/user/ambari-qa/falcon/filtered/${YEAR}-${MONTH}-${DAY}-${HOUR}'>
    </location>
    <location type='stats' path='/'>
    </location>
    <location type='meta' path='/'>
    </location>
  </locations>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
  <schema location='/none' provider='/none'/>
  <properties>
    <property name='jobPriority' value='NORMAL'>
    </property>
    <property name='timeout' value='minutes(3)'>
    </property>
  </properties>
</feed>

Process xml:

<process xmlns='uri:falcon:process:0.1' name='process'>
  <tags>process=process</tags>
  <clusters>
    <cluster name='primaryCluster'>
      <validity start='2016-06-07T17:05Z' end='2016-06-08T17:16Z'/>
    </cluster>
  </clusters>
  <parallel>1</parallel>
  <order>FIFO</order>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <inputs>
    <input name='input' feed='input' start='now(0,0)' end='now(0,0)'>
    </input>
  </inputs>
  <outputs>
    <output name='output' feed='output' instance='now(0,0)'>
    </output>
  </outputs>
  <workflow engine='pig' path='/user/ambari-qa/falcon/pig/ufForn.pig'/>
  <retry policy='periodic' delay='minutes(30)' attempts='3'/>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
</process>

Pig script:

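-- Load semicolon-delimited records from the input feed path.
A = load '$input' using PigStorage(';');
-- Keep rows whose third field ($2) is not the string '-9'.
B = filter A by (chararray) $2 != '-9';
-- Write the filtered rows to the output feed path.
store B into '$output' using PigStorage(';');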

Does someone have an idea of what could be happening? Could it be a configuration problem?

3 REPLIES

Re: [FALCON] Pipeline is failing

Rising Star

@Misael Castro : If the process is not ending, here are a few things you should check.

1. What state is the process instance in, according to Oozie? If it is "WAITING", you probably do not have sufficient containers/memory in Yarn to run the job. Please configure Yarn accordingly.

2. If the process instance is in RUNNING state, please look into Yarn logs and see why the Pig script is not completing.

Let me know if it is 1 or 2, and I can try to help you with configuration.
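
The two cases above can also be told apart from the Yarn side. Below is a minimal sketch, not a definitive tool, that lists applications through the ResourceManager REST endpoint `/ws/v1/cluster/apps` and picks out those still in the ACCEPTED state (accepted by the scheduler but not yet running). The host name `resourcemanager.example`, the port 8088, and the function names are assumptions for illustration; adjust them to your cluster.

```python
import json
import urllib.request

# Assumption: hypothetical ResourceManager address; 8088 is the usual web port.
RM_URL = "http://resourcemanager.example:8088"


def fetch_apps(rm_url=RM_URL, states="ACCEPTED,RUNNING"):
    """Fetch applications in the given states from the RM REST API."""
    url = f"{rm_url}/ws/v1/cluster/apps?states={states}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_apps(payload):
    """Return (id, state, name) tuples from a /ws/v1/cluster/apps response."""
    # The "apps" value can be null when no application matches the filter.
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a["id"], a["state"], a.get("name", "")) for a in apps]


def stuck_in_accepted(payload):
    """IDs of applications that were accepted but have not started running."""
    return [
        app_id
        for app_id, state, _ in summarize_apps(payload)
        if state == "ACCEPTED"
    ]
```

Calling `stuck_in_accepted(fetch_apps())` against a live cluster returns the IDs of jobs that are queued behind a resource limit. An application that stays in that list for a long time points towards case 1 rather than a slow script.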

Re: [FALCON] Pipeline is failing

New Contributor

Balu: Thank you for your reply. The process instance is in the RUNNING state. I looked at some logs and found the following.

yarn-yarn-nodemanager-splendait.bigdata.log:

2016-06-08 15:01:14,243 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/falcon/pig/ufForn.pig(->/hadoop/yarn/local/usercache/ambari-qa/filecache/15/ufForn.pig) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,270 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.split(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/10/job.split) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,294 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.xml(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/11/job.xml) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,319 INFO  localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.splitmetainfo(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/12/job.splitmetainfo) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,320 INFO  container.ContainerImpl (ContainerImpl.java:handle(1131)) - Container container_e34_1465406660947_0005_01_000001 transitioned from LOCALIZING to LOCALIZED
2016-06-08 15:01:14,347 INFO  container.ContainerImpl (ContainerImpl.java:handle(1131)) - Container container_e34_1465406660947_0005_01_000001 transitioned from LOCALIZED to RUNNING
2016-06-08 15:01:14,351 INFO  nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:buildCommandExecutor(268)) - launchContainer: [bash, /hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/container_e34_1465406660947_0005_01_000001/default_container_executor.sh]
2016-06-08 15:01:15,153 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(375)) - Starting resource-monitoring for container_e34_1465406660947_0005_01_000001
2016-06-08 15:01:15,180 INFO  monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 30203 for container-id container_e34_1465406660947_0005_01_000001: 75.6 MB of 1 GB physical memory used; 2.5 GB of 2.1 GB virtual memory used

yarn-yarn-resourcemanager-splendait.bigdata.log:

2016-06-08 15:01:32,989 INFO  resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(290)) - Allocated new applicationId: 6
2016-06-08 15:01:34,826 INFO  resourcemanager.ClientRMService (ClientRMService.java:submitApplication(585)) - Application with id 6 submitted by user ambari-qa
2016-06-08 15:01:34,827 INFO  resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(170)) - USER=ambari-qa	IP=192.168.0.118	OPERATION=Submit Application Request	TARGET=ClientRMService	RESULT=SUCCESS	APPID=application_1465406660947_0006	CALLERCONTEXT=PIG-ufForn.pig-ad53b9bc-2eea-422b-bd62-d8c3056f28c0
2016-06-08 15:01:34,827 INFO  rmapp.RMAppImpl (RMAppImpl.java:transition(1033)) - Storing application with id application_1465406660947_0006
2016-06-08 15:01:34,828 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from NEW to NEW_SAVING
2016-06-08 15:01:34,828 INFO  recovery.RMStateStore (RMStateStore.java:transition(195)) - Storing info for app: application_1465406660947_0006
2016-06-08 15:01:34,855 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from NEW_SAVING to SUBMITTED
2016-06-08 15:01:34,855 INFO  capacity.ParentQueue (ParentQueue.java:addApplication(347)) - Application added - appId: application_1465406660947_0006 user: ambari-qa leaf-queue of parent: root #applications: 2
2016-06-08 15:01:34,855 INFO  capacity.CapacityScheduler (CapacityScheduler.java:addApplication(828)) - Accepted application application_1465406660947_0006 from user: ambari-qa, in queue: default
2016-06-08 15:01:34,855 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from SUBMITTED to ACCEPTED
2016-06-08 15:01:34,856 INFO  resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerAppAttempt(678)) - Registering app attempt : appattempt_1465406660947_0006_000001
2016-06-08 15:01:34,856 INFO  attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(791)) - appattempt_1465406660947_0006_000001 State change from NEW to SUBMITTED
2016-06-08 15:01:34,856 INFO  capacity.LeafQueue (LeafQueue.java:activateApplications(629)) - not starting application as amIfStarted exceeds amLimit
2016-06-08 15:01:34,856 INFO  capacity.LeafQueue (LeafQueue.java:addApplicationAttempt(678)) - Application added - appId: application_1465406660947_0006 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@53b0b319, leaf-queue: default #user-pending-applications: 1 #user-active-applications: 1 #queue-pending-applications: 1 #queue-active-applications: 1
2016-06-08 15:01:34,856 INFO  capacity.CapacityScheduler (CapacityScheduler.java:addApplicationAttempt(857)) - Added Application Attempt appattempt_1465406660947_0006_000001 to scheduler from user ambari-qa in queue default
2016-06-08 15:01:34,857 INFO  attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(791)) - appattempt_1465406660947_0006_000001 State change from SUBMITTED to SCHEDULED
2016-06-08 15:02:16,439 WARN  timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:putMetrics(212)) - Unable to send metrics to collector by address:http://splendait.bigdata:6188/ws/v1/timeline/metrics

It looks like a memory issue. Do I need to increase the Yarn container memory?
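
In the ResourceManager log quoted above, application_1465406660947_0006 (the Pig job, judging by `CALLERCONTEXT=PIG-ufForn.pig`) reaches ACCEPTED, and the LeafQueue then logs "not starting application as amIfStarted exceeds amLimit". That message comes from the CapacityScheduler check on how much of a queue may be used by ApplicationMasters, which is governed by `yarn.scheduler.capacity.maximum-am-resource-percent`. Whether raising that limit or giving the NodeManager more memory is the right fix here is an assumption that needs checking on the cluster. The sketch below is one way to pull these signals out of longer logs. The regular expressions are written against the lines quoted in this thread only, and the function name `scan` is illustrative.

```python
import re

# Patterns modelled on the log lines quoted in this thread.
AM_LIMIT = re.compile(r"not starting application as amIfStarted exceeds amLimit")
STATE_CHANGE = re.compile(
    r"(application_\d+_\d+) State change from (\w+) to (\w+)"
)
MEMORY = re.compile(
    r"(container_\S+): ([\d.]+ [KMG]B) of ([\d.]+ [KMG]B) physical memory used; "
    r"([\d.]+ [KMG]B) of ([\d.]+ [KMG]B) virtual memory used"
)


def scan(lines):
    """Collect last known app states, AM-limit hits, and container memory usage."""
    report = {"states": {}, "am_limit_hits": 0, "memory": []}
    for line in lines:
        if AM_LIMIT.search(line):
            report["am_limit_hits"] += 1
        match = STATE_CHANGE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last state wins.
            report["states"][match.group(1)] = match.group(3)
        match = MEMORY.search(line)
        if match:
            report["memory"].append(match.groups())
    return report
```

To use it, pass an open log file, for example `scan(open("yarn-yarn-resourcemanager-splendait.bigdata.log"))`. An application whose last state is ACCEPTED together with a non-zero `am_limit_hits` suggests the job is waiting on the queue's ApplicationMaster limit, not that the Pig script itself is slow.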

Re: [FALCON] Pipeline is failing

New Contributor

Caused by: E1310 : E1310: Bundle Job submission Error

Hi Balu,

I tried the dataset-mirroring example at http://saptak.in/writing/2015/08/11/mirroring-datasets-hadoop-clusters-apache-falcon/, but it still gives the same error. Please look at the attached error log, falconerrorlog.txt, and help me out.

Thanks in advance.

capture1.png (6.6 kB) falconerrorlog.txt (267.6 kB)
