Created 06-08-2016 12:50 PM
Hi guys,
I am having trouble with this Falcon pipeline.
The feeds run fine, but the process never finishes, and while it is running it consumes too much YARN memory. That is odd, because the process only runs a simple Pig script.
Feeds xml:
<feed xmlns='uri:falcon:feed:0.1' name='input' description='input'>
  <tags>input=input</tags>
  <groups>input</groups>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name='primaryCluster' type='source'>
      <validity start='2016-06-07T17:00Z' end='2016-06-08T16:35Z'/>
      <retention limit='months(1)' action='delete'/>
      <locations>
        <location type='data'> </location>
        <location type='stats'> </location>
        <location type='meta'> </location>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type='data' path='/user/ambari-qa/falcon/input/${YEAR}-${MONTH}-${DAY}-${HOUR}'> </location>
    <location type='stats' path='/'> </location>
    <location type='meta' path='/'> </location>
  </locations>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
  <schema location='/none' provider='/none'/>
  <properties>
    <property name='jobPriority' value='NORMAL'> </property>
    <property name='timeout' value='minutes(3)'> </property>
  </properties>
</feed>
<feed xmlns='uri:falcon:feed:0.1' name='output' description='output'>
  <tags>output=output</tags>
  <groups>output</groups>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name='primaryCluster' type='source'>
      <validity start='2016-06-07T17:03Z' end='2016-06-08T16:39Z'/>
      <retention limit='months(1)' action='delete'/>
      <locations>
        <location type='data'> </location>
        <location type='stats'> </location>
        <location type='meta'> </location>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type='data' path='/user/ambari-qa/falcon/filtered/${YEAR}-${MONTH}-${DAY}-${HOUR}'> </location>
    <location type='stats' path='/'> </location>
    <location type='meta' path='/'> </location>
  </locations>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
  <schema location='/none' provider='/none'/>
  <properties>
    <property name='jobPriority' value='NORMAL'> </property>
    <property name='timeout' value='minutes(3)'> </property>
  </properties>
</feed>
Process xml:
<process xmlns='uri:falcon:process:0.1' name='process'>
  <tags>process=process</tags>
  <clusters>
    <cluster name='primaryCluster'>
      <validity start='2016-06-07T17:05Z' end='2016-06-08T17:16Z'/>
    </cluster>
  </clusters>
  <parallel>1</parallel>
  <order>FIFO</order>
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <inputs>
    <input name='input' feed='input' start='now(0,0)' end='now(0,0)'> </input>
  </inputs>
  <outputs>
    <output name='output' feed='output' instance='now(0,0)'> </output>
  </outputs>
  <workflow engine='pig' path='/user/ambari-qa/falcon/pig/ufForn.pig'/>
  <retry policy='periodic' delay='minutes(30)' attempts='3'/>
  <ACL owner='ambari-qa' group='users' permission='0755'/>
</process>
Pig script:
A = load '$input' using PigStorage(';');
B = filter A by (chararray) $2 != '-9';
store B into '$output' USING PigStorage(';');
Does someone have an idea of what could be happening? Could it be a configuration problem?
Created 06-08-2016 02:57 PM
@Misael Castro : If the process is not ending, here are a few things you should check.
1. What state is the process instance in, according to Oozie? If it is "WAITING", this is probably because you do not have sufficient containers/memory in YARN to run the job. Please configure YARN accordingly.
2. If the process instance is in the RUNNING state, please look into the YARN logs and see why the Pig script is not completing.
Let me know if it is 1 or 2, and I can try to help you with configuration.
Created 06-08-2016 06:52 PM
Balu : Thank you for your reply. The process instance is in the RUNNING state. I looked at some logs and found the following.
yarn-yarn-nodemanager-splendait.bigdata.log:
2016-06-08 15:01:14,243 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/falcon/pig/ufForn.pig(->/hadoop/yarn/local/usercache/ambari-qa/filecache/15/ufForn.pig) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,243 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/falcon/pig/ufForn.pig(->/hadoop/yarn/local/usercache/ambari-qa/filecache/15/ufForn.pig) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,270 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.split(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/10/job.split) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,294 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.xml(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/11/job.xml) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,319 INFO localizer.LocalizedResource (LocalizedResource.java:handle(203)) - Resource hdfs://splendait.bigdata:8020/user/ambari-qa/.staging/job_1465406660947_0005/job.splitmetainfo(->/hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/filecache/12/job.splitmetainfo) transitioned from DOWNLOADING to LOCALIZED
2016-06-08 15:01:14,320 INFO container.ContainerImpl (ContainerImpl.java:handle(1131)) - Container container_e34_1465406660947_0005_01_000001 transitioned from LOCALIZING to LOCALIZED
2016-06-08 15:01:14,347 INFO container.ContainerImpl (ContainerImpl.java:handle(1131)) - Container container_e34_1465406660947_0005_01_000001 transitioned from LOCALIZED to RUNNING
2016-06-08 15:01:14,351 INFO nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:buildCommandExecutor(268)) - launchContainer: [bash, /hadoop/yarn/local/usercache/ambari-qa/appcache/application_1465406660947_0005/container_e34_1465406660947_0005_01_000001/default_container_executor.sh]
2016-06-08 15:01:15,153 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(375)) - Starting resource-monitoring for container_e34_1465406660947_0005_01_000001
2016-06-08 15:01:15,180 INFO monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(464)) - Memory usage of ProcessTree 30203 for container-id container_e34_1465406660947_0005_01_000001: 75.6 MB of 1 GB physical memory used; 2.5 GB of 2.1 GB virtual memory used
yarn-yarn-resourcemanager-splendait.bigdata.log:
2016-06-08 15:01:32,989 INFO resourcemanager.ClientRMService (ClientRMService.java:getNewApplicationId(290)) - Allocated new applicationId: 6
2016-06-08 15:01:34,826 INFO resourcemanager.ClientRMService (ClientRMService.java:submitApplication(585)) - Application with id 6 submitted by user ambari-qa
2016-06-08 15:01:34,827 INFO resourcemanager.RMAuditLogger (RMAuditLogger.java:logSuccess(170)) - USER=ambari-qa IP=192.168.0.118 OPERATION=Submit Application Request TARGET=ClientRMService RESULT=SUCCESS APPID=application_1465406660947_0006 CALLERCONTEXT=PIG-ufForn.pig-ad53b9bc-2eea-422b-bd62-d8c3056f28c0
2016-06-08 15:01:34,827 INFO rmapp.RMAppImpl (RMAppImpl.java:transition(1033)) - Storing application with id application_1465406660947_0006
2016-06-08 15:01:34,828 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from NEW to NEW_SAVING
2016-06-08 15:01:34,828 INFO recovery.RMStateStore (RMStateStore.java:transition(195)) - Storing info for app: application_1465406660947_0006
2016-06-08 15:01:34,855 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from NEW_SAVING to SUBMITTED
2016-06-08 15:01:34,855 INFO capacity.ParentQueue (ParentQueue.java:addApplication(347)) - Application added - appId: application_1465406660947_0006 user: ambari-qa leaf-queue of parent: root #applications: 2
2016-06-08 15:01:34,855 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplication(828)) - Accepted application application_1465406660947_0006 from user: ambari-qa, in queue: default
2016-06-08 15:01:34,855 INFO rmapp.RMAppImpl (RMAppImpl.java:handle(764)) - application_1465406660947_0006 State change from SUBMITTED to ACCEPTED
2016-06-08 15:01:34,856 INFO resourcemanager.ApplicationMasterService (ApplicationMasterService.java:registerAppAttempt(678)) - Registering app attempt : appattempt_1465406660947_0006_000001
2016-06-08 15:01:34,856 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(791)) - appattempt_1465406660947_0006_000001 State change from NEW to SUBMITTED
2016-06-08 15:01:34,856 INFO capacity.LeafQueue (LeafQueue.java:activateApplications(629)) - not starting application as amIfStarted exceeds amLimit
2016-06-08 15:01:34,856 INFO capacity.LeafQueue (LeafQueue.java:addApplicationAttempt(678)) - Application added - appId: application_1465406660947_0006 user: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue$User@53b0b319, leaf-queue: default #user-pending-applications: 1 #user-active-applications: 1 #queue-pending-applications: 1 #queue-active-applications: 1
2016-06-08 15:01:34,856 INFO capacity.CapacityScheduler (CapacityScheduler.java:addApplicationAttempt(857)) - Added Application Attempt appattempt_1465406660947_0006_000001 to scheduler from user ambari-qa in queue default
2016-06-08 15:01:34,857 INFO attempt.RMAppAttemptImpl (RMAppAttemptImpl.java:handle(791)) - appattempt_1465406660947_0006_000001 State change from SUBMITTED to SCHEDULED
2016-06-08 15:02:16,439 WARN timeline.HadoopTimelineMetricsSink (HadoopTimelineMetricsSink.java:putMetrics(212)) - Unable to send metrics to collector by address:http://splendait.bigdata:6188/ws/v1/timeline/metrics
It looks like a memory issue. Do I need to increase the YARN container memory?
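Editor's note: one possible reading of the ResourceManager log is that application 0006 (the Pig job) stays pending because of the Capacity Scheduler line `not starting application as amIfStarted exceeds amLimit`, while application 0005 already holds an ApplicationMaster container in the same queue. The setting usually involved is `yarn.scheduler.capacity.maximum-am-resource-percent` (default 0.1). The Python sketch below only illustrates that arithmetic in a simplified form. The queue size, the second AM's size, and the function names are hypothetical and are not taken from this cluster; only the 1 GB figure for the running container appears in the NodeManager log.

```python
# Simplified illustration of the Capacity Scheduler's AM-limit check.
# This is NOT the scheduler's real code; names and most numbers are hypothetical.

def am_limit_mb(queue_memory_mb, max_am_resource_percent):
    """Memory (MB) that a queue may devote to ApplicationMasters."""
    return queue_memory_mb * max_am_resource_percent


def can_start_am(used_am_mb, new_am_mb, limit_mb):
    """A pending application is activated only if its AM fits under the limit."""
    return used_am_mb + new_am_mb <= limit_mb


QUEUE_MEMORY_MB = 4096   # hypothetical queue capacity
RUNNING_AM_MB = 1024     # the container in the NodeManager log has a 1 GB limit
PENDING_AM_MB = 1024     # hypothetical size of the pending application's AM

# Compare the default percent (0.1) with a hypothetical larger value (0.5).
for percent in (0.1, 0.5):
    limit = am_limit_mb(QUEUE_MEMORY_MB, percent)
    ok = can_start_am(RUNNING_AM_MB, PENDING_AM_MB, limit)
    print(f"percent={percent}: AM limit {limit:.1f} MB, second AM can start: {ok}")
```

If this reading applies, raising the AM resource percent for the queue or giving YARN more memory would be the settings to look at first. Whether that is the actual cause here would need to be confirmed against the cluster's own configuration.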
Created on 06-29-2016 06:09 AM - edited 08-19-2019 02:57 AM
Hi Balu,
I have tried the dataset mirroring example at http://saptak.in/writing/2015/08/11/mirroring-datasets-hadoop-clusters-apache-falcon/, and it still gives the same issue. Please look at the error log falconerrorlog.txt and help me out.
Thanks in advance.
capture1.png (6.6 kB) falconerrorlog.txt (267.6 kB)