Created 05-12-2016 11:20 AM
Team:
I have noticed an annoying behavior in Falcon. In HDP 2.2, Falcon retried a failed job 10 times, but in the current version the job simply fails after 1 try.
So I want to know whether this is a change in HDP 2.3 or a default behavior that we can change.
Thanks in advance.
Created 05-30-2016 11:29 AM
I found the solution for this issue. Before the upgrade, the value of "oozie.wf.rerun.failnodes" was "false". After the upgrade to HDP-2.3.4, the value of "oozie.wf.rerun.failnodes" is "true", so only the failed action nodes of the Oozie workflow instance are run, which prevents successful actions from being rerun in Oozie.
To fix this, set the following property in the properties section of the Process entity: <property name="oozie.wf.rerun.failnodes" value="false"/>
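For anyone unsure where this goes, here is a rough sketch of a Process entity that combines the property with the retry policy discussed in this thread. The entity name and workflow path are hypothetical placeholders, the other required elements are left out, and the element order (properties, then workflow, then retry) is my reading of the Falcon entity specification, so check it against your own version:

```xml
<!-- Sketch only: "example-process" and the workflow path are hypothetical. -->
<process name="example-process" xmlns="uri:falcon:process:0.1">
    <!-- clusters, frequency, inputs, outputs, etc. omitted -->

    <properties>
        <!-- The setting reported above as the fix after the HDP-2.3.4 upgrade -->
        <property name="oozie.wf.rerun.failnodes" value="false"/>
    </properties>

    <workflow engine="oozie" path="/apps/example/workflow.xml"/>

    <!-- Retry policy as quoted in this thread: 10 attempts, 30 minutes apart -->
    <retry policy="periodic" delay="minutes(30)" attempts="10"/>
</process>
```

After changing the entity definition, the process has to be updated in Falcon for the new property to apply to later instances.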
Created 05-12-2016 11:46 AM
Falcon has the following parameter that can be set, the retry policy:
<retry policy="exp-backoff" delay="hours(1)" attempts="1"/>
https://falcon.apache.org/EntitySpecification.html
Search for Retry
Created 05-12-2016 12:05 PM
@Benjamin Leonhardi: I have the following parameter in my proces.xml:
<retry policy="periodic" delay="minutes(30)" attempts="10"/>
and the following error in the logs, but I don't see anywhere that it is retrying 10 times.
2016-05-07 12:12:25,037 INFO - [RetryHandler:] ~ {Action:retry-instance-failed, Dimensions:{run-id=0, wf-id=0013624-160421060930490-oozie-oozi-W, nominal-name=2016-05-07T15:30Z, wf-user=hdpbatch, entity-type=PROCESS, error-message=Rerun file deleted or renamed for process-instance:, entity-name=hdp0186h-sitecatalyst-kpis-generation-android-events-hourly-process}, Status: SUCCEEDED, Time-taken:4952 ns} (METRIC:38)
2016-05-07 12:12:25,037
Created 05-12-2016 06:48 PM
As @Benjamin Leonhardi specified, Falcon should honor the retry attempts in the retry policy. If it's not working as expected, please create a support issue. Thanks!
Created 05-13-2016 05:47 AM
@Sowmya Ramesh: Yes, I have opened a case and am working with the HW team.
Created 05-12-2016 12:08 PM
I had that in what I sent you and it seemed to work. I would open a support case.
Created 05-13-2016 05:48 AM
Thanks @Benjamin Leonhardi