We are seeing some of our Falcon jobs fail when the NameNodes fail over. We might have 25 jobs running, and any that are in their finishing stages at the moment the failover happens will fail. We don't seem to have this problem with our streaming jobs. Any guidance on why this is happening, what we should check, and what changes we could make on the platform or application side would be appreciated.