Oozie "File Action" (move files) fails when files ...
Hi all, I have this File Action (part of an Oozie Workflow) that runs every 15 minutes, and moves all the files in a "receiving" HDFS directory, putting them into an "in_process" HDFS directory.
Everything works unless, for whatever reason, the number of files in the "receiving" HDFS directory grows too large. Once it exceeds roughly 30,000 (an approximate figure, not an exact threshold), the File Action fails without any meaningful error in the logs.
I'd need help to sort out a few possible options:
- Is it possible to manually specify the maximum number of files a File Action can handle? E.g. by allocating more resources through some parameter, or by setting an explicit limit somewhere?
- Would a Shell Action be a better choice instead? I'm a bit hesitant: the "live" files are being copied into the "receiving" HDFS directory at a very high rate, and I'm afraid the Shell Action might not be able to keep up with the changes inside the directory.
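To make the Shell Action option more concrete, here is a minimal sketch of a batched move, not a tested solution. It assumes the `hdfs` CLI is on the PATH of the node running the action, that Python 3 is available there, and that `hdfs dfs -ls -C` (paths-only listing) is supported by the installed Hadoop version. The names `BATCH_SIZE`, `batches`, `build_mv_command`, `list_paths` and `move_snapshot` are hypothetical. The idea is to list the directory once and move only that snapshot in fixed-size chunks, so files arriving during the run are simply left for the next 15-minute run.

```python
import subprocess
from typing import Iterator, List, Sequence

# Hypothetical batch size; tune it so each command line stays reasonably short.
BATCH_SIZE = 500


def batches(paths: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` paths."""
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


def build_mv_command(batch: Sequence[str], dest_dir: str) -> List[str]:
    """Build one `hdfs dfs -mv src1 src2 ... dest` command for a batch."""
    return ["hdfs", "dfs", "-mv", *batch, dest_dir]


def list_paths(src_dir: str) -> List[str]:
    """Take a one-time snapshot of the entries currently in src_dir.

    Assumes `-ls -C` is available; it prints one path per line and
    includes subdirectories as well as files.
    """
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-C", src_dir],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def move_snapshot(src_dir: str, dest_dir: str, size: int = BATCH_SIZE) -> int:
    """Move the snapshot in batches; entries that arrive later are not touched."""
    snapshot = list_paths(src_dir)
    for batch in batches(snapshot, size):
        subprocess.run(build_mv_command(batch, dest_dir), check=True)
    return len(snapshot)
```

Calling `move_snapshot("receiving", "in_process")` with the real HDFS paths would perform one run. Because only the initial snapshot is moved, the script does not need to keep up with new arrivals. Whether this avoids the failure seen with the File Action is an open question.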
Looking forward to receiving your insights, thanks a lot!