Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant


@dvohra wrote:

This isn't true. Depending on what you're doing with Oozie, S3 is supported just fine as an input or output location.

 

Doesn't the coordinator expect the input path to be on HDFS, since hdfs://{nameNode} is prepended automatically? And isn't the workflow.xml required to be on HDFS as well?


Yes, unfortunately coordinators currently poll their inputs over HDFS only, which is a limitation. However, writing simple WF actions that work over S3 is still possible.

 

Yes, WFs should reside on HDFS, as Oozie treats it as its central DFS, similar to how MR requires a proper DFS to run. But this shouldn't impair simple I/O operations done over an external FS such as S3.
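
To illustrate, here is a minimal sketch of a workflow whose single action writes to S3 while the workflow.xml itself stays on HDFS. Everything specific in it is an assumption for the example rather than something from this thread: the bucket name, the HDFS input path, the choice of a DistCp action, the s3a:// scheme (older clusters may need s3n:// instead), and the schema versions. It also assumes S3 credentials are already configured for the cluster.

```xml
<!-- Hypothetical workflow.xml, deployed to HDFS as usual. -->
<workflow-app name="s3-copy-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="copy-to-s3"/>

    <action name="copy-to-s3">
        <distcp xmlns="uri:oozie:distcp-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Source: a path on the cluster's HDFS (example path) -->
            <arg>${nameNode}/user/example/input</arg>
            <!-- Destination: an external S3 location (example bucket) -->
            <arg>s3a://example-bucket/output</arg>
        </distcp>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Copy failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Note that the S3 path is spelled out with its full scheme inside the action arguments. It is not declared as a coordinator dataset, since that is where the HDFS-only polling limitation applies.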

 

I think Romain has covered the relevant JIRAs for tracking removal of this limitation.
