Explorer
Posts: 18
Registered: ‎08-13-2013
Accepted Solution

Problem with Oozie input path: how do I configure Oozie in Cloudera Manager with an s3n input path?

My input path is on s3n:

s3n://xxx-xxx/20130813/08

But my Oozie configuration shows it as:

hdfs://xxx.internal:8020/s3n://xxx-xxx/20130813/08

Expert Contributor
Posts: 63
Registered: ‎08-06-2013

Explorer
Posts: 18
Registered: ‎08-13-2013

Sorry, to clarify: I'm running the Oozie job through Hue in Cloudera Manager, and I am able to access HDFS. My question is how to connect to another instance, on Amazon, addressed as s3n://xxx.

Expert Contributor
Posts: 63
Registered: ‎08-06-2013

The input path is required to be on HDFS, not S3. S3 is not the same as HDFS.

Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

@dvohra wrote:

The input path is required to be on HDFS, not S3. S3 is not the same as HDFS.


This isn't true. Depending on what you're doing with Oozie, S3 is supported just fine as an input or output location.

Posts: 1,896
Kudos: 433
Solutions: 303
Registered: ‎07-31-2013

@Ashok wrote:

My input path is on s3n:

s3n://xxx-xxx/20130813/08

But my Oozie configuration shows it as:

hdfs://xxx.internal:8020/s3n://xxx-xxx/20130813/08


Can you share your workflow.xml for us to validate?

 

If you're passing an S3 input or output path, make sure your workflow does not template it as ${nameNode}/${input} or something similar. Doing so prepends an HDFS URI to a path that is already a complete URI, which is most likely your issue.
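
To illustrate the point, here is a minimal sketch of a workflow.xml that passes the input path through untouched. It is not the poster's actual workflow: the workflow name, the action and node names, the `outputDir` parameter, and the use of a map-reduce action with `mapred.input.dir` / `mapred.output.dir` are all assumptions made for the example, and the mapper/reducer settings are left out.

```xml
<workflow-app name="s3-input-example" xmlns="uri:oozie:workflow:0.4">
  <start to="mr-node"/>
  <action name="mr-node">
    <map-reduce>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <!-- Problematic form: ${nameNode}/${input}
             With input=s3n://xxx-xxx/20130813/08 this expands to
             hdfs://xxx.internal:8020/s3n://xxx-xxx/20130813/08 -->
        <property>
          <name>mapred.input.dir</name>
          <!-- Pass the full URI as-is, with no ${nameNode} prefix -->
          <value>${input}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <!-- outputDir is a hypothetical HDFS-relative path, so the prefix is fine here -->
          <value>${nameNode}/${outputDir}</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Job failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

The only line that matters is the `mapred.input.dir` value: a parameter that already carries a scheme (s3n://, hdfs://) should be used on its own.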

Explorer
Posts: 18
Registered: ‎08-13-2013

In the coordinator job, I'm passing the dataset URI template as:

s3n://xxx-xxx/${YEAR}${MONTH}${DAY}/${HOUR}

and coord:dataOut as:

<property>
  <name>in_folder</name>
  <value>${coord:dataOut('in_folder')}</value>
</property>

and the input in my workflow.xml as:

${in_folder}

When I submit the coordinator job, it automatically prepends the NameNode URI, so the configuration becomes:

${nameNode}s3n://xxx-xxx/${YEAR}${MONTH}${DAY}/${HOUR}
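
For readers following along, the pieces described in this post would fit together roughly as below in a hand-written coordinator.xml. This is only a sketch under assumptions: the coordinator name, the hourly frequency, and the `start`, `end` and `workflowAppUri` parameters are placeholders, and the `data-out` declaration is inferred from the use of coord:dataOut rather than shown in the post. It is not the XML that Hue generates, and it does not by itself address the prefix that Hue adds.

```xml
<coordinator-app name="s3-input-coord" frequency="${coord:hours(1)}"
                 start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.2">
  <datasets>
    <!-- Dataset whose URI template is a full s3n:// URI, no ${nameNode} prefix -->
    <dataset name="in_folder" frequency="${coord:hours(1)}"
             initial-instance="${start}" timezone="UTC">
      <uri-template>s3n://xxx-xxx/${YEAR}${MONTH}${DAY}/${HOUR}</uri-template>
    </dataset>
  </datasets>
  <output-events>
    <!-- Assumed declaration: coord:dataOut('in_folder') resolves against this event -->
    <data-out name="in_folder" dataset="in_folder">
      <instance>${coord:current(0)}</instance>
    </data-out>
  </output-events>
  <action>
    <workflow>
      <app-path>${workflowAppUri}</app-path>
      <configuration>
        <!-- The resolved dataset URI is handed to the workflow as ${in_folder} -->
        <property>
          <name>in_folder</name>
          <value>${coord:dataOut('in_folder')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```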

Cloudera Employee
Posts: 723
Registered: ‎07-30-2013

Good to know. In Hue, Coordinator paths are only ever prepended with hdfs.

 

Is https://issues.apache.org/jira/browse/OOZIE-426 finished?

Explorer
Posts: 18
Registered: ‎08-13-2013

FWIW, the same job works fine as a workflow when submitted via Hue. In that case, we manually pass the input (S3) and output (HDFS) locations and the job runs successfully, which establishes that the problem is not with S3 support. The problem appears only when we let the coordinator pass this input (via a computed datasource): it then automatically prepends hdfs://{nameNode} to the s3n://<> URI. Hope this clarifies.

Cloudera Employee
Posts: 723
Registered: ‎07-30-2013

OK, this clarifies a lot! I updated https://issues.cloudera.org/browse/HUE-1501.
