Is it possible to use S3 for Falcon feeds?


I have not seen any example of using S3 in Falcon except for mirroring. Is it possible to use an S3 bucket as the location path for a feed?



@Liam Murphy: Please find the details below

1> Ensure that you have an account with Amazon S3 and a designated bucket for your data.

2> You must have an Access Key ID and a Secret Key.

3> Configure HDFS for S3 storage by making the following changes to core-site.xml:



<value> YOUR_S3_SECRET_KEY </value>

4> In the Falcon feed.xml, specify the Amazon S3 location and schedule the feed:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="S3Replication" description="S3-Replication" xmlns="uri:falcon:feed:0.1">
    <!-- other feed elements (such as frequency, ACL and schema) are not shown -->
    <clusters>
        <cluster name="cluster1" type="source">
            <validity start="2016-09-01T00:00Z" end="2034-12-20T08:00Z"/>
            <retention limit="days(24)" action="delete"/>
        </cluster>
        <cluster name="cluster2" type="target">
            <validity start="2016-09-01T00:00Z" end="2034-12-20T08:00Z"/>
            <retention limit="days(90)" action="delete"/>
            <locations>
                <location type="data" path="s3://<bucket-name>/<path-folder>/${YEAR}-${MONTH}-${DAY}-${HOUR}/"/>
            </locations>
        </cluster>
    </clusters>
</feed>
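
For step 3 above, the core-site.xml change could look roughly like the sketch below. It assumes the s3:// scheme used in the feed path, for which Hadoop 2.x reads the credentials from fs.s3.awsAccessKeyId and fs.s3.awsSecretAccessKey. The placeholder YOUR_S3_ACCESS_KEY is an assumption added for illustration. If the bucket is accessed through s3n:// or s3a:// instead, the property names differ, as noted in the comments.

```xml
<configuration>
    <!-- Sketch only: property names assume the s3:// scheme.
         For s3n:// use fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey;
         for s3a:// use fs.s3a.access.key / fs.s3a.secret.key. -->
    <property>
        <name>fs.s3.awsAccessKeyId</name>
        <!-- placeholder, replace with your Access Key ID -->
        <value>YOUR_S3_ACCESS_KEY</value>
    </property>
    <property>
        <name>fs.s3.awsSecretAccessKey</name>
        <!-- placeholder, replace with your Secret Key -->
        <value>YOUR_S3_SECRET_KEY</value>
    </property>
</configuration>
```

In an existing core-site.xml only the two property elements need to be added inside the existing configuration element. The affected services have to be restarted for the change to take effect.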




Hi Sowmya,

Is there any other debug information I can provide to help find the cause of the problem?

Kind Regards,



@Liam Murphy: In the Oozie log I can see that the replication paths don't exist. Can you make sure the files exist?

Eviction fails because of a credentials issue. Can you make sure core-site and hdfs-site have the required configs, restart the services, and resubmit the feed? Thanks!

2016-09-09 14:44:43,680  INFO CoordActionInputCheckXCommand:520 - SERVER[] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000058-160909120521096-oozie-oozi-C] ACTION[0000058-160909120521096-oozie-oozi-C@10] [0000058-160909120521096-oozie-oozi-C@10]::ActionInputCheck:: File:hftp://, Exists? :false
2016-09-09 14:44:43,817  INFO CoordActionInputCheckXCommand:520 - SERVER[] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000058-160909120521096-oozie-oozi-C] ACTION[0000058-160909120521096-oozie-oozi-C@11] [0000058-160909120521096-oozie-oozi-C@11]::CoordActionInputCheck:: Missing deps:hftp:// 
2016-09-09 14:44:43,818  INFO CoordActionInputCheckXCommand:520 - SERVER[] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000058-160909120521096-oozie-oozi-C] ACTION[0000058-160909120521096-oozie-oozi-C@11] [0000058-160909120521096-oozie-oozi-C@11]::ActionInputCheck:: In checkListOfPaths: hftp:// is Missing.


I just noticed that when a path does not exist for a given hour, Falcon/Oozie just gets stuck rather than checking the next hour. My misunderstanding, I guess. I have got it working now.
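
The waiting described here is consistent with how an Oozie coordinator action holds until its input path appears. As a possible mitigation, the Falcon entity specification lists "timeout" and "parallel" among the special feed properties that control how long a replication instance waits for its feed instance and how many replication instances may run concurrently. The fragment below is only a sketch: the values are illustrative assumptions, not settings taken from this thread, and the fragment belongs inside the feed element.

```xml
<properties>
    <!-- assumed value: stop waiting for a missing feed instance after one hour -->
    <property name="timeout" value="hours(1)"/>
    <!-- assumed value: allow up to three replication instances to run at once -->
    <property name="parallel" value="3"/>
</properties>
```

Whether these settings help depends on the Falcon and Oozie versions in use, so they should be checked against the documentation for the installed release.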


Hi Team / @Sowmya Ramesh, I am trying to use Falcon to replicate HDFS to S3. I have tried the above steps, and the HDFStoS3 replication job ends with status KILLED after the workflow runs. In Oozie, I can see the workflow change status from RUNNING to KILLED. Is there a way to troubleshoot this? I can run hadoop fs -ls commands on my S3 bucket, so I definitely have access. I suspect it's the S3 URL. I tried downloading the xml changing the URL without the and uploading with no luck. Any other suggestions? I appreciate all your help and support in advance. Regards
