
I am working with Falcon. While creating the cluster entity, I got an error saying the staging directory location does not exist. Here is my XML file:

Rising Star
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cluster name="primaryCluster" description="this is primary cluster" colo="primaryColo" xmlns="uri:falcon:cluster:0.1">
    <tags>primaryKey=primaryValue</tags>
    <interfaces>
        <interface type="readonly" endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://sandbox.hortonworks.com:8020" version="2.2.0"/>
        <interface type="execute" endpoint="sandbox.hortonworks.com:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://sandbox.hortonworks.com:61616?daemon=true" version="5.1.6"/>
    </interfaces>
    <locations>
        <location name="staging" path="/apps/falcon/primaryCluster/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/apps/falcon/primaryCluster/working"/>
    </locations>
    <ACL owner="ambari-qa" group="users" permission="0x755"/>
    <properties>
        <property name="test" value="value1"/>
    </properties>
</cluster>
1 ACCEPTED SOLUTION


Before creating the cluster entity, we need to create the directories on HDFS representing the cluster we are going to define, namely primaryCluster in your case.

su - falcon

hadoop fs -mkdir /apps/falcon/primaryCluster

Then create the staging and working directories:

hadoop fs -mkdir /apps/falcon/primaryCluster/staging

hadoop fs -mkdir /apps/falcon/primaryCluster/working

Finally, set the proper permissions and ownership on the staging and working directories:

hadoop fs -chmod 777 /apps/falcon/primaryCluster/staging

hadoop fs -chmod 755 /apps/falcon/primaryCluster/working

hadoop fs -chown -R falcon /apps/falcon/*
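Once the directories and permissions are in place, you can retry the submission from the Falcon CLI. A minimal sketch, assuming your cluster definition is saved as primaryCluster.xml (a placeholder filename; point -file at wherever your XML actually lives):

falcon entity -type cluster -submit -file primaryCluster.xml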

You can refer to http://hortonworks.com/hadoop-tutorial/processing-data-pipeline-with-apache-falcon/ for more details.


7 REPLIES


Rising Star

I have done that, but it's still showing the same error.

Rising Star

I did that before creating the cluster entity, but it's still showing the same error.


@khushi kalra: It's hard to figure out what's going wrong without looking at the logs. Can you tail falcon.application.log when the error occurs and provide it? Also, can you paste the output of:

hadoop fs -ls -R /apps/falcon

Thanks!
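For reference, on the Hortonworks sandbox the Falcon application log is typically found under /var/log/falcon (the exact path depends on your install, so treat this as an assumption):

tail -n 100 /var/log/falcon/falcon.application.log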

Rising Star

Thank you, it's working now.

But I want to ask about the validity start and end dates for feeds and processes.

Expert Contributor

Just out of curiosity, how did you get this working? I am working through the same problem on sandbox 2.4; log attached: falcon-app-log.txt

[falcon@sandbox logs]$ hadoop fs -ls -R /apps/falcon
drwxrwxrwx   - falcon hdfs          0 2016-03-30 14:56 /apps/falcon/backupCluster
drwxrwxrwx   - falcon hdfs          0 2016-03-30 14:54 /apps/falcon/backupCluster/staging
drwxr-xr-x   - falcon hdfs          0 2016-03-30 14:56 /apps/falcon/backupCluster/working
drwxrwxrwx   - falcon hdfs          0 2016-03-30 14:55 /apps/falcon/primaryCluster
drwxrwxrwx   - falcon hdfs          0 2016-03-30 14:53 /apps/falcon/primaryCluster/staging
drwxr-xr-x   - falcon hdfs          0 2016-03-30 14:55 /apps/falcon/primaryCluster/working


@khushi kalra:

The validity of a feed on a cluster specifies the duration for which the feed is valid on that cluster.

Process validity defines how long the workflow should run. It has three components: start time, end time, and timezone. Start time and end time are timestamps in yyyy-MM-dd'T'HH:mm'Z' format and should always be in UTC. The timezone is used to compute the next instances starting from the start time. The workflow will start at the start time and end before the end time specified on a given cluster.
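For illustration, this is roughly what the validity settings look like in the entity definitions; the cluster name, dates, and retention below are placeholders, not values from this thread.

In a feed definition, validity sits inside each cluster block:

<clusters>
    <cluster name="primaryCluster" type="source">
        <validity start="2016-04-01T00:00Z" end="2017-04-01T00:00Z"/>
        <retention limit="days(7)" action="delete"/>
    </cluster>
</clusters>

In a process definition, the timezone is a separate element:

<clusters>
    <cluster name="primaryCluster">
        <validity start="2016-04-01T00:00Z" end="2016-04-02T00:00Z"/>
    </cluster>
</clusters>
<timezone>UTC</timezone>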

Please refer to this doc for more details. Thanks!