
Falcon process to insert values into a join table is running but not generating any output in the table

New Contributor

I am scheduling a Falcon job. The input feeds are two Hive tables partitioned by feed date, and the output feed is a join table of the two, also partitioned by feed date. The job is scheduled and shows its status as running, but it fails to update the table with new values.

ClusterDefinition:-

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cluster name="primaryClusterPoc2" description="This is a primary Cluster" colo="primaryColo" xmlns="uri:falcon:cluster:0.1">
    <tags>EntityType=Cluster</tags>
    <interfaces>
        <interface type="readonly" endpoint="hftp://sandbox.hortonworks.com:50070" version="2.2.0"/>
        <interface type="write" endpoint="hdfs://sandbox.hortonworks.com:8020" version="2.2.0"/>
        <interface type="execute" endpoint="sandbox.hortonworks.com:8050" version="2.2.0"/>
        <interface type="workflow" endpoint="http://sandbox.hortonworks.com:11000/oozie/" version="4.0.0"/>
        <interface type="messaging" endpoint="tcp://sandbox.hortonworks.com:61616?daemon=true" version="5.1.6"/>
        <interface type="registry" endpoint="thrift://sandbox.hortonworks.com:9083" version="0.11.0"/>
    </interfaces>
    <locations>
        <location name="staging" path="/apps/falcon/primaryCluster/staging"/>
        <location name="temp" path="/tmp"/>
        <location name="working" path="/apps/falcon/primaryCluster/working"/>
    </locations>
    <ACL owner="ambari-qa" group="users" permission="0755"/>
</cluster>

inputFeed1 :-

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="inputFeed1" description="hello hive table feed" xmlns="uri:falcon:feed:0.1">
    <tags>EntityType=Feed</tags>
    <frequency>minutes(10)</frequency>
    <timezone>GMT+02:00</timezone>
    <late-arrival cut-off="minutes(2)"/>
    <clusters>
        <cluster name="primaryClusterPoc2" type="source">
            <validity start="2016-09-14T18:31Z" end="2016-09-17T18:45Z"/>
            <retention limit="hours(1)" action="delete"/>
            <table uri="catalog:d0_stg_pudb:hellohivetable#ds=${YEAR}-${MONTH}-${DAY}"/>
        </cluster>
    </clusters>
    <table uri="catalog:d0_stg_pudb:hellohivetable#ds=${YEAR}-${MONTH}-${DAY}"/>
    <ACL owner="id847257" group="hdfs" permission="0755"/>
    <schema location="hcat" provider="hcat"/>
    <properties/>
</feed>

inputFeed2 :-

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="inputFeed2" description="world hive table feed" xmlns="uri:falcon:feed:0.1">
    <tags>EntityType=Feed</tags>
    <frequency>minutes(10)</frequency>
    <timezone>GMT+02:00</timezone>
    <late-arrival cut-off="minutes(2)"/>
    <clusters>
        <cluster name="primaryClusterPoc2" type="source">
            <validity start="2016-09-14T18:31Z" end="2016-09-17T18:45Z"/>
            <retention limit="hours(1)" action="delete"/>
            <table uri="catalog:d0_stg_pudb:worldhivetable#ds=${YEAR}-${MONTH}-${DAY}"/>
        </cluster>
    </clusters>
    <table uri="catalog:d0_stg_pudb:worldhivetable#ds=${YEAR}-${MONTH}-${DAY}"/>
    <ACL owner="id847257" group="hdfs" permission="0755"/>
    <schema location="hcat" provider="hcat"/>
    <properties/>
</feed>

Process :-

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<process name="tryProcess" xmlns="uri:falcon:process:0.1">
    <tags>EntityType=Process</tags>
    <clusters>
        <cluster name="primaryClusterPoc2">
            <validity start="2016-09-14T18:31Z" end="2016-09-17T18:45Z"/>
        </cluster>
    </clusters>
    <parallel>1</parallel>
    <order>FIFO</order>
    <frequency>hours(1)</frequency>
    <timezone>GMT+02:00</timezone>
    <inputs>
        <input name="inputHello" feed="inputFeed1" start="now(0,0)" end="now(24,0)"/>
        <input name="inputWorld" feed="inputFeed2" start="now(0,0)" end="now(24,0)"/>
    </inputs>
    <outputs>
        <output name="outputWorld" feed="outputFeed" instance="now(0,0)"/>
    </outputs>
    <workflow name="hiveJoinWorkflow" version="hive-0.13.1" engine="hive" path="/user/id847257/proposal2/helloWorldHive.hql"/>
    <retry policy="exp-backoff" delay="minutes(3)" attempts="1"/>
    <ACL owner="id847257" group="hdfs" permission="0755"/>
</process>

The HQL script (run_hive_query.hql) I am using is below:

use d0_stg_pudb;

INSERT OVERWRITE TABLE helloworldhivetable PARTITION (ds)
SELECT helloHiveTable.name AS a,
       worldHiveTable.name AS b,
       DATE_ADD(TO_DATE(from_unixtime(UNIX_TIMESTAMP())), 5) AS ds
FROM helloHiveTable, worldHiveTable
WHERE helloHiveTable.id = worldHiveTable.id;
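
A note on the statement above: `PARTITION (ds)` with no value is a dynamic-partition insert, which Hive only accepts when dynamic partitioning is enabled and the mode is non-strict. A minimal sketch of the session settings, assuming they are not already configured cluster-wide (whether this is the cause of the missing output here is not confirmed):

```sql
-- Assumption: these are not already set in hive-site.xml for this cluster.
-- Enable dynamic partitioning for the session.
SET hive.exec.dynamic.partition=true;
-- Allow an insert in which every partition column (here only ds) is dynamic.
SET hive.exec.dynamic.partition.mode=nonstrict;
```

If used, these lines would go at the top of the script, before the INSERT.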

Re: Falcon process to insert values into a join table is running but not generating any output in the table

Expert Contributor

@Aditi Kumari

Have you created the output feed entity as well? If so, can you share it?

You mentioned that the scheduled job fails to update the table with new values. Has any error occurred? To find out, check the launched workflow in the Oozie UI and the corresponding Hadoop jobs in the ResourceManager (RM) UI.
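
Alongside the Oozie and RM checks, it may help to look at which partitions the join table actually contains. The posted script computes `ds` as the current date plus 5 days, so any rows it writes would land in a future-dated partition rather than in the partition for the feed date. A rough sketch of the check, using the database and table names from the question and assuming `ds` is the only partition column:

```sql
USE d0_stg_pudb;

-- List every partition that currently exists in the output table.
SHOW PARTITIONS helloworldhivetable;

-- Count rows per partition to see where (if anywhere) the join output went.
SELECT ds, COUNT(*) AS row_count
FROM helloworldhivetable
GROUP BY ds;
```

If rows show up only under a `ds` value five days ahead, the job is writing data, just not to the partition the output feed instance refers to.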
