Member since
07-25-2018
174
Posts
29
Kudos Received
5
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5414 | 03-19-2020 03:18 AM |
| | 3457 | 01-31-2020 01:08 AM |
| | 1335 | 01-30-2020 05:45 AM |
| | 2586 | 06-01-2016 12:56 PM |
| | 3073 | 05-23-2016 08:46 AM |
05-23-2016
08:46 AM
1 Kudo
Hello guys, I have resolved this problem. The issue was in the hive.hql file. Here is the correct Hive script: LOAD DATA INPATH '${input}/employee.txt' OVERWRITE INTO TABLE temp.employee PARTITION(${falcon_output_partitions_hive});
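To make the corrected script easier to read, here is a minimal Python sketch of what the statement looks like once its two placeholders are filled in. It only mimics the `${...}` substitution with `string.Template`; it is not how Falcon, Oozie, or Hive perform the substitution internally. The function name `render_script` and both sample values are assumptions chosen for illustration, not values taken from a real run.

```python
from string import Template

# The corrected script from the post, kept verbatim as a template.
HIVE_SCRIPT = (
    "LOAD DATA INPATH '${input}/employee.txt' "
    "OVERWRITE INTO TABLE temp.employee "
    "PARTITION(${falcon_output_partitions_hive});"
)


def render_script(input_dir: str, partitions: str) -> str:
    """Fill in the two placeholders used by the script (illustrative helper)."""
    return Template(HIVE_SCRIPT).substitute(
        input=input_dir,
        falcon_output_partitions_hive=partitions,
    )


# Hypothetical values for a single hourly instance (assumptions).
rendered = render_script(
    "/user/manoj/input/bdate=2016-05-21-09",
    "bdate='2016-05-21-09'",
)
print(rendered)
```

Printing the rendered statement like this can help confirm that the input placeholder resolves to a directory and that the file name is appended after it.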
05-22-2016
07:33 AM
Thank you Kuldeep. As I am new to Falcon and Oozie, I don't know where to find the Oozie launcher logs. Where are they located?
05-22-2016
07:31 AM
Thanks Sri. I have checked the YARN logs too, but which error log file are you referring to: the Oozie, Falcon, or Hive log file? Please mention the name of the log file so that I can send it to you right away.
05-21-2016
07:49 AM
1 Kudo
Hey guys, I am badly stuck running a Falcon job and have been trying to solve this for two days. I have designed a data pipeline that loads data on an hourly basis from an HDFS location and puts it into a Hive table (the table is already partitioned).

The error I am getting in the Oozie UI for the Falcon process is:
Main class [org.apache.oozie.action.hadoop.HiveMain], exit code [40000]

I have checked the Oozie launcher logs and tried the YARN logs too, but because of some wrong configuration I was not able to see the YARN logs. What should I do in such a case? Where should I look for the error logs? Where can I see the complete reason for the error?

Command: yarn logs -applicationId <ID>

----------

There are 3 XML files in total:
1) InputFeedhive.xml (refers to the HDFS location)
2) processHive.xml (runs the Hive script to load data into the Hive table)
3) hiveOutputFeed.xml (refers to the Hive table)

----------

The hive.hql contains only:
LOAD DATA INPATH "${input}" OVERWRITE INTO TABLE "${falcon_output_table}" PARTITION(${falcon_output_partitions_hive});

----------

The HDFS location is in the following format:
/user/manoj/input/bdate=2016-05-21-09/employee.txt

----------

employee.txt contains:
12,john,304
13,brown,305
14,jeddy,306

----------

1) InputFeedhive.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="inputFeedHive" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name="hiveCluster" type="source">
      <validity start="2016-05-21T07:00Z" end="2016-05-25T07:00Z"/>
      <retention limit="days(1)" action="delete"/>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/user/manoj/input/bdate=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
    <location type="stats" path="/user/manoj/statsPath${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
    <location type="meta" path="/user/manoj/metaPath${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="ambari-qa" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
  <properties>
    <property name="jobPriority" value="VERY_HIGH"/>
  </properties>
</feed>

----------

2) hiveOutputFeed.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<feed name="hiveOutputFeed" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <timezone>UTC</timezone>
  <clusters>
    <cluster name="hiveCluster" type="source">
      <validity start="2016-05-21T07:00Z" end="2016-05-25T07:00Z"/>
      <retention limit="days(1)" action="delete"/>
      <table uri="catalog:temp:employee#bdate=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
    </cluster>
  </clusters>
  <table uri="catalog:temp:employee#bdate=${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  <ACL owner="ambari-qa" group="users" permission="0755"/>
  <schema location="hcat" provider="hcat"/>
</feed>

----------

3) processHive.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<process name="processHive" xmlns="uri:falcon:process:0.1">
  <tags>new=HiveProcess</tags>
  <clusters>
    <cluster name="hiveCluster">
      <validity start="2016-05-21T07:00Z" end="2016-05-25T07:00Z"/>
    </cluster>
  </clusters>
  <parallel>1</parallel>
  <order>FIFO</order>
  <frequency>minutes(15)</frequency>
  <timezone>UTC</timezone>
  <inputs>
    <input name="input" feed="inputFeedHive" start="now(0,0)" end="now(0,0)"/>
  </inputs>
  <outputs>
    <output name="output" feed="hiveOutputFeed" instance="now(0,0)"/>
  </outputs>
  <workflow engine="hive" path="/user/manoj/script/hive.hql"/>
  <retry policy="periodic" delay="minutes(30)" attempts="1"/>
  <ACL owner="ambari-qa" group="users" permission="0755"/>
</process>

----------

Thanks in advance. Please share your ideas with me.
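For readers following the feed definitions, the sketch below shows how the input feed's location pattern lines up with the concrete HDFS directory quoted in the post. It is a plain string replacement written for illustration, not Falcon's own expansion logic; the helper name `expand_feed_path` is hypothetical, and the instance time is assumed to be 2016-05-21 09:00 UTC to match the sample path.

```python
from datetime import datetime

# Location pattern copied from the "data" location of the input feed.
FEED_PATH = "/user/manoj/input/bdate=${YEAR}-${MONTH}-${DAY}-${HOUR}"


def expand_feed_path(pattern: str, instance: datetime) -> str:
    """Replace the date tokens in a feed path with zero-padded values."""
    fields = {
        "${YEAR}": f"{instance.year:04d}",
        "${MONTH}": f"{instance.month:02d}",
        "${DAY}": f"{instance.day:02d}",
        "${HOUR}": f"{instance.hour:02d}",
    }
    for token, value in fields.items():
        pattern = pattern.replace(token, value)
    return pattern


# Assumed instance time, chosen to match the sample path in the post.
expanded = expand_feed_path(FEED_PATH, datetime(2016, 5, 21, 9))
print(expanded)
```

The expanded value is a directory, while the data itself sits in employee.txt inside that directory.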
Labels:
- Apache Falcon
- Apache Oozie
05-20-2016
06:25 PM
Thank you Klaus, it's working for me.
05-20-2016
06:49 AM
1 Kudo
I have designed a pipeline to load data from an HDFS location into a partitioned Hive table. I have created two feed XML files and one process XML file using Falcon. One feed provides the input to the Falcon process entity (i.e. the HDFS location where the data actually resides), and the other stores the data in the Hive table (which already exists in Hive and is partitioned). The feeds succeed throughout the pipeline, but the Falcon process is failing with this error:

ERROR:- org.apache.oozie.action.ActionExecutorException: IllegalArgumentException: Can not create a Path from an empty string
at org.apache.oozie.action.ActionExecutor.convertException(ActionExecutor.java:445)

I can see that the feeds are running successfully, but there is some problem with the process entity. One more thing: I have scheduled a single Hive script to drive this whole pipeline.

Hive.hql:
LOAD DATA INPATH '$input' OVERWRITE INTO TABLE "${falcon_output_table}" PARTITION(${falcon_output_partitions_hive});

I don't understand why Oozie is throwing this error. Is it a Hive error or an Oozie error? Thanks in advance. Please share your ideas with me regarding this issue.
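One visible detail in this script is that `$input` is written without braces, while the other two variables use the `${...}` form. Whether that is what triggers the error is not confirmed in this post, so the following is only a small sketch for spotting such references in a script. The function name `unbraced_variables` and the regular expression are assumptions made for illustration.

```python
import re
from typing import List

# Matches "$name" but not "${name}" (illustrative pattern).
UNBRACED = re.compile(r"\$(?!\{)([A-Za-z_][A-Za-z0-9_]*)")


def unbraced_variables(script: str) -> List[str]:
    """Return the names of variables referenced as $name rather than ${name}."""
    return UNBRACED.findall(script)


# The script text as quoted in the post.
script = (
    "LOAD DATA INPATH '$input' OVERWRITE INTO TABLE "
    "\"${falcon_output_table}\" PARTITION(${falcon_output_partitions_hive});"
)
found = unbraced_variables(script)
print(found)
```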
Labels:
- Apache Falcon
- Apache Hive
- Apache Oozie
05-18-2016
11:54 AM
Hello, suppose I have performed some operations on Hive using Falcon and have designed an end-to-end pipeline for Hive within Falcon. My question is: will I be able to see lineage for Falcon within Atlas? Note that I am using Atlas version 0.5. I am sure that Atlas provides lineage for actions performed within Hive (operations such as CREATE TABLE, SELECT, etc.), but will it capture metadata for Falcon too?
Labels:
- Apache Atlas
05-14-2016
04:27 AM
Thank you drussell.
05-14-2016
04:21 AM
Thank you Erik.
05-13-2016
03:16 PM
2 Kudos
I have seen the lineage for Hive tables in the Apache Atlas UI, but when I search for a particular column in the Atlas search box, the UI does not show me lineage for it (i.e. the column).
Labels:
- Apache Atlas
- Apache Hive