Member since: 07-25-2018
Posts: 174
Kudos Received: 29
Solutions: 5

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 5414 | 03-19-2020 03:18 AM
 | 3457 | 01-31-2020 01:08 AM
 | 1337 | 01-30-2020 05:45 AM
 | 2590 | 06-01-2016 12:56 PM
 | 3073 | 05-23-2016 08:46 AM
06-01-2016
10:51 AM
Thank you Sri, I followed the link you sent me, but I am still getting the same error:

1) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.openx.data.jsonserde.JsonSerDe not found.

The query I am running through Falcon is:

INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive})
select p.id, p.gender, p.Age, p.birthdate, o.component[1].valuequantity.value, o.component[1].valuequantity.unit
from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age FROM falconexample.patient1) p
inner join falconexample.DiagnosticReport1 d on p.id = substr(d.subject.reference,9)
inner join falconexample.Observation1 o on p.id = substr(o.subject.reference,9)
where p.Age > 17 and p.Age < 86 and o.component[1].valuequantity.value < 140;

If I add the statement "ADD JAR /user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;" to the Hive script, I get a second error:

2) java.lang.IllegalArgumentException: /user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar does not exist.

If I remove that statement, I am back to the first error. I have tried the following to resolve the issue:
1) Added the SerDe jar to the Oozie share lib folder (HDFS location).
2) Added the SerDe jar to the Hive lib directory (local FS).
3) Added the SerDe jar to the "falcon.libpath" location (i.e. /apps/falcon/pigCluster/working/lib, HDFS location).
4) Added the jar to /apps/falcon/pigCluster/staging/falcon/workflows/process/patientDataProcess/6b7edfbfe5bcfc50e0fc845f71cd9122_1464767974691/DEFAULT/lib (HDFS location).

I have put the jar everywhere, yet I still get the first error. Why is Falcon not picking up the jar? I am attaching the log file I found under the Falcon working directory: user-action-hive-failed.txt
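What I plan to try next (just a sketch, not verified on my cluster yet): the IllegalArgumentException suggests ADD JAR treated that path as a local filesystem path on whichever node ran the action, so referencing the jar with an explicit HDFS URI at the top of the Hive script may help, assuming this Hive version accepts HDFS URIs in ADD JAR. The script name patient_process.hql is only a placeholder; the jar path is the one from this thread.

# Sketch: rewrite the Hive script so the SerDe jar is referenced by an HDFS URI
# instead of a bare path that gets looked up on the local filesystem.
cat > patient_process.hql <<'EOF'
ADD JAR hdfs:///user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;

INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive})
-- ...rest of the SELECT exactly as in the query above...
EOF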
06-01-2016
02:30 AM
Thanks cnormile. Could you please tell me where, in general, the SerDe jar should be placed so that Falcon/Oozie will also pick it up while running the data pipeline? As I already mentioned, the jar is present in both locations (i.e. /user/oozie/share/lib/hive and /usr/hdp/<version>/hive/lib), and I have already restarted both Hive and Oozie, but it is still reporting the same error.
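Here is a sketch of the command-line steps I am using to place the jar in the Oozie sharelib and make Oozie pick it up (the Oozie URL is the usual port-11000 default and lib_20160503082834 is the sharelib directory on my cluster; adjust both for your setup):

# Copy the SerDe jar into the hive sharelib directory on HDFS
hdfs dfs -put -f json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar /user/oozie/share/lib/lib_20160503082834/hive/

# Ask Oozie to reload the sharelib without a restart
oozie admin -oozie http://localhost:11000/oozie -sharelibupdate

# Verify that the jar is now listed in the hive sharelib
oozie admin -oozie http://localhost:11000/oozie -shareliblist hive | grep json-serde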
05-31-2016
05:24 PM
Hello guys, my current requirement is to process data from 3 Hive tables and store the result into another Hive table. I have therefore created 3 Hive-table-URI input feeds, 1 Hive-table-URI output feed, and 1 process entity that takes the 3 feeds as input and generates output into the output feed. The process entity fails with:

FAILED: RuntimeException MetaException(message:java.lang.ClassNotFoundException Class org.openx.data.jsonserde.JsonSerDe not found)

I see this error in Oozie when the scheduled process instance fails. I understand the error itself is simple: the SerDe jar is missing somewhere in hive/lib or oozie/share/lib. I have tried the following:

1) Added the jar (json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar) to the Hive lib folder and to the Oozie lib, but I still get the same error. If I add the jar through the Hive CLI with the ADD JAR command, the query runs fine there.

My XMLs are:

1) observationInputFeed.xml

<feed xmlns='uri:falcon:feed:0.1' name='observationInputFeed' description='This is observation table'>
<tags>table=observation</tags>
<frequency>hours(1)</frequency>
<timezone>UTC</timezone>
<clusters>
<cluster name='hiveCluster' type='source'>
<validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
<retention limit='days(1)' action='delete'/>
<table uri='catalog:falconexample:observation1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
</cluster>
</clusters>
<table uri='catalog:falconexample:observation1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
<ACL owner='ambari-qa' group='users' permission='0755'/>
<schema location='hcat' provider='hcat'/>
</feed>

2) patientInputFeed.xml

<feed xmlns='uri:falcon:feed:0.1' name='patientInputFeed' description='This is Patient table'>
<tags>table=patient</tags>
<frequency>hours(1)</frequency>
<timezone>UTC</timezone>
<clusters>
<cluster name='hiveCluster' type='source'>
<validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
<retention limit='days(1)' action='delete'/>
<table uri='catalog:falconexample:patient1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
</cluster>
</clusters>
<table uri='catalog:falconexample:patient1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
<ACL owner='ambari-qa' group='users' permission='0755'/>
<schema location='hcat' provider='hcat'/>
</feed>

3) patientprocessedOutputFeed.xml

<feed xmlns='uri:falcon:feed:0.1' name='patientprocessedOutputFeed' description='This is patientprocessed table'>
<tags>table=observationInputFeed</tags>
<frequency>hours(1)</frequency>
<timezone>UTC</timezone>
<clusters>
<cluster name='hiveCluster' type='source'>
<validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
<retention limit='days(1)' action='delete'/>
<table uri='catalog:falconexample:Patient_proce#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
</cluster>
</clusters>
<table uri='catalog:falconexample:Patient_proce#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
<ACL owner='ambari-qa' group='users' permission='0755'/>
<schema location='hcat' provider='hcat'/>
</feed>

4) The 4th feed XML is very similar to the above, except for the table name.

Please help me, guys. I don't understand why Oozie throws java.lang.ClassNotFoundException even though the jar is present at the proper location.
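One detail worth noting (my understanding, not a confirmed diagnosis): because the failure is a MetaException, it is the table definition stored in HCatalog that references org.openx.data.jsonserde.JsonSerDe, so every component that reads that table metadata (the Oozie launcher, HiveServer2, the metastore client) needs the jar on its classpath, not only the Hive CLI session. For reference, the tables were created roughly like this; the column list and location below are placeholders, not my real schema:

# Hypothetical shape of the DDL that ties the openx JSON SerDe to the table
# metadata in HCatalog (columns and location are placeholders).
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS falconexample.observation1 (
  id STRING,
  subject STRUCT<reference: STRING>,
  component ARRAY<STRUCT<valuequantity: STRUCT<value: DOUBLE, unit: STRING>>>
)
PARTITIONED BY (ds STRING)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
STORED AS TEXTFILE
LOCATION '/data/falconexample/observation1';
"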
Labels:
- Apache Falcon
- Apache Oozie
05-31-2016
12:16 PM
Thank you Saktheesh.

<table uri="catalog:tmp_rishav:rec_count_tbl#feed_date=${YEAR}-${MONTH}-${DAY}" />

Can I remove "feed_date=${YEAR}-${MONTH}-${DAY}" from the above statement and still use it in Falcon, or is the partition expression mandatory for a Falcon table feed?
05-31-2016
06:17 AM
I have run two or more examples of a Hive data pipeline using Apache Falcon by creating Hive table URI (input feed/output feed) feeds. The problem statements were:

1) Inserting data from one Hive table into another table.
2) Loading data from HDFS into a Hive table.

Both of these pipelines run perfectly, but my new requirement is as follows: I have 3 external Hive tables (without partitions), I have written one SELECT query on top of them, and I want to load the result into another Hive table using Falcon. I know I will have to use an INSERT OVERWRITE TABLE table2 ... SELECT col1, col2 FROM table1 ... query, but my question is: will Falcon allow me to create a table URI feed on a table without partitions?

Tables: 1) patient 2) Observation 3) DiagnosticReport

SELECT query:

select p.id, p.gender, p.Age, p.birthdate, o.component[1].valuequantity.value, o.component[1].valuequantity.unit
from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age FROM patient) p
inner join DiagnosticReport d on p.id = substr(d.subject.reference,9)
inner join Observation o on p.id = substr(o.subject.reference,9)
where p.Age > 17 and p.Age < 86 and o.component[1].valuequantity.value < 140;

Thanks in advance; please help me.
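As far as I understand, a Falcon table feed identifies its instances through the partition expression in the catalog URI (the #ds=... part), so I am assuming the feed tables need at least one partition column. The workaround I am considering (only a sketch; the column list is a placeholder matching the SELECT above) is to give the target table a ds partition so an output feed URI like catalog:falconexample:Patient_proce#ds=${YEAR}-${MONTH}-${DAY}-${HOUR} has something to resolve against:

# Sketch: create the target table with a ds partition key so it can back a
# Falcon table feed (columns are placeholders matching the SELECT list above).
hive -e "
CREATE TABLE IF NOT EXISTS falconexample.Patient_proce (
  id STRING,
  gender STRING,
  age INT,
  birthdate STRING,
  value DOUBLE,
  unit STRING
)
PARTITIONED BY (ds STRING);
"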
Labels:
- Apache Falcon
- Apache Hive
05-28-2016
10:12 AM
Thank you Ryan, this really helps me. Two more questions:
1) Do we need to stick to beeline to get lineage in Atlas? I mean, is there no way around this?
2) The Atlas-Ranger demo on the Hortonworks website works fine in the sandbox, so does the Atlas and Ranger integration work only for that demo? I tried to add some Ranger policies integrated with Atlas, but they are not working properly. I used the same hr_user and hr_admin users to define the Ranger and Atlas policies.
05-27-2016
06:06 PM
Hello guys, my Atlas-Ranger machine is up and all services are running smoothly on the node. My question: Atlas only displays lineage for Hive if we fire the query through beeline. That is, it provides lineage/metadata for Hive tables only when we perform Hive operations by connecting to HiveServer2 with beeline (i.e. only over the JDBC connection). Why is it not capturing Hive metadata when we use the Hive CLI? What should I do to resolve this problem?
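For now my workaround is to route statements through HiveServer2, since that appears to be where the Atlas Hive hook is registered on this VM. A minimal sketch (host, port, user, and the throwaway table name are placeholders for my sandbox):

# Run the statement over JDBC/HiveServer2 instead of the Hive CLI so the Atlas
# hook in HiveServer2 can capture the lineage (connection details are placeholders).
beeline -u jdbc:hive2://localhost:10000 -n ambari-qa -e "CREATE TABLE falconexample.lineage_check AS SELECT * FROM falconexample.patient1 LIMIT 10;"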
Labels:
- Apache Atlas
- Apache Hive
05-26-2016
03:36 PM
Hey Lubin, I am facing the same issue in the Atlas-Ranger VM. I am not getting lineage for Hive tables, but if I run the Sqoop demo that is present on that machine, then I am able to see lineage; otherwise not. I also wonder why this happens with Atlas.
05-24-2016
08:46 AM
First of all, thank you Dave. As far as I know, Atlas 0.7 provides this feature. But one more question for you: I have a running Falcon data pipeline that inserts data from one Hive table (table1) into another Hive table (table2). The script is:

INSERT OVERWRITE TABLE falconexample.table2 PARTITION (${falcon_output_partitions_hive})
SELECT id, firstName, designation, department FROM falconexample.table1 WHERE ${falcon_input_filter};

My question: in the Atlas UI I can see lineage and metadata only for actions performed within Hive (CREATE TABLE, INSERT OVERWRITE, etc.). I found neither metadata for the Falcon entities nor their lineage. I know I am asking the same question as before, but it is very important for me. Please help me.
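Before going further I want to confirm which Atlas version this machine actually runs, since Falcon lineage support depends on it. A quick check against the Atlas admin REST endpoint (localhost:21000 and admin/admin are the sandbox defaults I am assuming; adjust as needed):

# Ask Atlas for its running version via the admin API
# (host and credentials are assumptions based on sandbox defaults).
curl -u admin:admin http://localhost:21000/api/atlas/admin/version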
05-23-2016
08:51 AM
Hello everyone, my data pipeline is running perfectly in Apache Falcon. Data pipeline problem statement: I am loading data from an HDFS location into a Hive table (a partitioned table). Question: I do not see lineage for this whole process; Atlas does not show me any icon in the lineage diagram for it. Why is Apache Atlas not showing lineage for Falcon? Is it supported in Atlas version 0.5? Thanks in advance.
Labels:
- Apache Atlas
- Apache Falcon