Created 05-31-2016 05:24 PM
Hello guys,
Currently, my requirement is to process data from 3 Hive tables and store the result in another Hive table. Therefore, I have created 3 Hive URI input feeds, 1 Hive URI output feed, and 1 process entity that accepts these 3 feeds as input and generates output into the output feed. The process entity gives the following error:
FAILED: RuntimeException MetaException(message:java.lang.ClassNotFoundException Class org.openx.data.jsonserde.JsonSerDe not found)
Actually, I am getting this error in Oozie when the scheduled process instance fails.
I understand the error is simple: it is caused by the SerDe jar missing somewhere within hive/lib or oozie/share/lib.
I tried the following solutions:
1) Added the jar to the Hive lib folder and to the Oozie share lib (roughly as sketched below).
Jar: json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar
But I am still getting the same error. If I add the jar through the Hive CLI using the ADD JAR command, the query runs smoothly there.
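For reference, step 1) typically means copying the jar into the local Hive lib directory on each node and into the Oozie share lib on HDFS, and then telling the Oozie server to reload the share lib. A minimal sketch, assuming an HDP-style layout and an Oozie server at <oozie-host>:11000 (both placeholders, substitute your own values; the share lib timestamp directory is the one mentioned later in this thread):

# copy the SerDe jar to the local Hive lib directory (repeat on every node that runs Hive)
cp json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar /usr/hdp/current/hive-client/lib/

# copy the same jar into the Oozie share lib for Hive on HDFS
hdfs dfs -put json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar /user/oozie/share/lib/lib_20160503082834/hive/

# ask the Oozie server to reload the share lib so new workflow launches see the jar
oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate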
My XMLs are:
1) observationInputFeed.xml
<feed xmlns='uri:falcon:feed:0.1' name='observationInputFeed' description='This is observation table'>
    <tags>table=observation</tags>
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name='hiveCluster' type='source'>
            <validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
            <retention limit='days(1)' action='delete'/>
            <table uri='catalog:falconexample:observation1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
        </cluster>
    </clusters>
    <table uri='catalog:falconexample:observation1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
    <ACL owner='ambari-qa' group='users' permission='0755'/>
    <schema location='hcat' provider='hcat'/>
</feed>
______________________________________________
2) patientInputFeed.xml
<feed xmlns='uri:falcon:feed:0.1' name='patientInputFeed' description='This is Patient table'>
    <tags>table=patient</tags>
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name='hiveCluster' type='source'>
            <validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
            <retention limit='days(1)' action='delete'/>
            <table uri='catalog:falconexample:patient1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
        </cluster>
    </clusters>
    <table uri='catalog:falconexample:patient1#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
    <ACL owner='ambari-qa' group='users' permission='0755'/>
    <schema location='hcat' provider='hcat'/>
</feed>
____________________________________________
3) patientprocessedOutputFeed.xml
<feed xmlns='uri:falcon:feed:0.1' name='patientprocessedOutputFeed' description='This is patientprocessed table'>
    <tags>table=observationInputFeed</tags>
    <frequency>hours(1)</frequency>
    <timezone>UTC</timezone>
    <clusters>
        <cluster name='hiveCluster' type='source'>
            <validity start='2016-05-31T09:00Z' end='2016-06-06T09:00Z'/>
            <retention limit='days(1)' action='delete'/>
            <table uri='catalog:falconexample:Patient_proce#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
        </cluster>
    </clusters>
    <table uri='catalog:falconexample:Patient_proce#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}'/>
    <ACL owner='ambari-qa' group='users' permission='0755'/>
    <schema location='hcat' provider='hcat'/>
</feed>
_____________________________________________
4) The 4th feed XML is also very similar to the ones above, except for the table name.
______________________________________________
Please help me, guys.
I don't understand why Oozie is throwing java.lang.ClassNotFoundException even though the jar is present at the proper location.
Created 05-31-2016 05:52 PM
Try restarting Hive to pick up the jar.
Created 06-01-2016 02:30 AM
Thanks cnormile,
Could you please tell me, generally, at which location we should put that SerDe jar so that Falcon/Oozie will also pick it up while running the data pipeline?
As I already mentioned, the jar is present at both locations (i.e., in /user/oozie/share/lib/hive and /usr/hdp/hdp-<version>/hive/lib), and I have already restarted both Hive and Oozie, but it still gives the same error.
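For reference, one thing worth checking is which share lib directory the Oozie server has actually loaded, since there can be several lib_<timestamp> directories under the share lib root and only the one currently in use is served to workflows. A rough sketch of how that can be verified (the Oozie URL is a placeholder):

# list the lib_<timestamp> directories that exist under the Oozie share lib root
hdfs dfs -ls /user/oozie/share/lib/

# ask the Oozie server which jars it is actually serving for Hive actions
oozie admin -oozie http://<oozie-host>:11000/oozie -shareliblist hive | grep json-serde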
Created 06-01-2016 02:38 AM
The jar file is missing:
0: jdbc:hive2://192.168.56.101:10000> ADD JAR /tmp/hive-json-serde-0.2.jar;
No rows affected (0.231 seconds)
0: jdbc:hive2://192.168.56.101:10000> select * from my_table;
This link might give you more examples.
Created 06-01-2016 10:51 AM
Thank you Sri,
I followed the link which you sent me, but I am still getting the same error:
1) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.openx.data.jsonserde.JsonSerDe not found.
Actually, the query which I am running using Falcon is:
INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive})
select p.id, p.gender, p.Age, p.birthdate,
       o.component[1].valuequantity.value, o.component[1].valuequantity.unit
from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age
      FROM falconexample.patient1) p
inner join falconexample.DiagnosticReport1 d on p.id = substr(d.subject.reference, 9)
inner join falconexample.Observation1 o on p.id = substr(o.subject.reference, 9)
where p.Age > 17 and p.Age < 86 and o.component[1].valuequantity.value < 140;
If I write the statement in the Hive script as "ADD JAR /user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;" then I get an error like:
2) java.lang.IllegalArgumentException: /user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar does not exist.
and if I remove that statement, then I get the 1st error.
I have tried the following solutions to resolve this issue:
1) Added the SerDe jar to the Oozie share lib folder (HDFS location).
2) Added the SerDe jar to the Hive lib (local FS).
3) Added the SerDe jar to the "falcon.libpath" location (i.e., /apps/falcon/pigCluster/working/lib) (HDFS location).
4) Added the jar at /apps/falcon/pigCluster/staging/falcon/workflows/process/patientDataProcess/6b7edfbfe5bcfc50e0fc845f71cd9122_1464767974691/DEFAULT/lib (HDFS location).
I have put the jar everywhere, but I am still getting the 1st error. I don't know what is happening here.
Why is Falcon not picking up that jar?
I am posting the log file which I found under the Falcon working directory: user-action-hive-failed.txt
Created 06-01-2016 12:56 PM
Hello Guys,
The error has been solved. I solved it by adding the following additional statement in the Hive script, along with the above query.
Statement:
add jar hdfs://<hostname>:8020//user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;
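This works because ADD JAR with a bare path is resolved against the local filesystem of the node executing the Hive action (which is why error 2) above reported "does not exist"), while a fully-qualified hdfs:// URI points at the jar in HDFS, which every node can reach. Roughly, the complete Hive script for the process then looks like the sketch below, written out here via a shell heredoc; the script file name is an assumption, and the jar path and query are the ones from this thread:

# patientProcess.hql (name assumed): the fully-qualified ADD JAR line must come
# before the query so the SerDe class is on the classpath when the query runs
cat > patientProcess.hql <<'EOF'
add jar hdfs://<hostname>:8020/user/oozie/share/lib/lib_20160503082834/hive/json-serde-1.3.8-SNAPSHOT-jar-with-dependencies.jar;

INSERT OVERWRITE TABLE falconexample.Patient_proce PARTITION (${falcon_output_partitions_hive})
select p.id, p.gender, p.Age, p.birthdate,
       o.component[1].valuequantity.value, o.component[1].valuequantity.unit
from (select *, floor(datediff(to_date(from_unixtime(unix_timestamp())), to_date(birthdate)) / 365.25) as Age
      FROM falconexample.patient1) p
inner join falconexample.DiagnosticReport1 d on p.id = substr(d.subject.reference, 9)
inner join falconexample.Observation1 o on p.id = substr(o.subject.reference, 9)
where p.Age > 17 and p.Age < 86 and o.component[1].valuequantity.value < 140;
EOF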