Transfer json data from external table to internal table

I'm trying to fetch data from an external table and insert it into an internal table. The external table points to a folder containing JSON files. I have written the query in a file and used it in a workflow. Both tables have the same columns and data types. When I run the query in the query editor, it works, but when I run the same query from a workflow, it fails with the error below.

I have set some workflow properties, as shown in the image below.

11406-workflow-properties.png

I have also added the following jar files to the '/user/oozie/share/lib/lib_20160808184114/hive' path:

json-serde-1.3.6-jar-with-dependencies.jar, mysql-connector-java.jar, hive-serde-1.2.1000.2.4.2.0-258.jar, hive-hcatalog-core-0.13.1.jar

I am running Hortonworks 2.4.2, Oozie 4.2.0, Hive-HCatalog 1.2.1000, and Hue 2.6.1-258.

Error:

INFO [ATS Logger 0] impl.TimelineClientImpl (TimelineClientImpl.java:logException(273)) - Exception caught by TimelineClientConnectionRetry, will try 29 more time(s). Message: java.net.ConnectException: Connection refused
2017-01-13 13:45:08,209 INFO [ATS Logger 0] impl.TimelineClientImpl (TimelineClientImpl.java:logException(273)) - Exception caught by TimelineClientConnectionRetry, will try 4 more time(s). Message: java.net.ConnectException: Connection refused
2017-01-13 13:45:09,215 INFO [ATS Logger 0] impl.TimelineClientImpl (TimelineClientImpl.java:logException(273)) - Exception caught by TimelineClientConnectionRetry, will try 3 more time(s). Message: java.net.ConnectException: Connection refused
2017-01-13 13:45:09,618 INFO [main] SessionState (SessionState.java:printInfo(953)) - Map 1: 0(+0,-10)/3 Reducer 2: 0/1 Reducer 3: 0/1
2017-01-13 13:45:09,826 ERROR [main] SessionState (SessionState.java:printError(962)) - Status: Failed
2017-01-13 13:45:09,828 ERROR [main] SessionState (SessionState.java:printError(962)) - Vertex failed, vertexName=Map 1, vertexId=vertex_1483604431963_0816_1_00, diagnostics=[Task failed, taskId=task_1483604431963_0816_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:344)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:181)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:265)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:347)
    at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:382)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:227)
    ... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
    at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:313)
    ... 17 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed

3 Replies

Re: Transfer json data from external table to internal table

Mentor

Your hive-hcatalog-core jar is for Hive 0.13, whereas everything else is Hive 1.2.1. Make sure you're not mixing libraries. Also, I would set the following in your job.properties:

oozie.action.sharelib.for.hive=hcatalog,hive
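
One quick way to spot mixed library versions can be sketched in shell. The helper `find_mismatched_hive_jars` below is hypothetical (it is not part of Hive or Oozie): it takes an expected version prefix and a list of jar file names, and prints every jar starting with `hive-` whose version does not begin with that prefix. It assumes the `hive-<component>-<version>.jar` naming seen in the jar list from the question.

```shell
# Hypothetical helper, not part of Hive or Oozie.
# Usage: find_mismatched_hive_jars <expected-version-prefix> <jar-name>...
find_mismatched_hive_jars() {
  expected="$1"
  shift
  for jar in "$@"; do
    case "$jar" in
      hive-*-"$expected"*.jar) ;;        # Hive jar whose version starts with the prefix
      hive-*) printf '%s\n' "$jar" ;;    # Hive jar with some other version: report it
    esac                                 # non-Hive jars are ignored
  done
}

# Jar names taken from the question; 1.2.1 is the Hive version mentioned in the reply.
find_mismatched_hive_jars 1.2.1 \
  json-serde-1.3.6-jar-with-dependencies.jar \
  mysql-connector-java.jar \
  hive-serde-1.2.1000.2.4.2.0-258.jar \
  hive-hcatalog-core-0.13.1.jar
# prints: hive-hcatalog-core-0.13.1.jar
```

The check only looks at file names, so it is a first pass; it does not confirm which classes a jar actually contains.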

Re: Transfer json data from external table to internal table

Hi Artem, thank you for your reply.

As you suggested, I replaced hive-hcatalog-core with the required version and changed the job property you mentioned, but I still get errors.

When I replaced hive-hcatalog-core 0.13.jar with hive-hcatalog-core1.2.1.jar, it gave an error saying hive-hcatalog-core 0.13.jar was not found. Why is it asking for a file that no longer exists? Does it cache the jar files required to perform a workflow action?

Also, why do I get the error below?

org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found.

Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:2, Vertex vertex_1484314765213_0102_1_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]


Re: Transfer json data from external table to internal table

Mentor

You need to update your sharelib so that Oozie picks up the latest jar.
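
For reference, a sharelib refresh is typically done with the Oozie admin CLI. The sketch below wraps the two commands in a hypothetical function, `refresh_hive_sharelib`; the default server URL is an assumption and should be replaced with your own Oozie URL. `-sharelibupdate` asks the server to reload its sharelib listing, and `-shareliblist hive` prints the jars the server now sees for the hive sharelib, so you can check that the old hive-hcatalog-core jar is no longer listed.

```shell
# Assumed Oozie server URL; override by exporting OOZIE_URL before calling.
OOZIE_URL="${OOZIE_URL:-http://localhost:11000/oozie}"

# Hypothetical wrapper around the Oozie admin CLI.
refresh_hive_sharelib() {
  # Ask the Oozie server to reload the sharelib directory listing from HDFS.
  oozie admin -oozie "$OOZIE_URL" -sharelibupdate &&
  # List the jars the server now associates with the hive sharelib.
  oozie admin -oozie "$OOZIE_URL" -shareliblist hive
}
```

Run `refresh_hive_sharelib` after changing the jars under the sharelib path, then resubmit the workflow.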
