Created on 06-26-2017 08:22 PM
Issue:
If a 'select count(*)' or 'analyze table compute statistics' on a regex table fails with the error below, the table was created with the old SerDe class 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'. Prior to Hive 0.10, 'RegexSerDe' was part of the 'hive-contrib' library. From Hive 0.10 onwards, the SerDe is part of 'hive-serde-<version>.jar'.
ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1496529203645_0883_2_00, diagnostics=[Task failed, taskId=task_1496529203645_0883_2_00_000010, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:350)
at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:385)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:224)
... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114)
at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:316)
... 17 more
Resolution:
Create the table with the newer SerDe class 'org.apache.hadoop.hive.serde2.RegexSerDe' instead of 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'.
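As a rough illustration of what such a table definition might look like, here is a minimal sketch. The table name, column names, and the regex are hypothetical and are not taken from the original table; only the SerDe class name comes from this article. The regex is chosen so that it has one capture group per column.

```sql
-- Hypothetical table: names and regex are placeholders for illustration only.
CREATE TABLE regex_log_sketch (
  host    STRING,
  user_id STRING,
  message STRING
)
-- Use the SerDe class shipped in hive-serde, not the hive-contrib one.
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  -- One capture group per column, in column order.
  "input.regex" = "([^ ]*) ([^ ]*) (.*)"
)
STORED AS TEXTFILE;
```

All columns are declared as STRING here to keep the sketch conservative; adjust the column list and regex to match your own data.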
Alternatively, for an existing table, run ALTER TABLE to change the SerDe, as below:
ALTER TABLE <TABLENAME> SET SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe';
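To check that the change took effect, one option is to inspect the table metadata and then re-run the query that failed. This is only a sketch: regex_log_sketch is a hypothetical table name standing in for your own table.

```sql
-- Hypothetical table name; substitute your own.
-- In the output, the "SerDe Library:" row should now show
-- org.apache.hadoop.hive.serde2.RegexSerDe.
DESCRIBE FORMATTED regex_log_sketch;

-- Re-run the query that previously failed with ClassNotFoundException.
SELECT COUNT(*) FROM regex_log_sketch;
```

If the table is partitioned, existing partitions may still carry the old SerDe class in their own metadata. In that case it may be necessary to repeat the ALTER TABLE statement with a PARTITION clause for each affected partition.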
Hope you like the article
Created on 06-27-2017 09:10 AM
Hi, Thanks.
Created on 06-28-2017 09:39 AM
@knarendran Thank you so much, this works!