Created on 06-26-2017 08:22 PM
Issue:
Whilst running a 'select count(*)' or 'analyze table compute statistics' on a regex table & if you see the below error, it indicates you have created the table with old serde. Prior to Hive 0.10, 'RegexSerDe' is part of 'hive-contrib' library. From Hive 0.10 onwards the serde is part of 'hive-serde-<version>.jar'
ERROR : Status: Failed ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1496529203645_0883_2_00, diagnostics=[Task failed, taskId=task_1496529203645_0883_2_00_000010, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Map operator initialization failed at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149) ... 14 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:350) at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:385) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:224) ... 15 more Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114) at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143) at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:316) ... 17 more
Resolution:
Create the table with latest serde 'org.apache.hadoop.hive.serde2.RegexSerDe' instead of 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
You can also run alter table to modify the serde as below
ALTER TABLE <TABLENAME> SET SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe';
Hope you like the article
Created on 06-27-2017 09:10 AM
Hi, Thanks.
Created on 06-28-2017 09:39 AM
@knarendran Thank you so much, this works!