Community Articles

Find and share helpful community-sourced technical articles.
Announcements
Celebrating as our community reaches 100,000 members! Thank you!
Labels (1)
avatar

Issue:

Whilst running a 'select count(*)' or 'analyze table compute statistics' on a regex table & if you see the below error, it indicates you have created the table with old serde. Prior to Hive 0.10, 'RegexSerDe' is part of 'hive-contrib' library. From Hive 0.10 onwards the serde is part of 'hive-serde-<version>.jar'

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1496529203645_0883_2_00, diagnostics=[Task failed, taskId=task_1496529203645_0883_2_00_000010, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
        at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
        at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
        at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
        at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
        ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
        at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:350)
        at org.apache.hadoop.hive.ql.exec.MapOperator.setChildren(MapOperator.java:385)
        at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:224)
        ... 15 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2114)
        at org.apache.hadoop.hive.ql.plan.PartitionDesc.getDeserializer(PartitionDesc.java:143)
        at org.apache.hadoop.hive.ql.exec.MapOperator.getConvertedOI(MapOperator.java:316)
        ... 17 more

Resolution:

Create the table with latest serde 'org.apache.hadoop.hive.serde2.RegexSerDe' instead of 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'

You can also run alter table to modify the serde as below

ALTER TABLE <TABLENAME> SET SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe';

Hope you like the article

8,227 Views
Comments

Hi, Thanks.

@knarendran Thank you so much, this works!