Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

INSERT INTO fails after moving to DbTxnManager (CDH 5.12.0)

Highlighted

INSERT INTO fails after moving to DbTxnManager (CDH 5.12.0)

New Contributor

Hello.

 

After changing our hive-site.xml to allow ACID operations in ORC tables, we are running into an issue that I can't find documented anywhere.

 

If we run an INSERT INTO query with certain columns, we get this error while compiling the statement:

 

2017-08-24 11:09:54,032 WARN  org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-1085]: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:187)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:271)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:337)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:439)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:416)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:282)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:501)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlanForSortingBucketing(SemanticAnalyzer.java:7203)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6165)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6304)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9093)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8984)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9851)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9744)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10217)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10228)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10108)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:223)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:558)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1356)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1343)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:185)
        ... 15 more

 

But INSERT INTO with all the columns, works as expected.

 

Here is a sample:

 

 

CREATE TABLE h1_test2(id int, id2 string) 
    CLUSTERED by (id) into 8 buckets 
    STORED AS orc TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE h1_test2 VALUES(1,'abc'),(2,'def'),(3,'ghi');

-- Works:
INSERT INTO TABLE h1_test2 (id, id2) VALUES(8, 'basfd');

-- Fails:
INSERT INTO TABLE h1_test2 (id) VALUES(9);

These are the new properties added to Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml:

 

<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
</property>

Do you know where this could be coming from?

 

Thanks! :)