Reply
Highlighted
New Contributor
Posts: 5
Registered: ‎08-11-2017

INSERT INTO fails after moving to DbTxnManager (CDH 5.12.0)

[ Edited ]

Hello.

 

After changing our hive-site.xml to allow ACID operations in ORC tables, we are running into an issue that I can't find documented anywhere.

 

If we run an INSERT INTO query with certain columns, we get this error while compiling the statement:

 

2017-08-24 11:09:54,032 WARN  org.apache.hive.service.cli.thrift.ThriftCLIService: [HiveServer2-Handler-Pool: Thread-1085]: Error executing statement:
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: NullPointerException null
        at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:187)
        at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:271)
        at org.apache.hive.service.cli.operation.Operation.run(Operation.java:337)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:439)
        at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:416)
        at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:282)
        at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:501)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
        at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:746)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genReduceSinkPlanForSortingBucketing(SemanticAnalyzer.java:7203)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBucketingSortingDest(SemanticAnalyzer.java:6165)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFileSinkPlan(SemanticAnalyzer.java:6304)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPostGroupByBodyPlan(SemanticAnalyzer.java:9093)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:8984)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9851)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:9744)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genOPTree(SemanticAnalyzer.java:10217)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10228)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10108)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:223)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:558)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1356)
        at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1343)
        at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:185)
        ... 15 more

 

But INSERT INTO with all the columns, works as expected.

 

Here is a sample:

 

 

CREATE TABLE h1_test2(id int, id2 string) 
    CLUSTERED by (id) into 8 buckets 
    STORED AS orc TBLPROPERTIES ('transactional'='true');

INSERT INTO TABLE h1_test2 VALUES(1,'abc'),(2,'def'),(3,'ghi');

-- Works:
INSERT INTO TABLE h1_test2 (id, id2) VALUES(8, 'basfd');

-- Fails:
INSERT INTO TABLE h1_test2 (id) VALUES(9);

These are the new properties added to Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml:

 

<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>2</value>
</property>

Do you know where this could be coming from?

 

Thanks! :)