Created 05-22-2018 12:26 PM
I created the table below, then ingested data partitioned by rk and dt. After two separate "insert into" statements, running a select count(*) keeps failing with the following error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables"
What does this error mean, and how do I work around it?
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
CREATE TABLE `table1_orc`
(
`uuid` string,
token string,
ip_address string,
`raw_event` string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true');
Created 05-23-2018 05:13 PM
Finally figured it out. I needed to set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
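For anyone landing here, the session settings discussed in this thread can be sketched together as follows. This is only a minimal sketch of what worked in this case, assuming the query runs on Tez and the table is the table1_orc from the question; whether all three settings are needed on your cluster is an assumption, not something verified beyond this thread.

```sql
-- Settings mentioned in this thread; adjust to your own cluster.
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
-- When running on Tez, the input format is read from a separate property.
set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

-- Retry the query that previously failed with Error 30021.
select count(*) from table1_orc;
```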
Created 05-22-2018 06:11 PM
Perhaps:
"Reading/writing to an ACID table from a non-ACID session is not allowed. In other words, the Hive transaction manager must be set to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager in order to work with ACID tables."
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Limitations
Created 05-22-2018 07:59 PM
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager is indeed set in the configuration. I ran the select count(*) from the same session in which the "insert into" was done, and the "insert into" itself worked fine.
I also created a similar ACID table without partitions, and the query runs fine against it. So the problem seems to be related to partitioning.
Created 05-23-2018 01:06 PM
--The first error is: "Caused by: java.io.IOException: [Error 30022]: Must use HiveInputFormat to read ACID tables (set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat)".
--After I set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat, I got a second error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables".
--Setting hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager makes no difference.
--I tried a major compaction on all partitions of the table and still get the same error. Any suggestions for working around the error are appreciated.
Here is the table definition:
CREATE TABLE `raw_orc1`
(
`uuid` string,
token string,
ip string,
event string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true'
);
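For reference, the major compaction mentioned above would look roughly like the statements below. This is a sketch only: the partition values 'example_rk' and '2018-05-22' are placeholders, not values from the original post, and on a partitioned table the statement has to be repeated for each partition.

```sql
-- Placeholder partition values; substitute real rk/dt values.
ALTER TABLE raw_orc1 PARTITION (rk='example_rk', dt='2018-05-22') COMPACT 'major';

-- Compaction is queued and runs in the background; this lists its state.
SHOW COMPACTIONS;
```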
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 26 more
Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.raiseAcidTablesMustBeReadWithAcidReaderException(OrcInputFormat.java:265)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:70)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:177)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1309)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1322)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 31 more
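Note that the trace above goes through CombineHiveInputFormat and CombineHiveRecordReader inside a Tez task, which suggests the task was not actually using HiveInputFormat even though hive.input.format had been set. One way to check, sketched here under the assumption that you are in the same Hive session, is to print the effective values: in Hive, "set" followed by a property name and no value shows the current setting.

```sql
-- Print the current value of each property (no value is assigned here).
set hive.execution.engine;
set hive.input.format;
set hive.tez.input.format;
```

If the engine is tez and hive.tez.input.format shows a combine input format, that would be consistent with the error in the trace.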