Member since: 11-08-2016
Posts: 19
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2940 | 05-23-2018 05:13 PM
05-23-2018
05:13 PM
Finally figured it out. You need to set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
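For reference, a minimal sketch of the working session, combining the settings tried in this thread (the table name raw_orc1 and the count query come from the posts below):

-- hive.tez.input.format is the setting that finally fixed the read
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- the query that previously failed with [Error 30021]
select count(*) from raw_orc1;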
05-23-2018
01:06 PM
- The 1st error is: "Caused by: java.io.IOException: [Error 30022]: Must use HiveInputFormat to read ACID tables (set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat)".
- After I set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat, I got a second error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables".
- Setting hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager made no difference.
- I tried a major compaction of all the table's partitions and still get the same error.

Any suggestion for working around the error is appreciated. Here is the table definition:

CREATE TABLE `raw_orc1`
(
`uuid` string,
token string,
ip string,
event string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true'
);

The stack trace:

at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 26 more
Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.raiseAcidTablesMustBeReadWithAcidReaderException(OrcInputFormat.java:265)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:70)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:177)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1309)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1322)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 31 more
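For reference, a sketch of the major compaction attempt mentioned above; on a partitioned table the compaction is issued per partition, and the rk/dt values here are hypothetical placeholders:

-- one statement per partition
alter table raw_orc1 partition (rk='r1', dt='2018-05-01') compact 'major';
-- check whether the compaction has finished
show compactions;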
05-22-2018
07:59 PM
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager is indeed set in the configuration. I ran the select count(*) from the same session in which the "insert into" was done, and the "insert into" itself was fine. I also created a similar ACID table but without partitions, and the query runs fine on it. So it seems to be something related to partitioning.
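A sketch of the unpartitioned control table described above (the name table1_orc_nopart is hypothetical; the columns mirror the partitioned table from the post below):

CREATE TABLE table1_orc_nopart
(
`uuid` string,
token string,
ip_address string,
raw_event string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true'
);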
05-22-2018
12:26 PM
I created a table as below, then ingested data by rk and dt. After two separate "insert into" statements, when I run a select count(*) I keep getting the following error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables". What does this error mean, and how do I work around it?

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE `table1_orc`
(
`uuid` string,
token string,
ip_address string,
`raw_event` string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true');
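A sketch of the ingest and query pattern described above; the staging table name and its columns are hypothetical placeholders:

-- dynamic-partition insert, run for two separate batches
insert into table1_orc partition (rk, dt)
select `uuid`, token, ip_address, raw_event, rk, dt
from staging_events;

-- the query that then fails with [Error 30021]
select count(*) from table1_orc;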
Labels:
- Apache Hive
11-16-2016
01:26 PM
Ended up modifying SplitJson.java to include the original content, as below: {"RESULT":[{"SPLIT":{ }, "ORIGINAL":{ }}]}
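Applied to the sample JSON from the 11-08-2016 post below, each split flowfile would then look roughly like this (an illustration of the modified output shape; exactly which fields land under ORIGINAL depends on the code change):

{"RESULT":[{"SPLIT":{"x":1,"y":"0.1"},"ORIGINAL":{"p":{"key":"k1","theme":"default"},"version":"1.1.0"}}]}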
11-14-2016
09:24 PM
How do I join the split flow stream with the Original flow, then? As I mentioned, I need to be able to get the upper-level attributes (contained in the original JSON) of the split node from the Split flow.
11-14-2016
02:00 PM
In the SplitJson processor, is there any way to pass the Original flow to the Split flow as an attribute, or to reference the original in an expression in the Split flow? In my case, the JSON node to split is not at the root, but I need the root attributes carried over to the Split flow. Thanks.
Labels:
- Apache NiFi
11-08-2016
08:13 PM
2 Kudos
In the following JSON,
{
"p":{
"key":"k1",
"theme":"default"
},
"version":"1.1.0",
"s":[
{
"x":1,
"y":"0.1"
},
{
"x":2,
"y":"0.2"
}
]
}

I want to split the "s" array node, but I'd like to include the root-level attributes p.key and p.theme in each split "s" flowfile, as below. How do I do this in NiFi?

k1, default, 1, 0.1
k1, default, 2, 0.2

Thanks
Labels:
- Apache NiFi