Member since: 11-08-2016
Posts: 19
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2940 | 05-23-2018 05:13 PM
05-23-2018
05:13 PM
Finally figured it out. You need to set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
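For reference, a minimal sketch of the working session, combining the settings tried in this thread (the table name raw_orc1 and the count query come from the posts below):

-- hive.tez.input.format is the setting that finally fixed the read
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- the query that previously failed with [Error 30021]
select count(*) from raw_orc1;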
05-23-2018
01:06 PM
- The 1st error is: "Caused by: java.io.IOException: [Error 30022]: Must use HiveInputFormat to read ACID tables (set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat)".
- After I set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat, I got a second error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables".
- Setting hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager made no difference.
- I tried a major compaction of all the table's partitions and still get the same error.

Any suggestion for working around the error is appreciated. Here is the table definition:

CREATE TABLE `raw_orc1`
(
`uuid` string,
token string,
ip string,
event string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true'
);

The stack trace:

at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:149)
at org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:80)
at org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:674)
at org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:633)
at org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:145)
at org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:109)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:405)
at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:124)
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 26 more
Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.raiseAcidTablesMustBeReadWithAcidReaderException(OrcInputFormat.java:265)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:70)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:177)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1309)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1322)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 31 more
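For reference, a sketch of the major compaction attempt mentioned above; on a partitioned table the compaction is issued per partition, and the rk/dt values here are hypothetical placeholders:

-- one statement per partition
alter table raw_orc1 partition (rk='r1', dt='2018-05-01') compact 'major';
-- check whether the compaction has finished
show compactions;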
05-22-2018
07:59 PM
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager is indeed set in the configuration. I ran the select count(*) from the same session in which the "insert into" was done, and the "insert into" itself was fine. I also created a similar ACID table but without partitions, and the query runs fine on it. So it seems to be something related to partitioning.
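A sketch of the unpartitioned control table described above (the name table1_orc_nopart is hypothetical; the columns mirror the partitioned table from the post below):

CREATE TABLE table1_orc_nopart
(
`uuid` string,
token string,
ip_address string,
raw_event string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true'
);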
05-22-2018
12:26 PM
I created a table as below, then ingested data by rk and dt. After two separate "insert into" statements, when I run a select count(*) I keep getting the following error: "Caused by: java.io.IOException: [Error 30021]: An ORC ACID reader required to read ACID tables". What does this error mean, and how do I work around it?

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

CREATE TABLE `table1_orc`
(
`uuid` string,
token string,
ip_address string,
`raw_event` string
)
PARTITIONED BY
(
`rk` string,
`dt` string
)
CLUSTERED BY (token) INTO 10 BUCKETS
STORED AS ORC
TBLPROPERTIES (
'transactional'='true');
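A sketch of the ingest and query pattern described above; the staging table name and its columns are hypothetical placeholders:

-- dynamic-partition insert, run for two separate batches
insert into table1_orc partition (rk, dt)
select `uuid`, token, ip_address, raw_event, rk, dt
from staging_events;

-- the query that then fails with [Error 30021]
select count(*) from table1_orc;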
Labels:
- Apache Hive
11-16-2016
01:26 PM
Ended up modifying SplitJson.java to include the original content, as below: {"RESULT":[{"SPLIT":{ }, "ORIGINAL":{ }}]}
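Applied to the sample JSON from the 11-08-2016 post below, each split flowfile would then look roughly like this (an illustration of the modified output shape; exactly which fields land under ORIGINAL depends on the code change):

{"RESULT":[{"SPLIT":{"x":1,"y":"0.1"},"ORIGINAL":{"p":{"key":"k1","theme":"default"},"version":"1.1.0"}}]}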
11-14-2016
09:24 PM
How do I join the split flow stream with the Original flow, then? As I mentioned, I need to be able to get the upper-level attributes (contained in the original JSON) of the split node from the Split flow.
11-14-2016
02:00 PM
In the SplitJson processor, is there any way to pass the Original flow to the Split flow as an attribute, or to reference the original in an expression in the Split flow? In my case, the JSON node to split is not at the root, but I need the root attributes carried over to the Split flow. Thanks.
Labels:
- Apache NiFi
11-08-2016
08:13 PM
2 Kudos
In the following JSON,
{
"p":{
"key":"k1",
"theme":"default"
},
"version":"1.1.0",
"s":[
{
"x":1,
"y":"0.1"
},
{
"x":2,
"y":"0.2"
}
]
}

I want to split the "s" array node, but I'd like to include the root-level attributes p.key and p.theme in each split "s" flowfile, as below. How do I do this in NiFi?

k1, default, 1, 0.1
k1, default, 2, 0.2

Thanks
Labels:
- Apache NiFi