Member since: 10-12-2020 · Posts: 6 · Kudos Received: 1 · Solutions: 0
04-27-2021 03:52 AM
Hi, I am using Apache NiFi's ConsumeKafka processor to consume IoT data. It works fine when I explicitly specify the topics. However, I am looking for something more dynamic, so that I don't have to add topics manually in NiFi each time we connect a new device. Is it possible to use wildcards in the Kafka topic name field? My topic names look like domain.iot_device_name, so I need something like domain.*
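For reference, the plain Kafka consumer API does support subscribing by a regex pattern, which is essentially the behaviour I am after. A minimal sketch with a raw Java consumer, outside NiFi entirely (the broker address and group id are placeholders):

import java.time.Duration;
import java.util.Properties;
import java.util.regex.Pattern;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class PatternConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "iot-consumers");           // placeholder group
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Subscribe to every topic matching domain.<anything>.
            consumer.subscribe(Pattern.compile("domain\\..*"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s: %s%n", record.topic(), record.value());
                }
            }
        }
    }
}

With a pattern subscription, topics created later that match the regex are picked up automatically on metadata refresh, which is exactly what I would like ConsumeKafka to do.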
Labels:
- Apache Kafka
- Apache NiFi
11-25-2020 11:57 AM
Hi Simon, I am facing the same issue, but my case might be different. Are you using temporary credentials, i.e. assuming a role? If so, you unfortunately can't provide the aws_session_token property in NiFi, and this will throw the error you are facing. There is an open issue here: https://issues.apache.org/jira/browse/NIFI-7900 /Mahmoud
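To illustrate what I mean by temporary credentials: assuming a role via STS hands back three values, and the session token is the one NiFi has no property for. A rough sketch with the AWS SDK for Java v1 (the role ARN and session name below are placeholders):

import com.amazonaws.services.securitytoken.AWSSecurityTokenService;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder;
import com.amazonaws.services.securitytoken.model.AssumeRoleRequest;
import com.amazonaws.services.securitytoken.model.Credentials;

public class AssumeRoleExample {
    public static void main(String[] args) {
        AWSSecurityTokenService sts =
                AWSSecurityTokenServiceClientBuilder.defaultClient();

        AssumeRoleRequest request = new AssumeRoleRequest()
                .withRoleArn("arn:aws:iam::123456789012:role/example-role") // placeholder
                .withRoleSessionName("nifi-session");                       // placeholder

        Credentials creds = sts.assumeRole(request).getCredentials();

        // Temporary credentials come as a triple; the session token below is
        // the piece that the processor's properties cannot carry.
        System.out.println("access key:    " + creds.getAccessKeyId());
        System.out.println("secret key:    " + creds.getSecretAccessKey());
        System.out.println("session token: " + creds.getSessionToken());
    }
}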
10-14-2020 12:23 AM
Hi @balajip, thanks for the reply. Excuse my ignorance, as I am still new to the Cloudera platform. Is there a config that I can set to override the bucket limit, or should I apply that patch?
10-13-2020 12:13 AM · 1 Kudo
Hi! I am running a scheduled job that consists of an insert-select query in Hive 3.0 / HDP 3, as follows:

INSERT INTO TABLE t1 SELECT * FROM t2 WHERE timestamp > "predefined timestamp"

The job was running flawlessly until, all of a sudden, it started failing with the following error:

Caused by: java.lang.IllegalArgumentException: bucketId out of range: 4147
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.io.BucketCodec$2.encode(BucketCodec.java:94)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.io.orc.OrcRecordUpdater.<init>(OrcRecordUpdater.java:271)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat.getRecordUpdater(OrcOutputFormat.java:278)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordUpdater(HiveFileFormatUtils.java:350)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getAcidRecordUpdater(HiveFileFormatUtils.java:336)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
[2020-10-13 06:58:12,214] INFO - at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:741)
[2020-10-13 06:58:12,214] INFO - ... 45 more
[2020-10-13 06:58:12,214] INFO - ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:3493, Vertex vertex_1602198520469_101787_31_02 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1602198520469_101787_31_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:1, Vertex vertex_1602198520469_101787_31_03 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1 (state=08S01,code=2)
I am clueless as to what is causing this, especially since the job hasn't changed. Any idea how I can solve this issue? 😞
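The only lead I have is the range check itself. Below is a toy sketch of what I think the failing check in BucketCodec is doing; the 12-bit / 4095 limit is my guess from the value 4147 in the message, not something I have verified in the Hive source:

public class BucketIdCheck {
    // Assumption: the codec stores the bucket id in a 12-bit field,
    // so anything above 4095 would be rejected.
    static final int MAX_BUCKET_ID = (1 << 12) - 1; // 4095

    static void checkBucketId(int bucketId) {
        if (bucketId < 0 || bucketId > MAX_BUCKET_ID) {
            throw new IllegalArgumentException("bucketId out of range: " + bucketId);
        }
    }

    public static void main(String[] args) {
        checkBucketId(4147); // reproduces the message in the log above
    }
}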
Labels:
- Apache Hive