​How to scale SplitJson queues?

New Contributor

[Image: 80478-nifi-test-template-scale.png]

I have a flow that captures data from a DBMS (SGDB) => converts it to AvroJSON => SplitJson => publishes to Google Pub/Sub.

But the queue is backing up, and I would like to scale the queues during the split (by using 3 processors) and then publish to Google. Is that possible?

1 ACCEPTED SOLUTION

Re: ​How to scale SplitJson queues?

Super Guru
@Bruno Gomes de Souza

Use the record-oriented processors to split the JSON array.

Try the approach below:

[Image: 79457-flow.png]

Feed the success relationship into a SplitRecord processor, then configure a Record Reader controller service to read the flowfile contents and set the Record Writer to JsonRecordSetWriter.

Set the Records Per Split property to 1 and route only the splits relationship from the SplitRecord processor to the PublishGCPubSub processor.
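To make the effect of Records Per Split concrete, here is a rough Python sketch of the idea only. It is not NiFi code; `split_records` and its parameters are hypothetical names for illustration, and the exact shape of each output flowfile depends on how the Record Writer is configured.

```python
import json


def split_records(payload: str, records_per_split: int = 1) -> list[str]:
    """Illustrative only: break a JSON array into smaller JSON arrays.

    Each returned string stands in for the content of one flowfile on the
    "splits" relationship. Assumes the payload is a JSON array (a single
    object is treated as a one-record array).
    """
    if records_per_split < 1:
        raise ValueError("records_per_split must be at least 1")

    records = json.loads(payload)
    if not isinstance(records, list):
        records = [records]

    return [
        json.dumps(records[i:i + records_per_split])
        for i in range(0, len(records), records_per_split)
    ]


if __name__ == "__main__":
    incoming = '[{"id": 1}, {"id": 2}, {"id": 3}]'
    for split in split_records(incoming, records_per_split=1):
        print(split)  # one record per simulated flowfile
```

With a value of 1, every record ends up in its own flowfile, so each Pub/Sub publish carries a single message.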

If you run into OOM (out-of-memory) issues, it is better to use a series of SplitRecord processors: split into larger batches first, then bring Records Per Split down to 1 so that each flowfile carries a single message.
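The reasoning behind chaining splits can be sketched in plain Python. This is a simplified model, not NiFi code: `chunk`, `two_stage_split`, and the batch size of 100 are assumptions chosen for illustration. The point is that no single split step has to create all of the output flowfiles at once.

```python
def chunk(records: list, size: int) -> list[list]:
    """Cut a list into consecutive pieces of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [records[i:i + size] for i in range(0, len(records), size)]


def two_stage_split(records: list, first_stage_size: int) -> tuple[list[list], int]:
    """Split in two steps and report the largest fan-out of any single step.

    Stage one produces batches of `first_stage_size` records; stage two
    splits each batch into single-record pieces.
    """
    stage_one = chunk(records, first_stage_size)
    largest_fan_out = len(stage_one)

    singles: list[list] = []
    for batch in stage_one:
        stage_two = chunk(batch, 1)
        largest_fan_out = max(largest_fan_out, len(stage_two))
        singles.extend(stage_two)

    return singles, largest_fan_out


if __name__ == "__main__":
    records = list(range(10_000))
    singles, fan_out = two_stage_split(records, first_stage_size=100)
    # A single-step split would emit len(records) pieces in one go;
    # here no step emits more than `fan_out` pieces.
    print(len(singles), fan_out)
```

In this model a direct split of 10,000 records creates 10,000 pieces in one step, while the two-stage version never creates more than 100 at a time.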

Refer to this and this link for how to use a series of split processors.

Refer to this link for how to configure the Record Reader/Writer controller services.

Re: ​How to scale SplitJson queues?

New Contributor

Thanks very much @Shu
