Member since: 12-11-2017
Posts: 21
Kudos Received: 3
Solutions: 2

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 168 | 12-21-2020 02:44 AM
 | 336 | 07-10-2020 08:42 AM
12-21-2020
02:44 AM
Hello @Lokeswar The QueryRecord processor does exactly what you want, but you need to run your job in a record-oriented way, using a JSON reader and a JSON writer. You then don't work with flowfile attributes, but directly with your flowfile content. So, starting from your CSV: - use a ConvertRecord processor to turn it into a record flowfile, with CSVReader as the reader and JsonRecordSetWriter as the writer, since you want JSON - add a QueryRecord processor, with a query that looks like this: SELECT * FROM FLOWFILE WHERE resourceState='INSTALLED' OR resourceState='RETIRED' Warning: don't use double quotes around values, just single quotes. After this, QueryRecord exposes an output relationship for the query that you can plug into your next processor.
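The SQL predicate above can be sketched in plain Python to show which records QueryRecord keeps; the sample records and the "name" field are invented for illustration:

```python
import json

# Hypothetical records, standing in for the JSON produced by ConvertRecord.
records = [
    {"name": "app1", "resourceState": "INSTALLED"},
    {"name": "app2", "resourceState": "DEPLOYED"},
    {"name": "app3", "resourceState": "RETIRED"},
]

# Equivalent of:
#   SELECT * FROM FLOWFILE
#   WHERE resourceState='INSTALLED' OR resourceState='RETIRED'
matched = [r for r in records if r["resourceState"] in ("INSTALLED", "RETIRED")]

print(json.dumps(matched))
```

Each record for which the predicate holds ends up in the flowfile routed to that query's relationship.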
12-17-2020
11:06 PM
1 Kudo
Hello @spa I've been looking for this too, but it doesn't exist. Instead, you can use a script (Python, Groovy, ...). If you hit performance issues with a scripted processor, you can improve the situation using the trick described here: InvokeScriptedProcessor template (a faster ExecuteScript)
12-17-2020
01:24 AM
Hi @justenji Same for me, I've tried to use the Avro schema generator, including NiFi's schema inference, but no luck.
12-16-2020
11:36 PM
Hello @Anurag007 Your description is a little bit "dry". Anyway, you can probably do what you want with the following processors: - GetFile (or better, ListFile + FetchFile) to get the content of your files - RouteOnContent, which allows you to define routing rules based on file content using regular expressions You will easily find many examples of how to use these processors, for instance via the search feature of this site.
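The regex-based routing that RouteOnContent performs can be illustrated with a small Python sketch; the rule names and patterns below are hypothetical:

```python
import re

# Hypothetical routing rules: relationship name -> regex tested against content.
rules = {
    "errors": re.compile(r"ERROR|FATAL"),
    "warnings": re.compile(r"WARN"),
}

def route(content):
    """Return the names of all rules whose regex matches the content."""
    return [name for name, rx in rules.items() if rx.search(content)]

print(route("2020-12-16 ERROR something broke"))  # -> ['errors']
print(route("all good here"))                     # -> []
```

In RouteOnContent, each dynamic property you add plays the role of such a rule, and matching flowfiles are routed to a relationship of the same name.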
12-16-2020
03:45 AM
Hello @justenji Thanks a lot for the time you've spent on my issue, I really appreciate it. Yes, as I mentioned at the beginning of my post, it works with a basic JoltTransformJSON on a single JSON entry, and this is what I'm doing now: splitting my records and then using this processor. But I want to keep the record-oriented approach, which is really more efficient performance-wise. I wanted to test some different things regarding the schema, as suggested by @TimothySpann . I guess we need to tell Jolt that the output will be an array of records. I've made various attempts with an Avro schema but no luck. Actually, I've even tried to use schema inference to create a schema, but the AvroSchemaRegistry doesn't want to take it, and the error message I get is "Not a named type". Here is the basic Avro schema:

{
  "type": "array",
  "namespace": "nothing",
  "items": {
    "type": "record",
    "name": "steps",
    "fields": [
      {
        "name": "index",
        "type": "string",
        "doc": "Type inferred from index"
      }
    ]
  }
}

Do we have an Avro guru around the corner? Thanks Stéphane
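One possible reading of the "Not a named type" error: the schema registry may only accept a named type (a record) at the top level, so a top-level "array" schema is rejected. An untested workaround sketch is to wrap the array inside a named record; note this changes the data shape, since the array then lives under a field (all names here are made up):

```json
{
  "type": "record",
  "name": "stepsWrapper",
  "namespace": "nothing",
  "fields": [
    {
      "name": "steps",
      "type": {
        "type": "array",
        "items": {
          "type": "record",
          "name": "step",
          "fields": [
            { "name": "index", "type": "string" }
          ]
        }
      }
    }
  ]
}
```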
12-15-2020
07:57 AM
Hello @TimothySpann Thanks for your reply. I use a basic JsonTreeReader with no schema, just infer schema.
12-14-2020
11:58 PM
Hello, I'm facing a weird issue with Jolt. I have a flowfile which is record-oriented, one JSON object per line, with the following structure: {"aleas": [{object1}, {object2}, {object3}]} and what I basically want to do is get rid of this "aleas" root key and have something like this: [{object1}, {object2}, {object3}] I've tested this spec on the Jolt demo site:

[
  {
    "operation": "shift",
    "spec": {
      "aleas": {
        "*": []
      }
    }
  }
]

But when I run it on NiFi (latest release) using a JoltTransformRecord processor, I get the following error message:

2020-12-15 07:50:17,415 ERROR [Timer-Driven Process Thread-8] o.a.n.p.jolt.record.JoltTransformRecord JoltTransformRecord[id=654dabc3-0176-1000-0c3a-067d307c6f07] Unable to transform StandardFlowFileRecord[uuid=b818aa99-b538-48bb-942e-c39d70854c53,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1608018617233-9, container=default, section=9], offset=570949, length=1329453],offset=0,name=60dcc444-f06a-4c65-b667-8309583eb782_Feuil1.csv,size=1329453] due to org.apache.nifi.processor.exception.ProcessException: Error transforming the first record: org.apache.nifi.processor.exception.ProcessException: Error transforming the first record
org.apache.nifi.processor.exception.ProcessException: Error transforming the first record
    at org.apache.nifi.processors.jolt.record.JoltTransformRecord.onTrigger(JoltTransformRecord.java:335)
    at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
    at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1174)
    at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:213)
    at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
    at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

I use a basic JsonTreeReader as record reader, with all the options set to default. The funny part is that if I add a SplitRecord processor and process each JSON flowfile using JoltTransformJSON, it works nicely. Nevertheless, I'd like to avoid that solution, which is really bad for performance and breaks my whole "record-oriented" flow. Any idea? Thanks for your support Stéphane
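Per record, the intended shift boils down to lifting the array out from under the root key, which can be sketched in Python (the sample objects are placeholders):

```python
import json

# One JSON object per line, as described above.
line = '{"aleas": [{"id": 1}, {"id": 2}, {"id": 3}]}'

record = json.loads(line)
# Equivalent of the shift spec  "aleas": { "*": [] } : keep the array,
# drop the "aleas" root key.
result = record["aleas"]

print(json.dumps(result))  # -> [{"id": 1}, {"id": 2}, {"id": 3}]
```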
12-14-2020
07:23 AM
Hello @justenji Thanks for the detailed explanation. This is also what I have tested on my side, and in my opinion it does the job while staying "record oriented" 🙂 Have a nice day
12-09-2020
02:12 AM
I need to test this, but it actually seems that QueryRecord is exactly what I need. It is possible to build complex conditions using a SQL-like language on the record content and then perform routing based on that.
12-09-2020
01:25 AM
Hi @TimothySpann Thanks so much for pointing me to this processor, it looks so great!!
12-08-2020
08:24 AM
Hello @justenji Thanks anyway for taking the time to look into my issue. For now, the only solution I have found is a Groovy script. Stéphane
12-07-2020
08:37 AM
Hello @justenji Thanks for your reply. The problem here is that you are working with attributes and expression language, which would mean having one flowfile per JSON object. I'm trying to keep a record-oriented flowfile, which means that everything happens with RecordPath and/or Jolt transformations. My flowfiles have multiple JSON records, and as such, attributes would take multiple values. And I don't want to split my records; that's really resource-consuming. Best regards, Stéphane
11-30-2020
09:09 AM
hello, Have you tried with a syslog listener on Nifi side?
11-30-2020
09:02 AM
Hello, NiFi can easily get data from a database, do you really need a Python script for that? If you really want to do what you describe here, I think you should use UpdateAttribute to store your database information in flowfile attributes and then, from your script, use the getAttribute function to read it.
11-30-2020
08:51 AM
Hello, Can you show the configuration of your CSV record reader? Your CSV looks fine, there is no need to replace anything here. To simplify debugging, you can also select "Infer Schema" instead of using an Avro schema. It's of course better to work with an Avro schema when you go to production.
11-30-2020
08:37 AM
Hello all, I'd like to use the UpdateRecord processor to create some fields in my JSON, but I'd like to apply some conditions when doing so. In the RecordPath documentation, I can see that there is a "filter" capability, which works like this: /field[filter]/path, but as far as I can see it's not possible to have multiple conditions in this filter. I'd like to do something like:

if (field1 == A or field1 == B) {
  field2 = C
} else if (field1 == D and field1 != E) {
  field2 = G
}

The example is silly but it gives you the idea. Apart from using a script, I don't know how to do that, except maybe extracting the fields, putting them in attributes and playing with expression language, but that would mean working with single-record flowfiles and I don't want to do that. Any idea? Thanks for your suggestions Stéphane
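The pseudocode above can be written out in plain Python over a single record to pin down the intent; the field names and values are the placeholders from the post (assuming D and E are distinct, the "field1 != E" clause is already implied by "field1 == D", so it is omitted):

```python
# Hypothetical per-record logic; A, B, C, D, G are placeholder values.
def derive_field2(record):
    f1 = record.get("field1")
    if f1 in ("A", "B"):
        record["field2"] = "C"
    elif f1 == "D":
        record["field2"] = "G"
    return record

print(derive_field2({"field1": "A"}))  # field2 becomes "C"
print(derive_field2({"field1": "D"}))  # field2 becomes "G"
```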
07-10-2020
08:42 AM
2 Kudos
Answer to myself: I solved the problem by converting this boolean to a string using Jolt:

{
  "operation": "modify-overwrite-beta",
  "spec": {
    "status": {
      "*": {
        "isCurrentStatus": "=toString"
      }
    }
  }
}
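What the "=toString" modify does per record can be sketched in Python; the sample statuses are hypothetical but match the structure the spec targets, and booleans are rendered as the lowercase strings "true"/"false":

```python
import json

# Sample input matching the structure the spec targets: a "status" array
# whose entries carry a boolean "isCurrentStatus".
doc = {
    "status": [
        {"code": "InProgress", "isCurrentStatus": False},
        {"code": "Pending", "isCurrentStatus": True},
    ]
}

# Equivalent of applying "=toString" to each isCurrentStatus value:
# booleans become the lowercase strings "true" / "false".
for entry in doc["status"]:
    entry["isCurrentStatus"] = str(entry["isCurrentStatus"]).lower()

print(json.dumps(doc))
```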
07-02-2020
07:37 AM
Hello, I have a piece of JSON on which I want to apply an UpdateRecord processor using some filter capabilities. My JSON looks like this:

"status": [
  {
    "code": "InProgress",
    "isCurrentStatus": false,
    "startDateTime": "2020-03-05T20:45:00Z"
  },
  {
    "code": "Pending",
    "isCurrentStatus": true,
    "startDateTime": "2020-03-05T21:20:00Z"
  }
],

And my RecordPath expression is the following: /status[0..-1][./isCurrentStatus = true]/code This expression doesn't work, and I get an error in my log file about incorrect syntax: Unexpected token ']' at line 1 If I use the following syntax: /status[0..-1][./isCurrentStatus = "true"]/code it no longer gives an error, but of course the filter doesn't work. So, I have the feeling that RecordPath filters don't support the use of boolean values. My NiFi is NiFi 1.9.0, coming with HDF 3.4. Any idea? Thanks a lot, Stéphane
11-21-2019
11:48 PM
Hello @JoeWitt , Thanks for your feedback. Actually, my flowfile is created by a syslog processor. I see no errors in the NiFi log file regarding processing, and as far as I can tell, I collect all my data correctly. Stéphane
11-07-2019
01:10 AM
Hello, By any chance, have you found anything about this problem? Nothing on my side, unfortunately 😞
10-08-2019
01:14 AM
Hello, I have exactly the same problem: I need to restart NiFi on a regular basis to get the content_repository cleaned. When I go into data provenance, I can see that all the content files are in DROP state. My flow is really basic: syslog -> UpdateAttribute -> HDFS. Please note that at the syslog level I work with batches of 1000 files. What does your flow look like in detail? PS: Yes, I've read this: https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418 and also this: https://community.cloudera.com/t5/Community-Articles/How-to-determine-which-FlowFiles-are-associated-to-the-same/ta-p/249185