Member since: 08-03-2019
Posts: 186
Kudos Received: 34
Solutions: 26
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2014 | 04-25-2018 08:37 PM
 | 5948 | 04-01-2018 09:37 PM
 | 1634 | 03-29-2018 05:15 PM
 | 6853 | 03-27-2018 07:22 PM
 | 2076 | 03-27-2018 06:14 PM
03-24-2018
04:50 PM
@Mark McGowan You can do the merging by using the Correlation Attribute Name property of the MergeContent processor. To use this property we cannot use the ConvertRecord processor first, because each flowfile needs to keep the date and hour attributes combined as one attribute, and that attribute is then used as MergeContent's Correlation Attribute Name. MergeContent will then merge flowfiles that share the same attribute value into one flowfile. Please refer to the links below for more details about correlation attribute usage:

https://community.hortonworks.com/questions/161827/mergeprocessor-nifi-using-the-correlation-attribut.html
https://community.hortonworks.com/questions/55926/using-the-mergecontent-processor-in-nifi-can-i-use.html
https://community.hortonworks.com/questions/87178/merge-fileflow-files-based-on-time-rather-than-siz.html

For reference, ConvertRecord xml: 178086-json-to-csv.xml
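To make the grouping behaviour concrete, here is a small Python sketch of what MergeContent does with a correlation attribute. The date_hour attribute name and the sample values are hypothetical; in a real flow you would set such an attribute upstream (e.g. with UpdateAttribute) and point Correlation Attribute Name at it:

```python
from collections import defaultdict

# Conceptual model only: MergeContent bins flowfiles that share the same
# value for the configured correlation attribute and merges each bin
# into a single flowfile. "date_hour" is a hypothetical attribute name.
flowfiles = [
    {"attributes": {"date_hour": "2018-03-24-04"}, "content": "row1"},
    {"attributes": {"date_hour": "2018-03-24-04"}, "content": "row2"},
    {"attributes": {"date_hour": "2018-03-24-05"}, "content": "row3"},
]

bins = defaultdict(list)
for ff in flowfiles:
    bins[ff["attributes"]["date_hour"]].append(ff["content"])

# each bin becomes one merged flowfile
for date_hour, contents in bins.items():
    print(date_hour, "->", "\n".join(contents))
```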
03-18-2018
02:08 PM
@Pramod Kalvala You can schedule multiple jobs for a single given time, assuming you have the resources available to cater to all of them 🙂 As for how to do that, there are multiple scheduling options available in NiFi. Right-click on your "triggering processor" (the very first processor in your job), click "Configure", and open the Scheduling tab. In the "Scheduling Strategy" drop-down you will see two major scheduling options:

- Timer Driven
- Cron Driven

Timer Driven executes the processor according to the duration mentioned in "Run Schedule". With a run schedule of one second, the processor will run every second, triggering the next processors in the flow, provided they don't have a schedule of their own.

Cron Driven is the strategy where you can mention a specific time of day, specific day(s), etc., that is, the schedule on which that processor should execute. Let's say you want to run your job at 1 PM every day; a sketch of the Run Schedule value follows below.

You can have any number of jobs scheduled to run at 1 PM by using the same scheduling strategy for all of them, and all of them will run at the same time. All of those jobs are separate, will not interfere with each other unless you instruct them to, and will run without any issues, provided that you have sufficient resources for all of them.
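For the 1 PM example, a minimal sketch of the value that would go into the "Run Schedule" property, assuming the Quartz-style cron syntax that NiFi's CRON driven strategy uses:

```python
# Quartz cron fields: second minute hour day-of-month month day-of-week
# Hypothetical daily-at-1-PM schedule for the Run Schedule property:
run_schedule = "0 0 13 * * ?"  # sec 0, min 0, hour 13, every day
```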
03-18-2018
04:32 PM
@Rahul Soni Thanks so much for the detailed explanation. We have a custom scheduler which decides the number of jobs to run and triggers the Oozie API, which in turn triggers the particular workflow, such as Sqoop or Kafka. That workflow then fetches the properties for the respective workflow and runs the jobs. We are looking to replace the Sqoop workflow with NiFi. I have been doing some POCs on NiFi lately and realized that it has the capability to handle the load, but I was not sure how it would work for scheduling jobs like Sqoop. Anyway, your answer has given me hope.
03-19-2018
03:44 AM
@Kok Ching Hoo You don't even need to use the JoltTransformJSON processor to get only the entry array as the flowfile content. We can achieve the same result in an easier way with the SplitJson processor. Configure SplitJson with the JsonPath Expression:

$.*.*.*.*

With the above JsonPath expression it doesn't matter if the header value changes or the array is renamed from entry to entry1, exit, etc., as long as the JSON message keeps the same structure (the same structural dependency applies when using Jolt). This method works because we split the array itself and then use the splits relationship to connect to the next processors. If you want to do this dynamically, without any dependency on the attribute names defined in the incoming JSON message/object, then go with this approach.
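To illustrate why the wildcards make the split name-agnostic, here is a small Python sketch; the feed/header/body/entry key names are made up, and any four-level structure would behave the same:

```python
import json

# Hypothetical four-level document; each "*" in $.*.*.*.* steps one
# level down without caring about the key names.
doc = json.loads('{"feed": {"header": {"body": {"entry": [{"id": 1}, {"id": 2}]}}}}')

node = doc
for _ in range(4):                    # one step per "*" in the JsonPath
    node = next(iter(node.values()))  # descend to the (single) child, name-agnostic
print(node)  # [{'id': 1}, {'id': 2}] -- the array that SplitJson splits into flowfiles
```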
03-22-2018
06:35 PM
I have asked the development team to investigate the possibility of adding server.startup.web.timeout=120 to the /etc/ambari-server/conf/ambari.properties file. Thank you!
03-17-2018
10:08 AM
1 Kudo
@Pavan M As you are not transferring any of the flowfiles to REL_FAILURE, transfer the else-branch flowfiles to the failure relationship and auto-terminate that relationship:

else:
    session.transfer(flowFile, REL_FAILURE)

(or) you can use session.remove to remove the flowfile instead:

else:
    session.remove(flowFile)

Using either of the above approaches you will get the result you are expecting.
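For completeness, a minimal self-contained ExecuteScript (Jython) sketch. The "status"/"ok" attribute check is a hypothetical stand-in for whatever condition your real script branches on; session, REL_SUCCESS, and REL_FAILURE are bound by the ExecuteScript environment:

```python
# ExecuteScript (Jython) sketch; "status" and "ok" are hypothetical.
flowFile = session.get()
if flowFile is not None:
    if flowFile.getAttribute("status") == "ok":
        session.transfer(flowFile, REL_SUCCESS)   # happy path
    else:
        # route everything else to failure and auto-terminate it,
        # or use session.remove(flowFile) to drop it instead
        session.transfer(flowFile, REL_FAILURE)
```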
03-27-2018
08:39 AM
@Shu Hello 😄 from the other side XD I am so grateful, sir, you helped me out!!
Thanks a lot :D
I highly recommend you!
03-13-2018
02:43 PM
I guess you need to drop these expressions one at a time, using multiple ReplaceText processors, one per pattern. Put the following expressions in the "Search Value" text box (one per processor):

(?s)(\\\"\[)
(?s)(\\)

Hope that helps!
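As a quick sanity check of what those two patterns remove, here is a Python sketch; the sample input string is hypothetical, since the real content came from the original question:

```python
import re

# Hypothetical escaped-JSON fragment standing in for the real flowfile content
text = 'data=\\"[{\\"key\\":\\"value\\"}]'

# First ReplaceText pass: strip the \"[ sequence
text = re.sub(r'(?s)(\\\"\[)', '', text)
# Second ReplaceText pass: strip the remaining backslashes
text = re.sub(r'(?s)(\\)', '', text)

print(text)  # data={"key":"value"}]
```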
03-28-2019
02:01 PM
@Matt Burgess It works fine if there is just one object in the input tree, but if there are more it merges the values into arrays rather than producing separate records, like: {
"agent_submit_time" : [ -1, -1 ],
"agent_end_time" : [ 123445, 123445 ],
"agent_name" : [ "Marie Bayer-Smith", "Marie Bayer-Smith" ]
} I would like it to be something like: [
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
},
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
}
] How can I do that? I tried replacing "*": "&" with "@": "[&]", which makes the records separate, but then the transformation of - to _ doesn't take place.
03-07-2018
02:30 PM
You may want to look at this answer.