Member since: 08-03-2019
Posts: 186
Kudos Received: 34
Solutions: 26
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2014 | 04-25-2018 08:37 PM
 | 5948 | 04-01-2018 09:37 PM
 | 1634 | 03-29-2018 05:15 PM
 | 6853 | 03-27-2018 07:22 PM
 | 2076 | 03-27-2018 06:14 PM
03-24-2018
04:50 PM
@Mark McGowan You can do the merging by using the Correlation Attribute Name property of the MergeContent processor. To use this property we cannot use the ConvertRecord processor first, because each flowfile needs to keep the date and hour attributes combined as one attribute, and that attribute is then used as MergeContent's Correlation Attribute Name. MergeContent will then merge flowfiles that share the same attribute value into one flowfile. Please refer to the links below for more details about correlation attribute usage:

https://community.hortonworks.com/questions/161827/mergeprocessor-nifi-using-the-correlation-attribut.html
https://community.hortonworks.com/questions/55926/using-the-mergecontent-processor-in-nifi-can-i-use.html
https://community.hortonworks.com/questions/87178/merge-fileflow-files-based-on-time-rather-than-siz.html

For reference, ConvertRecord xml: 178086-json-to-csv.xml
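To make the grouping behaviour concrete, here is a small Python sketch of what MergeContent does with a correlation attribute. The date_hour attribute name and the sample values are hypothetical; in a real flow you would set such an attribute upstream (e.g. with UpdateAttribute) and point Correlation Attribute Name at it:

```python
from collections import defaultdict

# Conceptual model only: MergeContent bins flowfiles that share the same
# value for the configured correlation attribute and merges each bin
# into a single flowfile. "date_hour" is a hypothetical attribute name.
flowfiles = [
    {"attributes": {"date_hour": "2018-03-24-04"}, "content": "row1"},
    {"attributes": {"date_hour": "2018-03-24-04"}, "content": "row2"},
    {"attributes": {"date_hour": "2018-03-24-05"}, "content": "row3"},
]

bins = defaultdict(list)
for ff in flowfiles:
    bins[ff["attributes"]["date_hour"]].append(ff["content"])

# each bin becomes one merged flowfile
for date_hour, contents in bins.items():
    print(date_hour, "->", "\n".join(contents))
```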
03-18-2018
02:08 PM
@Pramod Kalvala You can schedule multiple jobs for a single given time, assuming you have the resources available to cater to all of them 🙂 As for how to do that, there are multiple scheduling options available in NiFi. Right-click on your "triggering processor" (the very first processor in your job), click "Configure", and open the Scheduling tab. In the "Scheduling Strategy" drop-down you will see two major scheduling options:

- Timer Driven
- Cron Driven

Timer Driven executes the processor according to the duration mentioned in "Run Schedule". With a run schedule of one second, the processor will run every second, triggering the next processors in the flow, provided they don't have a schedule of their own.

Cron Driven is the strategy where you can mention a specific time of day, specific day(s), etc., that is, the schedule on which that processor should execute. Let's say you want to run your job at 1 PM every day; a sketch of the Run Schedule value follows below.

You can have any number of jobs scheduled to run at 1 PM by using the same scheduling strategy for all of them, and all of them will run at the same time. All of those jobs are separate, will not interfere with each other unless you instruct them to, and will run without any issues, provided that you have sufficient resources for all of them.
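For the 1 PM example, a minimal sketch of the value that would go into the "Run Schedule" property, assuming the Quartz-style cron syntax that NiFi's CRON driven strategy uses:

```python
# Quartz cron fields: second minute hour day-of-month month day-of-week
# Hypothetical daily-at-1-PM schedule for the Run Schedule property:
run_schedule = "0 0 13 * * ?"  # sec 0, min 0, hour 13, every day
```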
03-18-2018
04:32 PM
@Rahul Soni Thanks so much for the detailed explanation. We have a custom scheduler which decides the number of jobs to run and triggers the Oozie API, which in turn triggers the particular workflow, such as Sqoop or Kafka. That workflow then fetches the properties for the respective workflow and runs the jobs. We are looking to replace the Sqoop workflow with NiFi. I have been doing some POCs on NiFi lately and realized that it has the capability to handle the load, but I was not sure how it would work for scheduling jobs like Sqoop. Anyway, your answer has given me hope.
03-19-2018
03:44 AM
@Kok Ching Hoo You don't even need to use the JoltTransformJSON processor to get only the entry array as the flowfile content. We can achieve the same result in an easier way with the SplitJson processor. Configure SplitJson with the JsonPath Expression:

$.*.*.*.*

With the above JsonPath expression it doesn't matter if the header value changes or the array is renamed from entry to entry1, exit, etc., as long as the JSON message keeps the same structure (the same structural dependency applies when using Jolt). This method works because we split the array itself and then use the splits relationship to connect to the next processors. If you want to do this dynamically, without any dependency on the attribute names defined in the incoming JSON message/object, then go with this approach.
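To illustrate why the wildcards make the split name-agnostic, here is a small Python sketch; the feed/header/body/entry key names are made up, and any four-level structure would behave the same:

```python
import json

# Hypothetical four-level document; each "*" in $.*.*.*.* steps one
# level down without caring about the key names.
doc = json.loads('{"feed": {"header": {"body": {"entry": [{"id": 1}, {"id": 2}]}}}}')

node = doc
for _ in range(4):                    # one step per "*" in the JsonPath
    node = next(iter(node.values()))  # descend to the (single) child, name-agnostic
print(node)  # [{'id': 1}, {'id': 2}] -- the array that SplitJson splits into flowfiles
```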
03-22-2018
06:35 PM
I have asked the development team to investigate the possibility of adding server.startup.web.timeout=120 to the /etc/ambari-server/conf/ambari.properties file. Thank you!
03-17-2018
10:08 AM
1 Kudo
@Pavan M As you are not transferring any of the flowfiles to REL_FAILURE, transfer the else-branch flowfiles to the failure relationship and auto-terminate that relationship:

else:
    session.transfer(flowFile, REL_FAILURE)

(or) you can use session.remove to remove the flowfile instead:

else:
    session.remove(flowFile)

Using either of the above approaches you will get the result you are expecting.
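For completeness, a minimal self-contained ExecuteScript (Jython) sketch. The "status"/"ok" attribute check is a hypothetical stand-in for whatever condition your real script branches on; session, REL_SUCCESS, and REL_FAILURE are bound by the ExecuteScript environment:

```python
# ExecuteScript (Jython) sketch; "status" and "ok" are hypothetical.
flowFile = session.get()
if flowFile is not None:
    if flowFile.getAttribute("status") == "ok":
        session.transfer(flowFile, REL_SUCCESS)   # happy path
    else:
        # route everything else to failure and auto-terminate it,
        # or use session.remove(flowFile) to drop it instead
        session.transfer(flowFile, REL_FAILURE)
```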
03-27-2018
08:39 AM
@Shu Hello 😄 from the other side XD I am so grateful, sir, you helped me out!!
Thanks a lot :D
I highly recommend you!
03-13-2018
02:43 PM
I guess you need to drop these expressions one at a time, using multiple ReplaceText processors, one per pattern. Put the following expressions in the "Search Value" text box (one per processor):

(?s)(\\\"\[)
(?s)(\\)

Hope that helps!
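As a quick sanity check of what those two patterns remove, here is a Python sketch; the sample input string is hypothetical, since the real content came from the original question:

```python
import re

# Hypothetical escaped-JSON fragment standing in for the real flowfile content
text = 'data=\\"[{\\"key\\":\\"value\\"}]'

# First ReplaceText pass: strip the \"[ sequence
text = re.sub(r'(?s)(\\\"\[)', '', text)
# Second ReplaceText pass: strip the remaining backslashes
text = re.sub(r'(?s)(\\)', '', text)

print(text)  # data={"key":"value"}]
```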
03-28-2019
02:01 PM
@Matt Burgess It works fine if there is just one object in the input tree, but if there are more it merges the values into arrays rather than producing separate records, like: {
"agent_submit_time" : [ -1, -1 ],
"agent_end_time" : [ 123445, 123445 ],
"agent_name" : [ "Marie Bayer-Smith", "Marie Bayer-Smith" ]
} I would like it to be something like: [
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
},
{
"agent_submit_time" : -1,
"agent_end_time" : 123445,
"agent_name" : "Marie Bayer-Smith"
}
] How can I do that? I tried replacing "*": "&" with "@": "[&]", which makes the records separate, but then the transformation of - to _ doesn't take place.
03-07-2018
02:30 PM
You may want to look at this answer.