Member since: 07-29-2020
Posts: 501
Kudos Received: 233
Solutions: 148
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 111 | 07-23-2024 02:34 AM
 | 182 | 07-20-2024 05:51 PM
 | 276 | 07-12-2024 03:05 AM
 | 318 | 07-11-2024 03:44 AM
 | 179 | 07-11-2024 02:57 AM
07-23-2024
02:34 AM
1 Kudo
Hi @SushmaNaya, I believe this was asked before here, but I don't think there is a way to do it in one shot. You probably have to do it in a custom script, e.g. an ExecuteScript processor, where you can have part of the data stored as attributes (depending on how big that part is) and the other part as your flowfile content, or have both stored as attributes (not recommended). Once the script gets the flowfile, it reads the content, takes the other part from the attributes, constructs your multipart body, and sends it that way. By the way, if you are using NiFi 2.0 you can probably utilize Python extensions to create your own custom processor in pure Python, where you can use different packages to help you with that. You can find examples here: https://community.cloudera.com/t5/Support-Questions/Nifi-1-18-multipart-form-data-with-binary-part-and-json-part/m-p/386351#M246014 https://www.w3schools.com/python/ref_requests_post.asp Hope that helps
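A minimal sketch of the multipart idea in Python using the requests library (the endpoint, field names, and payloads here are made up; in a real flow the binary part would be the flowfile content and the JSON part would come from an attribute). The request is prepared but never sent, so you can inspect the body it builds:

```python
import json
import requests

# Hypothetical endpoint and field names -- adjust to your target API.
url = "https://example.com/upload"

json_part = {"metadata": {"filename": "report.pdf"}}   # e.g. from an attribute
binary_part = b"%PDF-1.4 ..."                          # e.g. flowfile content

# requests builds the multipart/form-data body from the `files` dict;
# each tuple is (filename, payload, content-type). A None filename
# yields a plain form field for the JSON part.
files = {
    "meta": (None, json.dumps(json_part), "application/json"),
    "file": ("report.pdf", binary_part, "application/octet-stream"),
}

# Prepare without sending so the generated body can be inspected.
prepared = requests.Request("POST", url, files=files).prepare()
print(prepared.headers["Content-Type"].split(";")[0])  # multipart/form-data
```

Calling `requests.post(url, files=files)` instead would actually send it.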
07-20-2024
05:51 PM
3 Kudos
Hi, Another option is to use FreeFormTextRecordSetWriter for that. The documentation on it is unfortunately lacking, but you can find some examples if you google it. All you need is a ConvertRecord processor to get the desired result. Here is an example:
- GenerateFlowFile: simulates generating the CSV input.
- ConvertRecord: takes the CSV input using a CSVReader and writes it out using the FreeFormTextRecordSetWriter.
- CSVReader service configuration: you can use the default configuration.
- FreeFormTextRecordSetWriter: the Text property used in the service to produce the desired output:

username = ${username}
first name = ${"first name"}
middle name = ${"middle name"}
last name = ${"last name"}

Output:

username = test_user
first name = test_FN
middle name = test_MN
last name = test_LN

username = test_user2
first name = test2_FN
middle name = test2_MN
last name = test2_LN

Hope that helps.
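For reference, a CSV input matching the template and output above would look like this (values reconstructed from the output shown; the multi-word headers are quoted because they contain spaces):

```
username,"first name","middle name","last name"
test_user,test_FN,test_MN,test_LN
test_user2,test2_FN,test2_MN,test2_LN
```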
07-15-2024
07:29 AM
1 Kudo
Hi, Can you elaborate more on the issue? For example, are you getting an error? What is the status code of the response? I assume you are using the InvokeHttp processor; can you paste a screenshot of the configuration you have? What kind of API is it, and are you sure this is a GET API vs. others? Have you tried to do the GET from Postman or curl, and was it successful? Per the links below, using a body with a GET API is not a good idea after all, so if it's something developed internally you should probably reconsider. https://www.linkedin.com/pulse/using-body-http-get-method-still-bad-idea-danny-logsdon https://www.baeldung.com/cs/http-get-with-body
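For what it's worth, a GET with a body can technically be constructed, e.g. with Python's requests library (hypothetical URL; the request is prepared but never sent here, just to show the body is attached even though many servers and proxies will ignore or reject it):

```python
import requests

# Hypothetical endpoint -- this only demonstrates that a GET *can*
# carry a body; per the links above, it remains a bad idea.
req = requests.Request(
    "GET",
    "https://example.com/search",
    data='{"query": "test"}',
    headers={"Content-Type": "application/json"},
)
prepared = req.prepare()
print(prepared.method)  # GET
print(prepared.body)    # {"query": "test"}
```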
07-12-2024
10:18 AM
For JSLT, like I said, it's sometimes easier to write and work with, especially if you are trying to get/aggregate data from a heavily nested structure. The other advantage JSLT has is its ability to traverse nested levels dynamically using recursive function calls; for example, say you are working with a JSON structure but have no idea how many nested levels it might have (see my JSON/JSLT answer in this post to get an idea of what I mean by that). On the other hand, Jolt appears to perform a little better in comparison, but they are very close. In regards to your question, you can use the following to extract a single value for the path name: "@(2,values[0])": "paths.[&5].pathName"
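To illustrate the kind of dynamic, depth-agnostic traversal meant above (sketched in Python rather than JSLT, with a made-up document), a recursive function can collect every value under a given key no matter how deeply it is nested:

```python
def collect_values(node, key):
    """Recursively collect every value stored under `key`, at any depth."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(collect_values(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found

# Nesting depth is unknown in advance -- the recursion handles it.
doc = {"a": {"name": "x", "b": [{"name": "y"}, {"c": {"name": "z"}}]}}
print(collect_values(doc, "name"))  # ['x', 'y', 'z']
```

A Jolt shift spec, by contrast, has to spell out each level it descends through, which is why this pattern is awkward there.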
07-12-2024
03:58 AM
1 Kudo
Hi, I don't think there is a hostname property under ListenHttp, and if there was at some point, it probably got removed later on. Regarding the hostname under HandleHttpRequest, it basically specifies which node you would like to receive the request on in the case of a cluster. If you specify a certain node, it will only allow connections to that node; however, if you leave it empty, as the description says, it will bind to all, which means you can send requests using any node's hostname. Hope that helps.
07-12-2024
03:05 AM
2 Kudos
Hi, Actually I went against the main input, since the second input (the simplified version) is not valid JSON. Hopefully the following will work: [
{
"operation": "shift",
"spec": {
"description": "priceRuleName",
"startDatetime": "startDate",
"endDatetime": "enddate",
"locations": {
"0": {
"location": "storeLocation"
}
},
"customAttributes": {
"*": {
"id": {
"PR_PRICE_RULE_PROMO": {
"@(2,valueBoolean)": "priceRulePromo"
}
}
}
},
"tiers": {
"*": {
"priceRuleAttributes": {
"*": {
"id": {
"PR_BRANCH_NAME": {
"@(2,values)": "paths.[&5].pathName"
},
"PR_TARGET_SEGMENT": {
"@(2,values)": "paths.[&5].customerCondition[#2].targetSegments"
}
}
}
},
"rewards": "paths[&1].calculatePrice",
"conditions": {
"*": {
"conditionsMerch": {
"*": {
"item": {
"": {
"@(2,excludeInd)": {
//group categories under yes\no include ind
"0": {
"@(4,categoryId)": "paths[&9].catalogCondition.category.yes.categoryList[]"
},
"1": {
"@(4,categoryId)": "paths[&9].catalogCondition.category.no.categoryList[]"
}
}
}, //there is a value in item -> products
"*": {
"@(2,excludeInd)": {
//group products under yes\no include ind
"0": {
"@(4,item)": "paths[&9].catalogCondition.products.yes.productList[]"
},
"1": {
"@(4,item)": "paths[&9].catalogCondition.products.no.productList[]"
}
}
}
}
}
}
}
}
}
}
}
}
,
{
"operation": "shift",
"spec": {
"*": "&",
"paths": {
"*": {
"*": "&2[&1].&",
"catalogCondition": {
"*": { //products or category
"*": { //bucket each yes\no group into its own array element
"*": "&5[&4].&3.&2[#2].&",
"$": "&5[&4].&3.&2[#2].include"
}
}
}
}
}
}
}
] Also something to consider is using JSLT. JSLT is another JSON transformation language, similar to XQuery in syntax (check the reference here). There is a JSLT processor in NiFi for that. In my opinion, JSLT works better in these situations, especially when you are trying to query data from a very nested structure using some complex logic to match on certain values. You probably have to write fewer lines in JSLT, and the logic is easier to follow than Jolt (of course, you have to get familiar with JSLT syntax and how it works first to make sense of it). Here is how this transformation looks using JSLT: import "http://jslt.schibsted.com/2018/experimental" as exp
let priceRulePromoArray = [for (.customAttributes) .valueBoolean if(.id=="PR_PRICE_RULE_PROMO")]
let mainInfo = {
"priceRuleName":.description,
"startDate":.startDatetime,
"enddate":.endDatetime,
"storeLocation":.locations[0].location,
"priceRulePromo":$priceRulePromoArray[0]
}
let paths= [for(.tiers)
{
"pathname":flatten([for(.priceRuleAttributes) .values
if(.id=="PR_BRANCH_NAME")]),
"customerCondition":[for(.priceRuleAttributes).values
if(.id=="PR_TARGET_SEGMENT")],
"calculatePrice": .rewards,
"catalogCondition":[for(.conditions)
let groupProductByInd =
exp:group-by([for (.conditionsMerch) . if(.item!="")]
,.excludeInd,
.item)
let groupCategoryByInd =
exp:group-by([for (.conditionsMerch) . if(.categoryId!="")]
,.excludeInd
, .categoryId)
let products = [for ($groupProductByInd )
{
"include": if(.key==0) "yes" else "no",
"productList":.values
}
]
let category = [for ($groupCategoryByInd )
{
"include": if(.key==0) "yes" else "no",
"categoryList":.values
}
]
if(any($products)) {"products":$products} else {} +
if(any($category )) {"category":$category} else {}
]
}
]
{"paths":$paths}+$mainInfo
07-11-2024
10:47 AM
Hi @Syed0000, There is a lot going on in the JSON input and expected output; it is hard to follow what you are trying to do. Can you shorten/simplify your JSON input by removing unnecessary or repeated info? It would be even better to come up with a dummy JSON that focuses on the issue. That would help me in providing an accurate resolution. Thanks
07-11-2024
03:44 AM
1 Kudo
Not sure why you need to use the ReplaceText processor in this case. I provided that in my sample above as an example and a way to simulate getting new data by replacing the original content with something else. You can think of ReplaceText as if I'm doing InvokeHttp and getting different flowfile content in the response relationship. If you got the data from module C, just link it directly to the JoinEnrichment. As long as you have the correct writer/reader configured for each fork, you should be good.
07-11-2024
03:31 AM
1 Kudo
Hi, Not overcomplicating your scenario, and assuming you get a .trg file only every once in a while so that you don't have to worry about clashes or concurrency issues, I would solve this as follows:

1- Use a ListFile processor and point it to the target directory. This processor needs to run on a schedule where it's not continuously re-listing the same files while you are still processing them; you have to figure out how much time between listings is enough to process a batch once a .trg arrives. Also make sure to set the Record Writer property so that you get an array of all the files in one flowfile. There won't be any tracking in this case (set Listing Strategy to No Tracking), since we will be continuously reading the same files again and again as long as no .trg file has arrived yet. The output of this processor is an array of all the files found, where each file object has the following properties (assuming a JSON writer):

{
"filename": "...",
"path": "....",
"directory": false,
"size": 256496,
"lastModified": 1707490322483,
"permissions": null,
"owner": null,
"group": null
}

2- Use QueryRecord, adding a dynamic property with the following query:

select * from flowfile where exists (
select 1 from flowfile where filename like '%.trg'
)

This will produce the array list from above only if a .trg file is found among the files; otherwise nothing happens and we wait for the next listing.

3- If the condition above is met and the .trg file has arrived, use SplitRecord (or SplitJson, in case you are using a JSON writer) to split out each file object.

4- Use EvaluateJsonPath to get the filename and path for each file object.

5- Use FetchFile, provided the attributes above, to get the file and then do whatever is needed. Make sure to set the Completion Strategy to move or delete the file so that you don't reprocess it.

This is a very simplistic solution that might work if, like I said, you get a .trg file only every once in a while, with enough time between batches to process each one, and you are not dealing with a large number of files. If any of those conditions are not met, you definitely have to reconsider. Another option that would work better is to have two flows: one continuously picks up whatever files arrive, places them in a staging area, and logs them in a DB, so that when the .trg file arrives you invoke the other flow using the NiFi API to read and process whatever got logged in the DB. The DB table holds the staging-area path for each logged file, which you pass to the FetchFile processor. This way you can manage clashes and concurrency issues better, and you don't have to continuously list all the files and query the dataset looking for .trg files; the files have already been moved to the staging area, and whenever a .trg arrives, the list is read once and the files are processed. If you find this helpful, please accept the solution. Thanks
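The QueryRecord condition above boils down to "pass the whole listing through only when a .trg marker is present". A minimal Python sketch of that same logic, using made-up filenames:

```python
import json

# A listing as the JSON record writer might emit it (hypothetical files).
listing = json.loads("""[
  {"filename": "data_001.csv", "path": "/landing"},
  {"filename": "data_002.csv", "path": "/landing"},
  {"filename": "batch_done.trg", "path": "/landing"}
]""")

def pass_if_trigger_present(files):
    """Return the full listing only when a .trg marker is among the
    files, otherwise nothing -- mirroring the WHERE EXISTS query."""
    if any(f["filename"].endswith(".trg") for f in files):
        return files
    return []  # no trigger yet: wait for the next listing

print(len(pass_if_trigger_present(listing)))  # 3
```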
07-11-2024
02:57 AM
2 Kudos
It all depends on the complexity of the data you are working with. If you are talking about data transformation (converting to timestamps, replacing quotes, etc.), then maybe Groovy is the way to go. JSLT has some functions that can help you accomplish this as well, like string replace and parse-time functions, but I'm not sure that covers everything. I'm not sure where you got the impression that the NiFi community doesn't recommend using Groovy; if you find an article about that, please share. I think it's more of an issue with your boss not wanting you to do any scripting, to avoid having something that nobody but you can support. The processor is there for you to use. Actually, there is a dedicated processor for Groovy called ExecuteGroovyScript. I think the ExecuteScript processor might get deprecated since it's redundant. The only issue I can find that warns about this processor is the fact that the script gets compiled for every flowfile, which can get expensive and impact performance if you have a big script and are working with large data volumes. To avoid running into those scenarios, NiFi provides other alternatives, like InvokeScriptedProcessor (using Groovy as well) or developing your own custom processor in Java (.nar), where the code is compiled once and done. The JSLT processor also recompiles the script, but it uses caching to avoid having to do that every time. In terms of which performs better, Groovy or JSLT? I'm not sure and I have never tested, but you can do some stress testing and let us know :).