Member since: 07-29-2020
Posts: 347
Kudos Received: 107
Solutions: 101

My Accepted Solutions
Title | Views | Posted
--- | --- | ---
 | 39 | 12-03-2023 12:02 PM
 | 53 | 12-01-2023 12:31 PM
 | 86 | 12-01-2023 05:37 AM
 | 110 | 11-27-2023 03:08 PM
 | 77 | 11-21-2023 06:32 AM
11-28-2023
03:13 PM
Hi, I have managed to download the latest NiFi 2.0.0 M1 and I'm trying to run it on my Windows 10 machine. Doing some preliminary testing, I ran into the following issues:

1- The system requirements page (https://nifi.apache.org/project-documentation.html) indicates that at minimum I need Java 17, but when I try to start NiFi using run.bat I get the following error:

Error: LinkageError occurred while loading main class org.apache.nifi.bootstrap.RunNiFi
java.lang.UnsupportedClassVersionError: org/apache/nifi/bootstrap/RunNiFi has been compiled by a more recent version of the Java Runtime (class file version 65.0), this version of the Java Runtime only recognizes class file versions up to 61.0

It turns out it needs Java 21. Not sure if the documentation has not been updated or if I'm missing something.

2- After upgrading to Java 21, I'm able to start NiFi using the default configuration. The log file doesn't show any errors and the default username and password are generated. However, when I try to browse to https://127.0.0.1:8443/nifi I get the following error:
Not sure if this is something local to my machine, but after some internet searching I replaced 127.0.0.1 with localhost in the URL and it worked, as I get to the login screen.

3- This is not related to 2.0, but I want to mention it in case someone else runs into the same issue. By default, the generated user doesn't have access to the security settings regarding Users & Policies. To enable this you need to set:
nifi.security.user.authorizer=managed-authorizer
and add the generated username to authorizers.xml as mentioned here: https://community.cloudera.com/t5/Support-Questions/No-show-Users-and-Policies-in-Global-Menu/td-p/339127

4- The ExecuteScript processor doesn't have the Python (Jython) script engine. It could be that it's deprecated, but that is not mentioned on the deprecated components site (https://cwiki.apache.org/confluence/display/NIFI/Deprecated+Components+and+Features). It only talks about removing support for Ruby and ECMAScript, but not Python. If it's deprecated, what is the alternative? Is it using the Python API?

5- A minor glitch I noticed when browsing NiFi using Chrome: for some reason the "Import from Registry" icon is not showing! It shows up in Edge, and it shows up if I open Chrome in private mode. Not sure if it's a caching issue or what.

Please advise. Thanks
Labels:
- Apache NiFi
11-28-2023
09:17 AM
2 Kudos
Hi @Rohit1997jio ,

Not sure if this can be done using FlowFile Expiration. However, if you are using NiFi 1.16 or higher you can take advantage of another approach using the "retry" option on the target processor's failure relationship, as follows:

The concept is to use the settings "Number of Retry Attempts", "Retry Back Off Policy" and "Retry Maximum Back Off Period" to configure how often and for how long the flowfile is retried before it gets pushed to the failure relationship queue, where you can then log the needed message. On every failed retry, the flowfile is pushed back to the upstream queue and waits the designated time before it is tried again. The challenge is how to set those values so that the flowfile is only kept for a certain period of time (1 hour in your case), especially since the flowfile will wait in the queue before it is tried again, depending on whether you set the policy to Penalize or Yield. That wait is a good thing, because you want some delay before the flowfile is tried again to avoid a lot of overhead.

For example, if you want the flowfile to expire in an hour and you want to retry it 60 times, waiting 1 min before each retry, you can set the values as follows:

Number of Retry Attempts: 60
Retry Back Off Policy: Penalize (set the Penalty Duration under the Settings tab to 1 min)
Retry Maximum Back Off Period: 1 min (this ensures the wait time in the queue doesn't exceed the initial penalty duration, because the penalty duration is doubled on every subsequent retry - not sure why)

In this case the flowfile will be retried 60 times upon failure, and each time it is pushed back to the upstream queue it waits at most 1 min before the next retry, which makes the total time the flowfile is retried = 60 * 1 min = 60 min = 1 hour. Depending on how often you want to retry and how long you want to wait between retries, you can adjust those numbers accordingly. Once all the retries are exhausted, the flowfile is moved to the failure relationship, where you can log the final message.

If that helps please accept solution. Thanks
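To illustrate the arithmetic, here is a minimal sketch (the values are the ones from the example above, not NiFi defaults):

import datetime

# assumed example values, matching the scenario above
penalty_duration = datetime.timedelta(minutes=1)  # Penalty Duration under the Settings tab
retry_attempts = 60                               # Number of Retry Attempts on the failure relationship

# With Retry Maximum Back Off Period capped at the penalty duration,
# every retry waits roughly the same amount of time, so the total
# window before the flowfile lands on the failure relationship is:
total_window = retry_attempts * penalty_duration
print(total_window)  # 1:00:00 -> about 1 hour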
11-27-2023
03:08 PM
The issue you are having is that when you try to read the parquet file using the ParquetReader, it fails on the invalid column names containing the illegal character "-". I don't know of a way to address this in NiFi itself; you probably have to fix it before you consume the file through NiFi. For example, you can use a pandas DataFrame in Python to remove illegal characters from the column names:

import pandas as pd

# read the source parquet file
df = pd.read_parquet('source.parquet', engine='fastparquet')
# replace hyphen with underscore in column names
df.columns = df.columns.str.replace("-", "_")
# write the cleaned copy
df.to_parquet("target.parquet", engine='fastparquet')

It's possible to do this through NiFi as well using ExecuteStreamCommand: https://community.cloudera.com/t5/Support-Questions/Can-anyone-provide-an-example-of-a-python-script-executed/td-p/192487

The steps will be like this:
1- Fetch the parquet file from S3
2- Save it to a staging area with a certain filename using PutFile
3- Run ExecuteStreamCommand and pass the filename and path to the py script. The script renames the columns as shown above and saves the final copy to a target folder (a sketch is shown at the end of this post)
4- Use FetchFile to get the final parquet file from the target folder using the same filename
5- ConvertRecord ...

If that helps please accept solution. Thanks
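For illustration, here is a minimal sketch of what the step 3 script could look like when called from ExecuteStreamCommand. The script name, paths, and argument handling are assumptions; adjust them to your setup:

import sys
import pandas as pd

# Hypothetical ExecuteStreamCommand configuration:
#   Command Path:      python
#   Command Arguments: /scripts/clean_parquet.py;/staging/${filename};/target/${filename}
# i.e. the source and target file paths are passed in as command-line arguments.
source_path = sys.argv[1]
target_path = sys.argv[2]

df = pd.read_parquet(source_path, engine='fastparquet')
# replace hyphens in column names with underscores
df.columns = df.columns.str.replace("-", "_")
df.to_parquet(target_path, engine='fastparquet')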
11-22-2023
09:23 AM
Hi @glad1 ,

Can you elaborate more on the data that you want to remove? For example, if the data is part of the CSV and has a unique value in one or more columns, then you can use the QueryRecord processor, where the query excludes records with that unique value. If the data is outside of the CSV - like header information - then depending on what this data looks like, and whether it is surrounded by special characters, you can use the ReplaceText processor with a regex that isolates those lines and replaces them with empty space, and so on. If you can provide some sample data it would help in figuring out the best solution for this scenario. Thanks
11-21-2023
08:09 AM
1 Kudo
Can you provide more details on what the issue is, what error message you are getting (if any), and which processor is causing the problem, based on the provided input and expected output? I have never used PutDynamoDB, but here are some links that can help:
https://www.youtube.com/watch?si=ctBH-f-JOzAPgKAJ&embeds_referring_euri=https%3A%2F%2Fwww.google.com%2F&source_ve_path=MzY4NDIsMTM5MTE3LDEzOTExNywyODY2NCwxNjQ1MDY&feature=emb_share&v=Aw6PCz8gbmA
https://stackoverflow.com/questions/45840156/how-putdynamodb-works-in-nifi
11-21-2023
06:32 AM
1 Kudo
Hi @scoutjohn ,

I think you are over-complicating the spec, unless I'm missing something. You can do the transformation in one shift spec if you are just looking to move the "serviceCharacteristic" elements to the upper level, where the service object remains as is and the value of "name" is used as the parent key for the "value" array. In this case you can use the following spec:

[
{
"operation": "shift",
"spec": {
"serviceOrderItem": {
"*": {
"*": "serviceOrderItem[&1].&",
"service": {
"href": "serviceOrderItem[&2].service.href",
"id": "serviceOrderItem[&2].service.id",
"serviceCharacteristic": {
"*": {
"value": "serviceOrderItem[&4].@(1,name)"
}
}
}
}
}
}
}
] If that helps please accept solution. Thanks
11-19-2023
08:14 AM
1 Kudo
Hi @simonsig ,

There is no straightforward, generic way to do this using Jolt alone. What you are looking for involves some regex manipulation that I don't think the Jolt spec supports. Maybe at some point it will be supported through the "modify-overwrite-beta" spec by adding a regexReplace function to the string functions.

Jolt can, however, support simple pattern matching. For example, if you use the key "*=*" when traversing the tags array values in the spec, it will match values that contain the "=" character. You can use this to account for all possible special characters and then direct those values to an InvalidTags object; whatever is left ("*") can be directed to the valid tags array. Then you can use another spec to remove the InvalidTags. The drawback of this is that you have to know all possible special characters and list them, as in the following spec:

[
{
"operation": "shift",
"spec": {
"tags": {
"*": {
// find all values with special character listed
// below and move to InvalidTags
"*\\\"*": {
"$": "InvalidTags[]"
},
"*-*": {
"$": "InvalidTags[]"
},
"*=*": {
"$": "InvalidTags[]"
},
"*:*": {
"$": "InvalidTags[]"
},
//The values that wont have any of the special
//characters above will be moved to tags
"*": {
"$": "tags[]"
}
}
}
}
},
{
"operation": "remove",
"spec": {
//Remove InvalidTags
"InvalidTags": ""
}
}
]

If you can't account for all possible special characters, then you can't rely on Jolt alone. The simplest way I can think of is to use an UpdateRecord processor before the Jolt, where you can use Expression Language (which supports regex replace functions) to replace all special characters, matched with the regex pattern "\W+", with a common character like "?". Then you can use the Jolt spec above, but list only "*?*" values to be moved to InvalidTags (a small illustration of this regex is shown at the end of this post). In the UpdateRecord, the value for the dynamic property /tags[*], which holds the path to the tag array values, is:

${field.value:replaceAll('\W+','?')}

Note: Based on your input, make sure the JsonRecordSetWriter Output Grouping property is set to "One Line Per Object". The Jolt spec in this case will be as follows:

[
{
"operation": "shift",
"spec": {
"tags": {
"*": {
"*?*": {
"$": "InvalidTags[]"
},
"*": {
"$": "tags[]"
}
}
}
}
},
{
"operation": "remove",
"spec": {
"InvalidTags": ""
}
}
]

This way you don't have to worry about which special characters you might end up with. If you find this helpful please accept solution. Thanks
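As a small illustration of what the replaceAll('\W+','?') step does (shown here with Python's re module purely to demonstrate the regex; the actual replacement happens in UpdateRecord via Expression Language, and the sample tags are made up):

import re

sample_tags = ['validTag', 'has-dash', 'has=equals', 'has "quote"', 'has:colon']

# \W+ matches one or more non-word characters (anything other than
# letters, digits and underscore), mirroring the Expression Language call
flagged = [re.sub(r'\W+', '?', tag) for tag in sample_tags]
print(flagged)
# ['validTag', 'has?dash', 'has?equals', 'has?quote?', 'has?colon']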
11-15-2023
11:06 AM
So if the goal is that at any given time the number of concurrent uploads is never more than 20, then ignore the last part of my response that talks about "Batch Processing at the Group Level". All you need is basically to get the records from the DB as I explained, and on the processor that does the upload set the Concurrent Tasks to 20, with the considerations listed above.

Also, since you have so many records, I would consider using GenerateTableFetch before ExecuteSQL, which will help generate SQL statements that partition the data into different pages, given that you have a numeric column such as a sequential, evenly distributed id. For more info see: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.12.1/org.apache.nifi.processors.standard.GenerateTableFetch/

This way you can set up a schedule on GenerateTableFetch to generate the different partitioned queries, so that the fetched data downstream is processed without causing backpressure or out-of-memory exceptions.

You can accept the solution once all your questions are answered and the issue is resolved.
11-14-2023
11:55 AM
So I'm not sure what you are going to use for inserting the data into Elasticsearch/Solr. It's possible that you want to use concurrent curl API commands with the ExecuteStreamCommand processor. Alternatively, you can use the InvokeHttp processor to construct the API call, which comes equipped with a lot of options and parameters that make the API call much easier. NiFi also comes with out-of-the-box processors to interact with Elasticsearch and Solr, like PutElasticsearchHttp, PutElasticsearchJson, PutSolrRecord, PutSolrContentStream, etc. These can be much easier to set up than using the API directly (a minimal sketch of the raw API option is shown further below).

If you are looking to do multiple uploads concurrently, regardless of which processor you use, you can configure that on the processor itself: right-click the processor, select Configure, select the Scheduling tab, and set the proper value in the "Concurrent Tasks" box, for example 20. Keep the following in mind when you set this value:

1- Single node vs. cluster: if you are using a single node then you can set the value of Concurrent Tasks as high as needed, but if you have a cluster and you are using some load balancing before making the upload to the target system, then you are already doing some concurrency, where a given record might be uploaded at the same time by different nodes, and you have to take that into consideration. For example, if you have a 5 node cluster and 20 records per batch, then each node will take 4 records, and in this case Concurrent Tasks will have the value 4.

2- Make sure the "Maximum Timer Driven Thread Count" is set to >= 20 to allow NiFi to process 20 tasks concurrently; by default this value is set to 10. To get to this value, click the 3 bar icon on the top right, select Controller Settings, and the value should be under the first tab, "General". Recommendation on how to set this value: https://community.cloudera.com/t5/Support-Questions/How-to-configure-NiFi-to-maximize-the-usage-of-system-cores/td-p/189581

When it comes to batching the data from the source database, you can use a processor like ExecuteSQL or ExecuteSQLRecord to fetch the data out. In those processors' configuration you need to set up the DB Connection Pool service to create the connection; for more information refer to:
https://community.cloudera.com/t5/Support-Questions/how-to-configure-and-connect-mysql-with-nifi-and-perform/m-p/109478
https://www.youtube.com/watch?v=fuEosO24fgI

Also in the configuration you can specify that you want at most 20 records per flowfile via "Max Rows Per Flow File", so that you don't get all 2 million rows in one file, which might take a lot of time and memory to process and can result in an error depending on your heap size. ExecuteSQL will give you the result in Avro format. You can use the ConvertAvroToJson processor if you want to convert to JSON, or use the ExecuteSQLRecord processor and set the Record Writer to the desired target format.

If you want to ensure that you are processing 20 records at a time and you want to prevent any records from being processed before the older batch is complete, you can use Batch Processing at the Process Group level as described here: https://www.youtube.com/watch?v=kvJx8vQnCNE&t=296s The idea is to put the processing and uploading of the 20 records in a process group where you configure the "Process Group FlowFile Concurrency" as described in the video above. If your ExecuteSQL fetches 20 rows per flowfile, then you allow one flowfile at a time to enter the group. Inside the group you need to split the records and upload them concurrently.
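For illustration, here is a minimal sketch of the raw API option mentioned above: pushing one batch of 20 JSON records to Elasticsearch's _bulk endpoint from a Python script (for example one invoked via ExecuteStreamCommand). The endpoint URL, index name, and sample records are assumptions; InvokeHttp or PutElasticsearchJson would achieve the same without custom code:

import json
import requests

ES_URL = "http://localhost:9200/_bulk"   # assumed Elasticsearch endpoint
INDEX = "my_index"                       # hypothetical index name

def bulk_insert(records):
    # Build the newline-delimited bulk body: one action line per document.
    lines = []
    for rec in records:
        lines.append(json.dumps({"index": {"_index": INDEX}}))
        lines.append(json.dumps(rec))
    body = "\n".join(lines) + "\n"
    resp = requests.post(ES_URL, data=body,
                         headers={"Content-Type": "application/x-ndjson"})
    resp.raise_for_status()
    return resp.json()

# Example: one batch of 20 records, e.g. the rows from a single flowfile
batch = [{"id": i, "name": f"record-{i}"} for i in range(20)]
print(bulk_insert(batch))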
Hopefully that will at least get you going. If you have any questions please let me know. If you find this helpful please accept solution. Thanks
11-14-2023
09:45 AM
Hi @Sipping1n0s ,

Can you provide more information on the other system? How do you batch insert the 20 records? Is this an API call or some database? Thanks