Member since: 08-24-2020
Posts: 15
Kudos Received: 0
Solutions: 0
05-20-2021
02:33 PM
@P_Rat98 Can you share how you have the ListS3 processor configured? What can you tell us about the file being listed? Is it constantly being updated? Thanks, Matt
11-02-2020
05:15 AM
@P_Rat98 Per our PM discussion, use DetectDuplicate in your flow before sending an email. This rate-limits the number of messages you send, based on how you configure DetectDuplicate. Additionally, when it is linked into your flow and duplicates are auto-terminated, it drains the flow and stops the queue from filling up. As suggested, you can also choose to retain the duplicates but route them into a much bigger queue that will not back up the main flow. Then, once you see the email, you can look at the flow, see which flowfiles were causing issues, and take corrective action.

If you really need to monitor the flow for a full queue, you would need to use the NiFi API to check the status of the queue. This may be more work than it is worth when the approach above solves the problem much more easily. However, I recommend checking out the API: it has a lot of capabilities, and I am beginning to use NiFi API calls within my flows to monitor, stop, start, and take actions automatically that would normally require a person working in the UI.

If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.

Thanks, Steven
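As a rough illustration of the queue check mentioned above, here is a minimal Python sketch. It assumes an unsecured NiFi instance, the REST endpoint `/nifi-api/flow/connections/{id}/status`, and the `percentUseCount` / `percentUseBytes` fields in its response; the function names, the base URL, and the 80% threshold are hypothetical. Verify the endpoint and field names against the REST API documentation for your NiFi version before relying on it.

```python
import json
import urllib.request


def queue_usage(status: dict) -> dict:
    """Pull the queue figures out of a connection status response.

    Assumes the response is shaped like
    {"connectionStatus": {"aggregateSnapshot": {...}}}.
    """
    snap = status["connectionStatus"]["aggregateSnapshot"]
    return {
        "name": snap.get("name"),
        "flowfiles_queued": snap.get("flowFilesQueued", 0),
        # These fields may be missing or null; treat that as 0% used.
        "percent_use_count": snap.get("percentUseCount") or 0,
        "percent_use_bytes": snap.get("percentUseBytes") or 0,
    }


def is_backing_up(status: dict, threshold_percent: int = 80) -> bool:
    """True when either back-pressure measure reaches the threshold."""
    usage = queue_usage(status)
    fullest = max(usage["percent_use_count"], usage["percent_use_bytes"])
    return fullest >= threshold_percent


def fetch_connection_status(base_url: str, connection_id: str, timeout: int = 10) -> dict:
    """Fetch the status of one connection (queue) from the NiFi REST API."""
    url = f"{base_url}/nifi-api/flow/connections/{connection_id}/status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A scheduled script, or an InvokeHTTP processor inside the flow, could call the same endpoint and raise an alert when `is_backing_up` returns true. A secured cluster would also need an access token or client certificate, which this sketch leaves out.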
10-19-2020
01:21 PM
The processor you are looking for is ReplaceText: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.12.0/org.apache.nifi.processors.standard.ReplaceText/ You can find plenty of examples here in the forum with this search: https://community.cloudera.com/t5/forums/searchpage/tab/message?advanced=false&allow_punctuation=false&q=replaceText Thanks, Steven
09-04-2020
06:25 AM
@P_Rat98 You need parquet-tools to read Parquet files from the command line. There is no way to view Parquet content in NiFi itself. https://pypi.org/project/parquet-tools/
08-28-2020
08:52 AM
@P_Rat98 The error above indicates a problem with the schema name in your record reader or writer. In the ConvertRecord properties, click the --> arrow to open the reader/writer and make sure both are configured correctly. You will need to provide the correct schema name (if it already exists as an attribute) or provide the schema text. Thanks, Steven @ DFHZ
08-28-2020
08:47 AM
@P_Rat98 To save separate S3 objects, you need to give each Parquet file a unique filename (Object Key). If that processor is configured with just ${filename}, later executions will overwrite the earlier object.

For the second option, if you have a split in your data flow, the split parts should carry key/value attributes for the split index and the total number of splits. Inspect your queue and list the attributes on the split flowfiles to find them. You use these attributes with MergeContent to merge everything back together into a single flowfile. You need to do this before converting to Parquet, not after.

Thanks, Steven @ DFHZ
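To make the re-merge idea above concrete, here is a small Python simulation of merging by split attributes. It is not NiFi code: it only mimics the logic, assuming the split flowfiles carry the standard `fragment.identifier`, `fragment.index`, and `fragment.count` attributes that NiFi split processors normally write. The dictionary layout and the `defragment` function are hypothetical, so check the attribute names on your own queued flowfiles.

```python
from collections import defaultdict


def defragment(flowfiles):
    """Re-merge split parts into one content string per original flowfile.

    Each flowfile is modelled as {"attributes": {...}, "content": str}.
    Parts are grouped by fragment.identifier, ordered by fragment.index,
    and merged only once all fragment.count parts have arrived.
    """
    groups = defaultdict(list)
    for ff in flowfiles:
        groups[ff["attributes"]["fragment.identifier"]].append(ff)

    merged = {}
    for identifier, parts in groups.items():
        expected = int(parts[0]["attributes"]["fragment.count"])
        if len(parts) != expected:
            continue  # incomplete bundle: keep waiting, do not merge yet
        parts.sort(key=lambda ff: int(ff["attributes"]["fragment.index"]))
        merged[identifier] = "".join(ff["content"] for ff in parts)
    return merged
```

In NiFi itself, the Defragment merge strategy of MergeContent performs the equivalent grouping and ordering, so no custom code is needed in the flow.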
08-27-2020
06:08 AM
1 Kudo
@P_Rat98 Creating an API with NiFi using HandleHttpRequest and HandleHttpResponse is something I have done quite a few times for Hortonworks and NiFi customers. This is a great use case for NiFi: sending and receiving JSON, processing it, and completing actions downstream is super easy.

I have created a basic template for you which includes HandleHttpRequest (listening for inbound calls on port 80), a process group for doing something with the JSON, and HandleHttpResponse (returns a 200 response code) to respond to the inbound call. This is an API in its simplest form with NiFi. Depending on your use case, you can build out the Process API Request process group to suit your needs. Out of the box you should be able to import the template, add and start the StandardHttpContextMap controller service, start the flow, send a call to http://yourhost:80, and have JSON sitting in the Success queue at the bottom of the flow.

You can find the template here: https://github.com/steven-matison/NiFi-Templates/blob/master/NiFi_API_with_HandleHttpRequest_Demo.xml

Some API suggestions:

Be sure to take a look at both HandleHttp processors for the properties you can configure: ports, hostname, acceptable methods, SSL, authentication, and more.

If your API caller does not care whether Process API Request finishes, you can put HandleHttpResponse right after HandleHttpRequest and let all the downstream work happen after the request/response is completed. This is common when I expect the API to only receive inbound data and the caller does not care about the response (other than a 200 to know the data was received). In this case I accept the payload, return 200, and the rest of the flow is decoupled from the connection. If my processing time is lengthy, I usually do this so the system initiating the API call is not left waiting.

Once you have the basic framework built, consider handling errors and/or returning different status codes as a variable (created before the response) in the Status Code property of HandleHttpResponse. Sometimes I even have different HandleHttpResponse processors at the end of different flow branches. For example, if someone sends invalid JSON, I return an error status such as 400 with the error details as the content body.

Have fun with it.

Thanks, Steven @ DFHZ
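For testing a flow like the one described above, a minimal Python client sketch follows. It assumes the HandleHttpRequest processor accepts POST requests with a JSON body at http://yourhost:80 (the placeholder host from the post); the function names and the sample payload are hypothetical.

```python
import json
import urllib.request


def build_json_request(url, payload):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return (status code, response body).

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    caller that wants to inspect error bodies should catch that exception.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Usage would look like `send(build_json_request("http://yourhost:80", {"id": 1}))`. With the decoupled layout, where HandleHttpResponse sits right after HandleHttpRequest, the call should return as soon as NiFi has accepted the payload.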