Member since: 06-08-2017
Posts: 1049
Kudos Received: 518
Solutions: 312
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 11124 | 04-15-2020 05:01 PM |
| | 7026 | 10-15-2019 08:12 PM |
| | 3068 | 10-12-2019 08:29 PM |
| | 11254 | 09-21-2019 10:04 AM |
| | 4190 | 09-19-2019 07:11 AM |
07-07-2018
02:36 AM
@Shu Thank you very much
07-05-2018
12:33 PM
@Markus Wilhelm I don't think we can make NiFi read the Kerberos configs by default, but you can make use of Process Group variables in your HDFS processor configs and define the variables at NiFi Flow scope so that the same variables can be used across all the processors in the NiFi instance. You can also copy hdfs-site.xml and core-site.xml to the NiFi lib path and restart NiFi; then you don't have to specify the path, because NiFi will load all the .xml files from the lib path. However, that is not the recommended approach, because if you want to change some configs in either of those two xml files, NiFi has to be restarted for the changes to take effect. Refer to this link regarding Process Group variables in NiFi and this link regarding copying xml files into the nifi lib.
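For illustration, a minimal sketch of how Process Group variables could be referenced in an HDFS processor; the variable names and paths below are assumptions, not taken from the original thread:

```
# Process Group variables defined at NiFi Flow scope (hypothetical names/values):
#   hadoop.conf.dir     = /etc/hadoop/conf
#   kerberos.principal  = nifi@EXAMPLE.COM
#   kerberos.keytab     = /etc/security/keytabs/nifi.keytab

# PutHDFS (or any HDFS processor) properties:
Hadoop Configuration Resources : ${hadoop.conf.dir}/core-site.xml,${hadoop.conf.dir}/hdfs-site.xml
Kerberos Principal             : ${kerberos.principal}
Kerberos Keytab                : ${kerberos.keytab}
```

Changing a variable value then only restarts the components that reference it, which avoids the full NiFi restart required by the lib-path approach.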
05-29-2019
03:18 PM
How do I split a complex JSON array into individual JSON objects with the SplitJson processor in NiFi? I don't know how to configure the relationships original, split, and failure. The JSON array is below:

{
  "scrollId1": "xyz",
  "data": [
    {
      "id": "app-server-dev-glacier",
      "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
      "name": "app-server-dev-glacier",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "ap-southeast-1",
      "account": "164110977718"
    },
    {
      "id": "abc.company.archive.mboi",
      "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
      "name": "abc.company.archive.mboi",
      "type": "archiveStorage",
      "provider": "aws",
      "region": "us-east-1",
      "account": "852631421774"
    }
  ]
}

I need to split it into:

{
  "id": "app-server-dev-glacier",
  "uuid": "a0733c21-6044-11e9-9129-9b2681a9a063",
  "name": "app-server-dev-glacier",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "ap-southeast-1",
  "account": "164110977718"
},
{
  "id": "abc.company.archive.mboi",
  "uuid": "95100b11-6044-11e9-977a-f5446bd21d81",
  "name": "abc.company.archive.mboi",
  "type": "archiveStorage",
  "provider": "aws",
  "region": "us-east-1",
  "account": "852631421774"
}

Next, I need to insert another field "time" in front of "id", the first attribute of each individual object. I used the SplitJson processor with JSON Path Expression $.data.id.*, but the relationship reports an error. I don't know how to configure the relationship branches original, split, and failure. Does anyone have any advice? @Shu
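For context, a SplitJson configuration along these lines is what the question is aiming at; the $.data expression and the relationship routing below are a sketch based on typical usage, not something stated in the post:

```
SplitJson
  JsonPath Expression : $.data    # point at the array itself; one flowfile per array element

# Typical relationship routing:
#   split    -> downstream processors (each flowfile holds one object from "data")
#   original -> auto-terminate, or keep if the unsplit JSON is needed for auditing
#   failure  -> auto-terminate, or route to a retry/error handler
```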
07-09-2018
01:22 PM
1 Kudo
@Derek Calderon Short answer is no. The ExecuteSQL processor is written to write its output to the FlowFile's content.

There is an alternative solution. You have some processor currently feeding FlowFiles to your ExecuteSQL processor via a connection. My suggestion would be to feed that same connection to two different paths. The first connection feeds a MergeContent processor via a funnel, and the second feeds your ExecuteSQL processor. The ExecuteSQL processor performs the query and writes the data you are looking for to the content of the FlowFile. You then use a processor like ExtractText to extract that FlowFile's new content into FlowFile attributes. Next you use a processor like ModifyBytes to remove all content from this FlowFile. Finally you feed this processor to the same funnel as the other path. The MergeContent processor can then merge these two FlowFiles using the "Correlation Attribute Name" property (assuming "filename" is unique, that could be used), min/max entries set to 2, and "Attribute Strategy" set to "Keep All Unique Attributes". The result should be what you are looking for.

The flow would look something like the following:

Having multiple identical connections does not trigger NiFi to write the 200 MB of content twice to the content repository. A new FlowFile is created, but it points to the same content claim. New content is only generated when ExecuteSQL runs against one of the FlowFiles, so this flow does not produce any additional write load on the content repository other than when ExecuteSQL writes its output, which I am assuming is relatively small.

Thank you, Matt
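A rough sketch of the MergeContent settings described above; the values are illustrative and assume "filename" is unique per FlowFile pair:

```
MergeContent
  Merge Strategy             : Bin-Packing Algorithm
  Correlation Attribute Name : filename
  Minimum Number of Entries  : 2
  Maximum Number of Entries  : 2
  Attribute Strategy         : Keep All Unique Attributes
```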
07-04-2018
08:41 PM
@Vengai Magan Please refer to this and this links, which describe how to install NiFi as a service, and this link to set up a high-performance NiFi.
06-27-2018
02:25 PM
Perfect! Thanks! I'll try QueryDatabaseTable for it. It'll be better!
06-27-2018
12:03 PM
1 Kudo
@Vladislav Shcherbakov Before the ReplaceText processor, use an EvaluateJsonPath processor to extract the JSON values and keep them as flowfile attributes. Add all your properties (case sensitive) in this processor and keep the Destination as flowfile-attribute, then feed the success relationship from EvaluateJsonPath to the ReplaceText processor.

Flow:
... other processors
3. SplitJson
5. EvaluateJsonPath
6. ReplaceText
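As a rough sketch of those two steps; the attribute names, JSON paths, and replacement text below are placeholders, not taken from the original flow:

```
EvaluateJsonPath
  Destination : flowfile-attribute
  id          : $.id        # dynamic property: attribute name -> JsonPath
  name        : $.name

ReplaceText
  Replacement Strategy : Always Replace
  Replacement Value    : ${id},${name}    # build whatever text you need from the extracted attributes
```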
06-27-2018
03:56 AM
@Raj ji You can use the ExecuteProcess processor (doesn't allow any incoming connections) or the ExecuteStreamCommand processor to trigger the shell script.

ExecuteProcess configs: as your executable script is on Machine 4 and NiFi is installed on Machine 1, create a shell script on Machine 1 that ssh's into Machine 4 and triggers your Python script. Refer to this and this, which describe how to use a username/password while doing ssh to a remote machine.

As you are going to store the logs in a file, you can use the TailFile processor to tail the log file, check whether there is any ERROR/WARN using the RouteText processor, and then trigger a mail. Alternatively, fetch the application id (or application name) of the process and use the YARN REST API to get the status of the job. Please refer to how to monitor yarn applications using NiFi and Starting Spark jobs directly via YARN REST API; this link also describes the YARN REST API capabilities.
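For example, a minimal ExecuteProcess configuration along these lines could trigger the remote script; the script path, host, and user below are hypothetical:

```
ExecuteProcess
  Command           : /bin/bash
  Command Arguments : /opt/scripts/run_remote.sh

# run_remote.sh on Machine 1 would do something like:
#   ssh user@machine4 'python /path/to/job.py' >> /var/log/job.log 2>&1
```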
06-29-2018
09:29 AM
@Murat Menteşe As your xml doc has an array [] in it, I'm not sure how to write a matching xslt, as the current xslt converts the array xml into an object/element and adds "" for the array []. In case of large data you have to increase the Maximum Buffer Size (default 1 MB; increase it based on your flowfile size), as this processor takes the whole flowfile into memory and does all the replacing based on your configs.
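For example, assuming the processor doing the replacing is ReplaceText, the property could be raised like this; the 10 MB value is only an illustration:

```
ReplaceText
  Maximum Buffer Size : 10 MB    # must be larger than the biggest incoming flowfile
```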
06-29-2018
09:18 AM
@Amira khalifa Use one of the ways from the above shared link to take out only the header from the csv file. Then in ReplaceText, search for (&|\(|\)|\/_|\s) and keep the Replacement Value as an empty string; now we are searching for all the special characters in the header flowfile and replacing them with an empty string. Then add this header flowfile back to the other, non-header flowfile. All the explanation and the template.xml are shared in this link.
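A sketch of that ReplaceText configuration, using the regex from the answer; the strategy and evaluation mode shown are typical choices, not confirmed by the original post:

```
ReplaceText (applied to the header-only flowfile)
  Replacement Strategy : Regex Replace
  Search Value         : (&|\(|\)|\/_|\s)
  Replacement Value    :                    # set to an empty string
  Evaluation Mode      : Line-by-Line
```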