Member since
06-08-2017
1049
Posts
514
Kudos Received
312
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 8437 | 04-15-2020 05:01 PM |
| | 4959 | 10-15-2019 08:12 PM |
| | 1884 | 10-12-2019 08:29 PM |
| | 8165 | 09-21-2019 10:04 AM |
| | 2820 | 09-19-2019 07:11 AM |
06-27-2018
03:56 AM
@Raj ji You can use the ExecuteProcess processor (which doesn't allow any incoming connections) or the ExecuteStreamCommand processor to trigger the shell script. ExecuteProcess configs: since your executable script is on Machine 4 and NiFi is installed on Machine 1, create a shell script on Machine 1 that SSHes into Machine 4 and triggers your Python script. Refer to this link and this link, which describe how to use a username/password when SSHing to a remote machine. Since you are storing the logs in a file, you can use the TailFile processor to tail the log file, check whether there are any ERROR/WARN lines using the RouteText processor, and then trigger a mail. Alternatively, fetch the application id or application name of the process and use the YARN REST API to get the status of the job. Please refer to how to monitor YARN applications using NiFi and Starting Spark jobs directly via YARN REST API; this link describes the YARN REST API capabilities.
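If you go the YARN REST API route, here is a rough Python sketch; the ResourceManager host and the trimmed response shape are my assumptions, so check them against your cluster before relying on this:

```python
import json
from urllib.request import urlopen  # only needed if you actually query the RM

# Hypothetical ResourceManager address -- replace with your own.
RM_URL = "http://resourcemanager-host:8088"

def app_status(payload: dict) -> tuple:
    """Extract (state, finalStatus) for the first application in a
    YARN /ws/v1/cluster/apps response body."""
    app = payload["apps"]["app"][0]
    return app["state"], app["finalStatus"]

def fetch_app_status(app_id: str) -> tuple:
    """Query the YARN REST API for a single application's status."""
    with urlopen(f"{RM_URL}/ws/v1/cluster/apps/{app_id}") as resp:
        body = json.load(resp)
    # The single-app endpoint nests the record directly under "app".
    app = body["app"]
    return app["state"], app["finalStatus"]

# Sample (trimmed) response shape for the cluster apps endpoint:
sample = {"apps": {"app": [{"id": "application_1528000000000_0001",
                            "state": "FINISHED",
                            "finalStatus": "SUCCEEDED"}]}}
print(app_status(sample))  # ('FINISHED', 'SUCCEEDED')
```

You could wire the same check into NiFi with InvokeHTTP plus EvaluateJsonPath instead of standalone Python.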
06-27-2018
03:12 AM
@Cody kamat While running a Hive import, the --target-dir argument controls where the data is stored temporarily before it is loaded into the Hive table; --target-dir does not create the Hive table in that location. If you want to import into a specific directory, use --target-dir without the --hive-import argument and create the Hive table on top of the HDFS directory. Alternatively, create a Hive external table pointing to your target-dir, then in the sqoop import remove the --create-hive-table and --target-dir arguments. For more info, refer to this HCC thread regarding the same issue.
06-27-2018
03:00 AM
@Amira khalifa As suggested by @anarasimham, using the start-of-string anchor (^) in the ReplaceText processor should match only the first line. Make sure your matching regex excludes special characters. Example: in the configs above I'm matching only the first line in the flowfile and prepending "new" to that line only; all the other content is left untouched.
Input:
hi
hello
Output:
newhi
hello
Alternatively, you can use one of the ways I suggested in this link; please refer to the shared link and choose the method that best fits your case. If you are still having issues, please share some sample data with the header and the expected output.
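To see why the ^ anchor touches only the first line, here is a rough Python analogy of that ReplaceText behaviour (the processor itself is configured in NiFi; this is only an illustration):

```python
import re

content = "hi\nhello"

# In MULTILINE mode ^ would anchor every line, but count=1 restricts
# the substitution to the first match, i.e. the first line only.
result = re.sub(r"^", "new", content, count=1, flags=re.MULTILINE)
print(result)
# newhi
# hello
```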
06-26-2018
01:09 PM
@Marco Springer There is a similar HCC thread regarding list processors in NiFi; please refer to this link for more details. Let us know if you have additional questions!
06-26-2018
03:00 AM
Even though you have a reduced JSON, you can still use the MergeRecord processor to merge single JSON messages into an array of JSON messages, using JsonTreeReader/JsonRecordSetWriter controller services. Configure the Min/Max Number of Records per flowfile, and use the Max Bin Age property as a wildcard to make a bin eligible to merge. Then feed the merged relationship to the PutHBaseRecord processor (give the row identifier field name from your JSON message), as the purpose of record-oriented processors is to work with chunks of data for good performance instead of working with one record at a time.
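Conceptually, what MergeRecord produces with those reader/writer services is one flowfile holding an array of the individual records. A minimal Python sketch of that transformation (the sample messages are made up):

```python
import json

# Three flowfiles, each holding a single JSON message.
messages = ['{"id": 1, "val": "a"}',
            '{"id": 2, "val": "b"}',
            '{"id": 3, "val": "c"}']

# MergeRecord with JsonTreeReader/JsonRecordSetWriter effectively
# emits one flowfile whose content is an array of those records.
merged = json.dumps([json.loads(m) for m in messages])
print(merged)  # [{"id": 1, "val": "a"}, {"id": 2, "val": "b"}, {"id": 3, "val": "c"}]
```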
06-26-2018
01:36 AM
@Murat Menteşe You can use the ReplaceText processor after the TransformXml processor, then add a matching regex that strips the "(quotes) before/after the array []. ReplaceText configs: Search Value
(.*)"(\[.*\])"(.*)
Replacement Value
$1$2$3
Character Set
UTF-8
Maximum Buffer Size
1 MB //increase the size according to your flowfile size
Replacement Strategy
Regex Replace
Evaluation Mode
Entire text
Input:
{ "soap:Envelope": { "soap:Body": { "Musteri_Hiyerarsi_TablosuResponse": { "Musteri_Hiyerarsi_TablosuResult": "[ { "UNIQ_KEY": 740281.0, "TTALT": 112.0, "TTAD": "TEST" } ]" } } } }
Output (valid JSON):
{ "soap:Envelope": { "soap:Body": { "Musteri_Hiyerarsi_TablosuResponse": { "Musteri_Hiyerarsi_TablosuResult": [ { "UNIQ_KEY": 740281.0, "TTALT": 112.0, "TTAD": "TEST" } ] } } } }
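You can verify the regex outside NiFi first. A quick Python check with a trimmed version of your payload (re.DOTALL stands in for the Entire text evaluation mode, where . also spans newlines):

```python
import json
import re

# Trimmed version of the TransformXml output: the array is wrapped
# in quotes, which makes the document invalid JSON.
xml_output = ('{ "soap:Envelope": { "soap:Body": { '
              '"Musteri_Hiyerarsi_TablosuResponse": { '
              '"Musteri_Hiyerarsi_TablosuResult": '
              '"[ { "UNIQ_KEY": 740281.0 } ]" } } } }')

# Same Search/Replacement values as the ReplaceText config.
fixed = re.sub(r'(.*)"(\[.*\])"(.*)', r'\1\2\3', xml_output, flags=re.DOTALL)
print(json.loads(fixed))  # parses cleanly once the quotes are gone
```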
06-25-2018
10:09 PM
1 Kudo
@Ferrero Rocher Use the InvokeHTTP processor, which allows incoming connections, instead of the GetHTTP processor. Flow:
1. GetHTTP
2. SplitJson //split the array with $.*
3. EvaluateJsonPath //extract the id value and keep it as a state attribute
4. InvokeHTTP //http://${DOMAIN}/api/states/${state}/municipalities
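The steps above can be sketched in plain Python; the sample payload and domain are placeholders, not your actual API:

```python
import json

# Hypothetical GetHTTP response: an array of state records.
states_json = '[{"id": "NY"}, {"id": "CA"}]'
domain = "example.com"  # stand-in for ${DOMAIN}

# SplitJson ($.*) emits one flowfile per array element; EvaluateJsonPath
# then lifts "id" into a `state` attribute that InvokeHTTP's URL uses.
urls = [f"http://{domain}/api/states/{rec['id']}/municipalities"
        for rec in json.loads(states_json)]
print(urls[0])  # http://example.com/api/states/NY/municipalities
```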
06-25-2018
12:25 PM
3 Kudos
@Faisal Durrani Use the record-oriented PutHBaseRecord processor instead of PutHBaseJson. PutHBaseRecord works with chunks of data based on the specified Record Reader (JsonTreeReader): you can send an array of JSON messages/records to the processor, and based on the record reader controller service it reads the JSON messages/records and puts them into HBase. Adjust the batch size to get good performance. Batch Size 1000: the maximum number of records to be sent to HBase at any one time from the record set. Refer to this link to configure the Record Reader controller service.
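The Batch Size behaviour amounts to chunking the record set; a small sketch of that idea (record counts are arbitrary):

```python
def batches(records, batch_size=1000):
    """Yield successive chunks, mimicking how PutHBaseRecord sends at
    most Batch Size records to HBase per round trip."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

rows = [{"row_id": n} for n in range(2500)]
sizes = [len(b) for b in batches(rows)]
print(sizes)  # [1000, 1000, 500]
```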
06-24-2018
02:20 PM
1 Kudo
@Gourav Bhattacharya Once triggered, the GetFTP processor gets all the files from the directory based on your configurations. To control the rate of fetching files, use the ListFTP processor (which lists 0-byte flowfiles) instead of GetFTP, then use the ControlRate processor to control the rate at which flowfiles pass through, and feed the success relation to the FetchFTP processor. Flow:
1. ListFTP
2. ControlRate //control the rate of flowfiles
3. FetchFTP
In addition, refer to this link to get the data and process one flowfile at a time without using the ControlRate processor.
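As a loose illustration of what ControlRate adds to this flow, here is a sketch that groups listed entries into per-period batches (the file names and rate are invented; the real processor throttles by a configured Maximum Rate over a Time Duration):

```python
from collections import deque

def control_rate(flowfiles, max_per_period):
    """Group queued flowfiles into scheduling periods, roughly what
    ControlRate does when only max_per_period flowfiles may pass
    per time window."""
    queue, periods = deque(flowfiles), []
    while queue:
        n = min(max_per_period, len(queue))
        periods.append([queue.popleft() for _ in range(n)])
    return periods

listings = [f"file_{i}.csv" for i in range(5)]  # ListFTP output (0-byte entries)
print(control_rate(listings, 2))
# [['file_0.csv', 'file_1.csv'], ['file_2.csv', 'file_3.csv'], ['file_4.csv']]
```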
06-21-2018
02:11 PM
1 Kudo
@rajat puchnanda Method 1: You can use the QueryRecord processor and add a new dynamic property whose value is a SQL query like select id,count(*) from flowfile group by id; the processor will run the SQL query on the flowfile content and give the result as the output flowfile. Refer to this link for more details on configuring the QueryRecord processor. Method 2: If you want to group all like records together, you can use the PartitionRecord processor and specify the record path you want to group by; the processor will then group all the like records into individual groups. Refer to this link for more details on configuring the PartitionRecord processor.
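For intuition, the QueryRecord group-by above is equivalent to this small Python computation (the sample records are made up):

```python
from collections import Counter

records = [{"id": 1}, {"id": 2}, {"id": 1}, {"id": 1}]

# Equivalent of QueryRecord's: select id, count(*) from flowfile group by id
counts = Counter(r["id"] for r in records)
result = [{"id": k, "count": v} for k, v in counts.items()]
print(result)  # [{'id': 1, 'count': 3}, {'id': 2, 'count': 1}]
```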