Member since 07-22-2016
28 Posts
5 Kudos Received
3 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 5786 | 10-26-2018 01:03 AM |
| | 4543 | 11-08-2017 01:05 AM |
| | 1774 | 11-02-2017 10:44 PM |
10-26-2018
01:03 AM
Hi, I had the same issue. After I created the SSLContextService, I had to set the InvokeHTTP property "Always Output Response" to true, which gives you an output. In that output, look for invokehttp.remote.dn. Since it is a 403 "Forbidden" error, it means that this DN does not have access to make the request, but your SSLContextService is working. The next step is to add the identity that makes the HTTPS request (the invokehttp.remote.dn value) in the NiFi Users UI and run the InvokeHTTP again. Hope this helps.
06-12-2018
11:52 PM
Can you post your JSON file content? You can test your JSON path definition at http://jsonpath.com/? (it is easy to use and helps debug any issues with the JSON format).
05-08-2018
11:10 PM
1 Kudo
Hi, do you want to extract all 280 million rows in a single call? Not a good idea! I suggest you loop through the table using the partition and bucket keys (for better query response on Hive). What is your table structure? Based on that, you can run SelectHiveQL in parallel or sequentially in a loop. A one-shot extract won't do. Thx
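To illustrate the loop idea, here is a minimal Python sketch that builds one query per partition value, so each SelectHiveQL call reads a bounded slice. The table name `sales`, the partition column `load_date`, and the helper `partition_queries` are hypothetical, since the actual table structure is not known.

```python
from typing import Iterable, List


def partition_queries(table: str, partition_column: str, values: Iterable[str]) -> List[str]:
    """Build one SELECT per partition value (all names here are hypothetical)."""
    return [
        f"SELECT * FROM {table} WHERE {partition_column} = '{value}'"
        for value in values
    ]


# Hypothetical table and partition values, for illustration only.
queries = partition_queries("sales", "load_date", ["2018-05-01", "2018-05-02"])
for query in queries:
    print(query)
```

Each generated statement could then be handed to SelectHiveQL one at a time or in parallel, depending on how much load the cluster can take.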
04-28-2018
03:43 AM
Hi, your bottleneck is the flowfile repo configuration parameters. Things to look for:
1 - Are your repos sharing the same disk volume? I give each major repo its own disk so they don't have to fight for IO, and they are not on the same disk as the NiFi install.
2 - What is the value of the nifi.queue.swap.threshold parameter? The default is 20000; above this, NiFi will swap even with RAM available. E.g. if your max queue would be 4000000 lines, make sure the threshold is set to allow for it. Note: your JVM memory settings in bootstrap.conf have to comply with the volume allowed by nifi.queue.swap.threshold, so if you have multiple data flows with 400 MB plus whatever else, you need to sum these up and make sure they fit in your JVM heap.
3 - What is the Concurrent Tasks value on the SplitText by 10? (Make sure this is 25% of the split by 1.) E.g. 2
4 - What is the Concurrent Tasks value on the SplitText by 1? E.g. 8
I run a similar data flow and I can do 5000000 rows in 40 sec. This does not include the Kafka push. I hope this helps.
04-26-2018
11:55 PM
Hi, so are the GetFile and split processors taking 5 min? What is the usage pattern of the data after the file is split? Also, it took me a second to get a 400 MB file and split it into 5 files (split by line count).
02-27-2018
04:49 AM
Hi, I managed to get to the bottom of it at https://stackoverflow.com/questions/48985818/apache-nifi-executestreamcommand-wrong-output , where @daggett asked me to set Ignore STDIN to true. This fixed my issue. However, it was not very self-explanatory, because I don't know how and why the output would be changed.
02-26-2018
10:05 AM
I have a NiFi flow that runs some shell scripts using the ExecuteStreamCommand processor, and the output of ExecuteStreamCommand is not correct. The shell command I run is:
if (( $(ps -ef | grep -v grep | grep kibana | wc -l) > 0 )); then echo "1"; else echo "0"; fi;
If the service is up the output should be 1, and if it is down, 0. Simple, but the output is wrong: no matter whether the service is up or down, the output is always 1. Here is a demo of the flow: https://youtu.be/4e00rzerjSQ
Labels: Apache NiFi
02-01-2018
09:50 PM
Are you able to ping S3 or access the S3 endpoint from your NiFi server? If your NiFi server is on a private subnet with no EIP attached, you need to create a VPC endpoint for the S3 service. If this is your case, follow this link: https://aws.amazon.com/blogs/aws/new-vpc-endpoint-for-amazon-s3/
11-23-2017
06:04 AM
Just my two cents on it: if you are seeing a lot of queued flowfiles and you want to clear them without having to reload the config.yml, a very quick and dirty way is to clear all the repos on the MiNiFi edge. Here is how I do it, assuming your MiNiFi sits in /opt. You will lose all queued data by doing this, but you no longer have queues (not ideal).
-- before cleanup
/opt/minifi/bin/minifi.sh flowStatus connection:all:health,stats | tail -2 | head -1| jq '.'
/opt/minifi/bin/minifi.sh stop
rm -rf /opt/minifi/flowfile_repository/*
rm -rf /opt/minifi/content_repository/*
rm -rf /opt/minifi/provenance_repository/*
/opt/minifi/bin/minifi.sh start
-- after cleanup
/opt/minifi/bin/minifi.sh flowStatus connection:all:health,stats | tail -2 | head -1| jq '.'
Ideal solution: use the minifi.sh flowStatus command to push data to the NiFi server and monitor this type of thing. From there you can develop NiFi flows to act on whatever needs to be done on those edge MiNiFi instances. You can even build a dashboard from this data and actually see how the edges are doing. No doubt Hortonworks has this on the to-do list.
11-09-2017
12:39 AM
Yeah, I know it is not very intuitive 🙂
11-08-2017
01:05 AM
1 Kudo
Hi, you almost got it right 🙂 In the Replacement Value, type Shift+Enter; this adds a new line as the replacement value. The search value will remove all empty lines and leading spaces. I have also attached a sample template: replace.xml
11-07-2017
10:16 PM
Thanks @Hellmar Becker
11-05-2017
11:15 PM
Hi, very interesting article. In what day-to-day scenario would this be useful? Thx
11-02-2017
10:44 PM
1 Kudo
Hi, use this. It lists the resources on the server and formats the template IDs into an API call. Note: jq formats the output to be readable.
curl -v -X GET http://NiFiServerIP:Port/nifi-api/resources | jq '.' | grep '"/templates/' | awk '{print $2}' | sed 's/\"//g'| sed 's/\/templates//g' | awk '{print "curl -v -X GET http://NiFiServerIP:Port/nifi-api/templates"$1"/download"}'
The output should be a REST API call:
curl -v -X GET http://NiFiServerIP:Port/nifi-api/templates/885dc02e-cb13-429c-bdvd-9a4e0cbe1212b/download
In the future, if you don't see how something is done in the docs, you can always spy on the nifi-app.log; it gives you what you need to know. Hope this helped.
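The same transformation can be sketched in Python, which may be easier to maintain than the grep/awk/sed chain. This is only a sketch: it assumes the /nifi-api/resources response has already been fetched and parsed, and that it contains a "resources" list whose entries carry an "identifier" such as "/templates/<id>", which is the value the grep above matches on. The helper name `template_download_urls` and the sample data are made up for illustration.

```python
from typing import Any, Dict, List


def template_download_urls(resources: Dict[str, Any], base_url: str) -> List[str]:
    """Turn '/templates/<id>' resource identifiers into template download URLs."""
    prefix = "/templates/"
    urls = []
    for resource in resources.get("resources", []):
        identifier = resource.get("identifier", "")
        # Skip the bare '/templates' entry and anything that is not a template.
        if identifier.startswith(prefix) and len(identifier) > len(prefix):
            template_id = identifier[len(prefix):]
            urls.append(f"{base_url}/nifi-api/templates/{template_id}/download")
    return urls


# Sample payload in the assumed shape; not real server output.
sample = {
    "resources": [
        {"identifier": "/templates", "name": "Templates"},
        {"identifier": "/templates/885dc02e-cb13-429c-bdvd-9a4e0cbe1212b", "name": "Example"},
        {"identifier": "/flow", "name": "NiFi Flow"},
    ]
}
urls = template_download_urls(sample, "http://NiFiServerIP:Port")
for url in urls:
    print(url)
```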
11-02-2017
10:27 PM
1 Kudo
Use port 587; it works for me. See the link here: https://support.google.com/a/answer/176600?hl=en Note: before you start the configuration, make sure that "Less secure apps" is enabled for the desired account.
10-24-2017
10:23 PM
Why don't you use nifi-app_${now():format('yyyy-MM-dd')}_* ? The output will be nifi-app_2017-10-25_*. You can also use attributes for your log prefix and then route on attribute based on the log type.
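For anyone checking the pattern outside NiFi, a rough Python equivalent of that expression is shown below. It is only an analogue of the Expression Language call, not NiFi code; the helper name `log_prefix` is made up, and the date is passed in explicitly so the result is reproducible.

```python
from datetime import date


def log_prefix(day: date) -> str:
    """Build the date-stamped prefix, mirroring format('yyyy-MM-dd')."""
    return f"nifi-app_{day.strftime('%Y-%m-%d')}_"


# Using the date from the example above.
print(log_prefix(date(2017, 10, 25)) + "*")
```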
10-20-2017
12:00 AM
Hi, is there a way to query the H2 database in Apache NiFi? Scope:
1 - I want to be able to list all queues in a Process Group.
2 - I want to be able to associate Process Groups/Processors with their IDs and given names.
This way I can build a dynamic mechanism to better integrate and administer the NiFi work. Thx
Labels: Apache NiFi
09-27-2017
11:08 PM
Hi, I have the following JSON:
{
"ARTGEntryJsonResult": {
"AnnualChargeExemptWaverDate": null,
"Conditions": [
""
],
"ConsumerInformation": {
"DocumentLink": ""
},
"EntryType": "Medicine",
"LicenceClass": "",
"LicenceId": "152567"
},
"Products": [
{
"AdditionalInformation": [],
"Components": [
{
"DosageForm": "Drug delivery system, transdermal",
"RouteOfAdministration": "Transdermal",
"VisualIdentification": "Dull, homogenous"
}
],
"Containers": [
{
"Closure": "",
"Conditions": [
"Store at room temperature"
],
"LifeTime": "2 Years",
"Material": null,
"Temperature": "Store below 25 degrees Celsius",
"Type": "Sachet"
}
],
"EffectiveDate": "2017-09-18",
"GMDNCode": "",
"GMDNTerm": "",
"Ingredients": [
{
"Name": "Fentanyl",
"Strength": "6.3000 mg"
}
],
"Name": "FENTANYL SANDOZ ",
"Packs": [
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "1"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "10"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "2"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "3"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "4"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "5"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "7"
},
{
"PoisonSchedule": "(S8) Controlled Drug",
"Size": "8"
}
],
"SpecificIndications": [
"Management of chronic pain requiring opioid analgesia."
],
"StandardIndications": [],
"Type": "Single Medicine Product",
"Warnings": []
}
]
}
I am able to extract it using SplitJson + EvaluateJsonPath, but I get the array field data in a single row. Example of what I managed to get from this JSON file:
LicenceId | Name | PoisonSchedule | Size
---|---|---|---
152567 | FENTANYL SANDOZ | "(S8) Controlled Drug","(S8) Controlled Drug","(S8) Controlled Drug" | 1,10,2
What I actually want is:
LicenceId | Name | PoisonSchedule | Size
---|---|---|---
152567 | FENTANYL SANDOZ | (S8) Controlled Drug | 1
152567 | FENTANYL SANDOZ | (S8) Controlled Drug | 10
152567 | FENTANYL SANDOZ | (S8) Controlled Drug | 2
Any ideas are much appreciated! Thanks
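To make the target shape concrete, here is a small Python sketch of the flattening itself: one output row per entry in Packs, with LicenceId and Name repeated on each row. It is plain Python rather than a NiFi flow, it assumes Products sits next to ARTGEntryJsonResult at the top level, and the helper name `pack_rows` and the trimmed sample document are made up for illustration.

```python
import json
from typing import Any, Dict, List, Tuple


def pack_rows(doc: Dict[str, Any]) -> List[Tuple[str, str, str, str]]:
    """Emit one (LicenceId, Name, PoisonSchedule, Size) row per pack."""
    licence_id = doc["ARTGEntryJsonResult"]["LicenceId"]
    rows = []
    for product in doc.get("Products", []):
        name = product["Name"].strip()  # the sample name has a trailing space
        for pack in product.get("Packs", []):
            rows.append((licence_id, name, pack["PoisonSchedule"], pack["Size"]))
    return rows


# Trimmed version of the JSON above, keeping only the fields used here.
sample = json.loads("""
{
  "ARTGEntryJsonResult": {"LicenceId": "152567"},
  "Products": [
    {"Name": "FENTANYL SANDOZ ",
     "Packs": [
       {"PoisonSchedule": "(S8) Controlled Drug", "Size": "1"},
       {"PoisonSchedule": "(S8) Controlled Drug", "Size": "10"},
       {"PoisonSchedule": "(S8) Controlled Drug", "Size": "2"}
     ]}
  ]
}
""")
rows = pack_rows(sample)
for row in rows:
    print(" | ".join(row))
```

In flow terms this corresponds to splitting on the Packs array rather than on Products, while carrying LicenceId and Name along with each split.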
Labels: Apache NiFi
09-26-2017
11:32 PM
You don't need to extract the text. Your flow should look like this:
GenerateTableFetch --> ExecuteSQL --> ConvertAvroToJson (optional) --> StoreResults
Your GenerateTableFetch configuration would be: (screenshot)
And the ExecuteSQL configuration is: (screenshot)
You would get a JSON/Avro output format: (screenshot)
Then you can work on your JSON file to extract it.
08-08-2017
06:49 AM
1 Kudo
Ok, I need to give you a big hug! I was going out of my mind with this! I could not imagine why it was not working! Thank you so much, you made my day 🙂
08-07-2017
11:27 PM
Hi all, I am getting my arse kicked by EvaluateJsonPath. The task is to extract some JSON attribute values into a CSV or text format that will be used for inserting into a file, a DB, etc. Here is what I have so far:
GetFile (reads a JSON file) --> SplitJson --> EvaluateJsonPath --> PutFile
(I will throw a merge in between once I get the EvaluateJsonPath to work and understand how it works.) The JSON used is below; this is just dummy JSON data:
{
"results": [
{
"gender": "female",
"name": {
"title": "ms",
"first": "juliette",
"last": "schmitt"
},
"location": {
"street": "7260 esplanade du 9 novembre 1989",
"city": "le havre",
"state": "lot",
"postcode": 58491
},
"email": "juliette.schmitt@example.com",
"login": {
"username": "smallcat938",
"password": "horizon",
"salt": "fvaSIrep",
"md5": "3d1a182db3002dc88cd54b258838aa89",
"sha1": "71600ad7a6f65ec00fed7521bcf3d943a87a8172",
"sha256": "cd9bfc086b5b83a935c99b496c27fc58716521a43c2630e286f87147e4b4dd8a"
},
"dob": "1959-08-16 23:11:26",
"registered": "2010-07-16 20:27:45",
"phone": "04-09-99-35-16",
"cell": "06-87-17-31-45",
"id": {
"name": "INSEE",
"value": "259755098491 56"
},
"picture": {
"large": "https://randomuser.me/api/portraits/women/71.jpg",
"medium": "https://randomuser.me/api/portraits/med/women/71.jpg",
"thumbnail": "https://randomuser.me/api/portraits/thumb/women/71.jpg"
},
"nat": "FR"
}
],
"info": {
"seed": "ca021f036c48503a",
"results": 1,
"page": 1,
"version": "1.1"
}
}
1 - GetFile setup: Getfile
2 - SplitJson setup: SplitJson
3 - EvaluateJsonPath setup: EvaluateJsonPath
Note: this one is working! Now adding a new property to the EvaluateJsonPath gets me stuck: ErrorJsonPaths
The documentation is not very illustrative on how this should be used, and I am not a JSON expert. I am trying to understand how the Twitter template actually does it (twitt.png), where multiple paths are declared. Thanks all
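As a reference for what the flow is meant to produce, here is a plain Python sketch of the same idea: split on the results array, pull a few nested fields from each record, and write them as a CSV line. It is not NiFi code; the field list and the helper names `lookup` and `results_to_csv` are chosen for illustration. Assuming SplitJson splits on $.results, the matching JSONPath expressions on each split would be along the lines of $.gender, $.name.first, $.name.last and $.email.

```python
import csv
import io
import json
from typing import Any, Dict, List

# Dotted paths into each record; illustrative choice of fields.
FIELDS = ["gender", "name.first", "name.last", "email"]


def lookup(record: Dict[str, Any], dotted_path: str) -> Any:
    """Follow a dotted path such as 'name.first' into nested dictionaries."""
    value: Any = record
    for key in dotted_path.split("."):
        value = value[key]
    return value


def results_to_csv(doc: Dict[str, Any], fields: List[str]) -> str:
    """Write one CSV line per entry in doc['results']."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in doc["results"]:
        writer.writerow([lookup(record, field) for field in fields])
    return buffer.getvalue()


# Trimmed version of the dummy JSON above.
sample = json.loads(
    '{"results": [{"gender": "female",'
    ' "name": {"title": "ms", "first": "juliette", "last": "schmitt"},'
    ' "email": "juliette.schmitt@example.com"}]}'
)
csv_text = results_to_csv(sample, FIELDS)
print(csv_text, end="")
```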
Labels: Apache NiFi