Member since: 07-22-2016
Posts: 28
Kudos Received: 5
Solutions: 3
My Accepted Solutions
| Views | Posted |
| --- | --- |
| 6223 | 10-26-2018 01:03 AM |
| 4923 | 11-08-2017 01:05 AM |
| 1914 | 11-02-2017 10:44 PM |
10-26-2018
01:03 AM
Hi, I had the same issue. After I created the SSLContextService, I had to set the InvokeHTTP property "Always Output Response" to true, which gives you an output even on failure. In that output, look for the invokehttp.remote.dn attribute. Since it is a 403 "Forbidden" error, it means the DN does not have access to make this request, but your SSLContextService is working. The next step is to add the identity that makes the HTTPS request (invokehttp.remote.dn) in the NiFi Users UI and run the InvokeHTTP again. Hope this helps.
06-12-2018
11:52 PM
Can you post your JSON file content? Test your JSON path definition at http://jsonpath.com/ — it is easy and helps you debug any issues with the JSON format.
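If you want to sanity-check the path locally before wiring it into the flow, plain navigation in Python mirrors what a JSONPath expression does. The document and path below are made up for illustration:

```python
import json

# Hypothetical sample of the kind of JSON a flowfile might carry.
doc = json.loads('{"store": {"book": [{"title": "NiFi in Action", "price": 39.99}]}}')

# The JSONPath $.store.book[0].title corresponds to this plain navigation:
title = doc["store"]["book"][0]["title"]
print(title)  # -> NiFi in Action
```

If `json.loads` raises, the content is not valid JSON in the first place, which is the most common cause of a path evaluating to nothing.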
05-08-2018
11:10 PM
1 Kudo
Hi, you want to extract all 280 million rows in a single call? Not a good idea! I suggest you loop through them using the table partition + bucket keys (for better query response on Hive). What is your table structure? Based on that, you can run SelectHiveQL in parallel or sequentially through a loop. A one-shot extract won't do. Thx
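The loop idea amounts to generating one bounded query per partition instead of one huge pull. A minimal sketch — the table name, partition column, and values are hypothetical, substitute your own schema:

```python
# Hypothetical table and partition values; substitute your own schema.
table = "sales"
partition_col = "ds"
partitions = ["2018-05-01", "2018-05-02", "2018-05-03"]

# One SelectHiveQL-style statement per partition instead of one 280M-row pull.
queries = [
    f"SELECT * FROM {table} WHERE {partition_col} = '{p}'"
    for p in partitions
]
for q in queries:
    print(q)
```

Each generated statement can then feed its own SelectHiveQL run, in parallel or one at a time.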
04-28-2018
03:43 AM
Hi, your bottleneck is the flowfile repository configuration. Things to look for:
1 - Are your repos sharing the same disk volume? I give each major repo its own disk so they don't fight for IO, and they are not on the same disk as the NiFi install.
2 - What is the value of the nifi.queue.swap.threshold parameter (in nifi.properties)? The default is 20000, and above this NiFi will swap even with RAM available. E.g. if your max queue would be 4,000,000 lines, make sure the threshold is set accordingly. Note: your JVM memory settings in bootstrap.conf have to comply with the volume allowed by nifi.queue.swap.threshold, so if you have multiple data flows of 400 MB plus whatever else, you need to sum this up to fit in your JVM heap.
3 - What is the Concurrent Tasks value on the SplitText by 10? (Make sure this is 25% of the Split by 1, e.g. 2.)
4 - What is the Concurrent Tasks value on the SplitText by 1? E.g. 8.
I do a similar data flow and I can do 5,000,000 rows in 40 sec (not counting the Kafka push). I hope this helps.
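The sizing rule in point 2 can be sketched as back-of-the-envelope arithmetic. The per-flowfile footprint and heap size below are assumptions for illustration, not measurements:

```python
# Assumed numbers for illustration only.
swap_threshold = 20000          # nifi.queue.swap.threshold (default)
max_queue_lines = 4_000_000     # expected max queued flowfiles
avg_flowfile_bytes = 120        # assumed per-flowfile heap footprint
heap_bytes = 2 * 1024**3        # e.g. -Xmx2g in bootstrap.conf

will_swap = max_queue_lines > swap_threshold
needed = max_queue_lines * avg_flowfile_bytes
print(f"swaps at default threshold: {will_swap}, "
      f"needs ~{needed / 1024**2:.0f} MB of {heap_bytes / 1024**2:.0f} MB heap")
```

The point is simply that the threshold and the heap have to be sized together: raising the threshold without raising the heap moves the pressure from disk to the JVM.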
04-26-2018
11:55 PM
Hi, so are the GetFile & SplitText processors taking 5 min? What is the usage pattern of the data after the file is split? Also, it took me about a second to split a 400 MB file into 5 files (split by line count).
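For reference, splitting by line count (what SplitText does) is cheap in itself. A standalone sketch on a small synthetic file, with a made-up chunk size:

```python
import os
import tempfile

def split_by_lines(path, lines_per_chunk):
    """Yield lists of lines, lines_per_chunk at a time (like SplitText)."""
    with open(path) as f:
        chunk = []
        for line in f:
            chunk.append(line)
            if len(chunk) == lines_per_chunk:
                yield chunk
                chunk = []
        if chunk:
            yield chunk  # final partial chunk

# Demo on a small synthetic file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines(f"row {i}\n" for i in range(10))
chunks = list(split_by_lines(tmp.name, 3))
os.unlink(tmp.name)
print([len(c) for c in chunks])  # -> [3, 3, 3, 1]
```

If the split itself is fast standalone, the 5 minutes is coming from repo IO or downstream back pressure, not the split logic.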
02-27-2018
04:49 AM
Hi, I managed to get to the bottom of it at https://stackoverflow.com/questions/48985818/apache-nifi-executestreamcommand-wrong-output — @daggett asked me to set Ignore STDIN to true, and this fixed my issue. However, this was not very self-explanatory, because I don't know how and why the output would be changed?!
02-26-2018
10:05 AM
I have a NiFi flow that runs some shell scripts using the ExecuteStreamCommand processor, and the output of the ExecuteStreamCommand is not correct. The shell command I run is:
if (( $(ps -ef | grep -v grep | grep kibana | wc -l) > 0 )); then echo "1"; else echo "0"; fi
If the service is up, it should print 1; if it is down, 0. Simple, but the output is wrong: no matter whether the service is up or down, the output is always 1. Here is a demo of the flow: https://youtu.be/4e00rzerjSQ
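For what it's worth, the same liveness check can be done without a shell pipeline at all. A Linux-only sketch that scans /proc instead of `ps | grep` (which also sidesteps the classic grep-matching-itself trap); the "kibana" name is just the example from the script above:

```python
import os

def service_up(name: str) -> bool:
    """Return True if any process command line contains `name` (Linux /proc scan)."""
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open(f"/proc/{pid}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited, or no permission
        if name in cmdline:
            return True
    return False

print("1" if service_up("kibana") else "0")
```

Run standalone this behaves as expected; inside ExecuteStreamCommand the STDIN handling is what distorts the shell version's output (see the follow-up above about Ignore STDIN).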
Labels:
- Apache NiFi
02-01-2018
09:50 PM
Are you able to ping S3 or access the S3 endpoint from your NiFi server? If your NiFi server is on a private subnet with no EIP attached, you need to create a VPC endpoint for your S3 service. If this is your case, follow this link: https://aws.amazon.com/blogs/aws/new-vpc-endpoint-for-amazon-s3/
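A quick way to test reachability from the NiFi box without ping (ICMP is often blocked anyway) is a plain TCP connect. The S3 hostname below is just an example endpoint:

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Try a plain TCP connect; True means the endpoint answered at that port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check a regional S3 endpoint over HTTPS from the NiFi server.
# print(can_reach("s3.us-east-1.amazonaws.com", 443))
```

If this returns False from the private subnet but True from a public one, the missing VPC endpoint (or route) is the likely culprit.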
11-23-2017
06:04 AM
Just my two cents on it: if you are experiencing a lot of queued flowfiles and you want to clear them without having to reload the config.yml, a very quick and dirty way is to clear all the repos on the MiNiFi edge. Here is how I do it, assuming your MiNiFi sits in /opt. You will lose all queued data by doing this, but you no longer have queues (not ideal).
-- before cleanup
/opt/minifi/bin/minifi.sh flowStatus connection:all:health,stats | tail -2 | head -1| jq '.'
/opt/minifi/bin/minifi.sh stop
rm -rf /opt/minifi/flowfile_repository/*
rm -rf /opt/minifi/content_repository/*
rm -rf /opt/minifi/provenance_repository/*
/opt/minifi/bin/minifi.sh start
-- after cleanup
/opt/minifi/bin/minifi.sh flowStatus connection:all:health,stats | tail -2 | head -1| jq '.'
Ideal solution: use the minifi.sh flowStatus command to push data into the NiFi server and monitor this type of thing. From there you can develop NiFi flows that act on what needs to be done on those edge MiNiFi instances. You can even build a dashboard from this data and actually see how the edges are doing. No doubt Horton has this on the to-do list.
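The monitoring idea can be sketched offline. The JSON below is an assumed shape loosely modeled on the flowStatus connection report, not the exact MiNiFi schema, and the connection names and threshold are made up:

```python
import json

# Assumed shape of a `minifi.sh flowStatus connection:all:health,stats` report.
report = json.loads("""
{
  "connectionStatusList": [
    {"name": "toNiFi", "queuedCount": 0, "queuedBytes": 0},
    {"name": "logsOut", "queuedCount": 18000, "queuedBytes": 91234567}
  ]
}
""")

# Flag connections whose backlog suggests the edge needs attention.
backlogged = [
    c["name"] for c in report["connectionStatusList"] if c["queuedCount"] > 10000
]
print(backlogged)  # -> ['logsOut']
```

A NiFi flow receiving these reports could route on the flagged names and trigger the cleanup (or an alert) automatically instead of SSHing into each edge.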
11-09-2017
12:39 AM
Yeah, I know, it's not very intuitive 🙂