Member since: 05-02-2016
Posts: 154
Kudos Received: 54
Solutions: 14
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4205 | 07-24-2018 06:34 PM |
 | 5802 | 09-28-2017 01:53 PM |
 | 1439 | 02-22-2017 05:18 PM |
 | 14405 | 01-13-2017 10:07 PM |
 | 3982 | 12-15-2016 06:00 AM |
02-06-2017
03:36 AM
1 Kudo
Try
df.withColumn("id", when($"deviceFlag"===1, concat($"device", lit("#"), $"domain")).otherwise($"device"));
02-03-2017
04:56 PM
3 Kudos
I was looking for a way to easily forward and analyze provenance data that is available in nifi. There were a couple of options available.
You could use the NiFi REST API to search for provenance data and then use the results for analysis or store them in a database. The other alternative was to set up a SiteToSiteProvenanceReportingTask in NiFi, which forwards provenance events to a flow in NiFi.

Option 1 is a very techy option: you could point your UI directly at the REST API and present a nice provenance visual with bulk replay capabilities. But then the developer is responsible for keeping up with changes in the NiFi REST API. It would be nice not to have that direct dependency. Also, you might want to lock down the REST API in production.

Option 2 is very easy, but it is limited in where I can send those provenance events.

The Apache NiFi engineering team resolved this situation with the ScriptedReportingTask. It gives you an easy way of setting up provenance reporting in NiFi and forwarding it to an endpoint of your choice. You also do not have a direct dependency between your application and the NiFi REST API. You can use ScriptedReportingTask to massage the events into a format that works with your application or endpoint. I chose Groovy as the language for my script, but there are options for Python, JavaScript and a few others.

Once you are logged in to NiFi:

1. Click the menu in the top right corner and select the Controller Settings option.
2. In the Controller Settings dialog, choose the Reporting Tasks tab.
3. Click the + in the top right corner to create a new reporting task.
4. In the Add Reporting Task dialog, search for ScriptedReportingTask.
5. Double-click the ScriptedReportingTask option in the results, or select the row and click Add. You will see a new ScriptedReportingTask in the reporting tasks list.
6. Click the pencil icon to edit the reporting task. You will see a reporting task window.
7. Select Groovy as the Script Engine and paste the script below into Script Body. Make sure to change the location of the file where your events will be written.

import groovy.json.*;
import org.apache.nifi.components.state.StateManager;
import org.apache.nifi.reporting.ReportingContext;
import org.apache.nifi.reporting.EventAccess;
import org.apache.nifi.provenance.ProvenanceEventRepository;
import org.apache.nifi.provenance.ProvenanceEventRecord;
import org.apache.nifi.provenance.ProvenanceEventType;
final StateManager stateManager = context.getStateManager();
final EventAccess access = context.getEventAccess();
final ProvenanceEventRepository provenance = access.getProvenanceRepository();
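// Fetch up to 100 provenance events, starting from event id 1.
log.info("starting event id: " + Long.toString(1));
final List<ProvenanceEventRecord> events = provenance.getEvents(1, 100);
log.info("number of events fetched: " + events.size());
// Change this path to wherever you want the events written.
def outFile = new File("/tmp/provenance.txt");
// Write each event as pretty-printed JSON; the file is overwritten on every run.
outFile.withWriter('UTF-8') { writer -> events.each { event -> writer.writeLine(new JsonBuilder(event).toPrettyString()) } }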
Click OK and Apply. Click the "Play" button to activate the reporting task. I had set the scheduling frequency for my task to 10 seconds so I could see the results right away. You can set it to a higher value as needed. You should see the events appear in /tmp/provenance.txt in JSON format. You could use other formats if needed, and you may want to skip the pretty-printing for better performance.

The ScriptedReportingTask provides the ReportingContext, which is available to your script as the context object. You can log information to the NiFi log using the ComponentLog log object, which is also passed to you by the reporting task. If you need any other variables to be set from the NiFi task, you can define them as dynamic properties.

My script is very simple: it looks at 100 provenance events, starting from the first provenance event. You can use the StateManager to keep track of the last provenance event that you received. Look at the implementation by @jfrazee to see how to collect provenance events incrementally: https://github.com/jfrazee/nifi-provenance-reporting-bundle

Thank you to @Matt Burgess for putting together this very useful reporting task component. Hope this is useful.
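To illustrate the idea of remembering the last event you received, here is a minimal Python sketch of the bookkeeping only. It does not use the NiFi API: `FakeRepository` and the plain `state` dict are hypothetical stand-ins for the provenance repository and the StateManager, and `lastEventId` is just a key name I picked. In a real script you would read and write that value through the state manager instead.

```python
class FakeRepository:
    """Hypothetical stand-in for a provenance repository; event ids start at 1."""

    def __init__(self, total_events):
        self._events = [{"eventId": i} for i in range(1, total_events + 1)]

    def get_events(self, first_id, max_records):
        # Return up to max_records events whose id is >= first_id.
        matching = [e for e in self._events if e["eventId"] >= first_id]
        return matching[:max_records]


def collect_batch(repo, state, batch_size=100):
    """Fetch the next batch and remember the last event id that was seen."""
    # The id is kept as a string because NiFi state values are strings.
    first_id = int(state.get("lastEventId", "0")) + 1
    events = repo.get_events(first_id, batch_size)
    if events:
        state["lastEventId"] = str(events[-1]["eventId"])
    return events


repo = FakeRepository(250)
state = {}
# Each call picks up where the previous one stopped.
sizes = [len(collect_batch(repo, state)) for _ in range(4)]
print(sizes, state)
```

Each run of the task would then forward only the events it has not seen before, instead of rewriting the same first 100 events every time.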
01-26-2017
05:18 PM
Yeah, agree with that.
01-26-2017
06:52 AM
That is a java.net.SocketTimeoutException. Is port 2181 open? Are you running NiFi and the Phoenix server on the same machine?
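A quick way to check whether the port is reachable from the NiFi machine is a plain TCP connect. This is only a sketch: `port_is_open` is a helper name I made up, and "localhost" is a placeholder for whatever ZooKeeper host your Phoenix connection string uses.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host; replace with the ZooKeeper host you are connecting to.
    print(port_is_open("localhost", 2181))
```

If this returns False from the NiFi host but True on the Phoenix host itself, a firewall or security group rule is the likely cause.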
01-26-2017
04:40 AM
3 Kudos
You can use the UpdateAttribute processor. For the grade attribute, use the EL ${grade:replaceNull("nograde")}.
01-19-2017
03:03 PM
How will you find out that all flowfiles from a source have been processed? Once you have that figured out, NiFi has several components, e.g. PutEmail, which can send an email to given recipients. You can also set up an SNMP agent and use SetSNMP to set a message, which the SNMP agent can forward to recipients.
01-14-2017
01:35 AM
Check the security group on AWS and make sure there is an inbound rule for port 8080.
01-13-2017
10:27 PM
Yeah, I would try a bigger instance. To test this quickly, go into conf/bootstrap.conf, reduce Xms and Xmx to something like 128m, and see what happens. It is possible that the OS is not able to allocate 512m to NiFi.
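As a rough sketch, the heap settings in the bootstrap file would look something like this after the change. I am assuming the stock layout where the heap flags are java.arg.2 and java.arg.3; the argument numbers may differ in your copy, so look for the lines containing -Xms and -Xmx.

```
# NiFi bootstrap.conf (assumed default layout)
# Initial and maximum JVM heap, reduced from 512m for the test
java.arg.2=-Xms128m
java.arg.3=-Xmx128m
```

Restart NiFi after editing the file so the new heap settings take effect.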
01-13-2017
10:07 PM
It looks like NiFi is trying to start and is getting killed for some reason, but I don't see any logs that indicate an error. Are you giving it enough time? A lot of packages get unpacked and set up when you run NiFi for the first time. Entropy gathering can also block sometimes, so try giving it some more time, around 15 minutes, and see if it comes up. I am just hoping you are not terminating it too early. If that is not the case, what kind of EC2 instance are you using: how much memory, how much disk space, and how many cores?
01-13-2017
08:37 PM
@Ranjit S That is strange. Can you show us your directory structure, with a screenshot of it? Also make sure you downloaded the full tar and that it was not corrupted. You should see something in nifi-bootstrap.log at least.