Member since: 09-29-2015
Posts: 31
Kudos Received: 34
Solutions: 18
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 787 | 08-07-2018 11:16 PM |
| | 2306 | 03-14-2018 02:56 PM |
| | 1379 | 06-15-2017 10:13 PM |
| | 7859 | 06-05-2017 01:40 PM |
| | 3552 | 05-17-2017 02:52 PM |
08-07-2018
11:16 PM
As a result of the HDF build versioning and how the NiFi extension manager handles versions, there is, unfortunately, one additional NAR that is needed. You should also provide the nifi-standard-services-api-nar-1.5.0.3.1.2.0-7.nar that coincides with nifi-aws-service-api-nar-1.5.0.3.1.2.0-7.nar. Without it, I would imagine you will see warnings in your log that the needed standard services API NAR could not be found. This additional requirement is a byproduct of the HDF-specific builds.
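For reference, a minimal sketch of placing both NARs, assuming a default HDF installation layout (the lib path is an assumption; adjust to your install):

```
# Copy both NARs into the NiFi lib directory (path assumes a default HDF layout)
cp nifi-aws-service-api-nar-1.5.0.3.1.2.0-7.nar \
   nifi-standard-services-api-nar-1.5.0.3.1.2.0-7.nar \
   /usr/hdf/current/nifi/lib/
# Restart NiFi afterward so the new NARs are loaded.
```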
03-14-2018
02:56 PM
Hi @Akananda Singhania, I suspect the network configuration on your Docker Engine host is incorrect. Running the image you listed works as anticipated in a few of the environments available to me. Let's try to confirm this suspicion by running the following:

```
docker run busybox ping -c 1 files.grouplens.org
```

You should receive output similar to the following. If not, the configured DNS server is not appropriately routing to external sites.

```
PING files.grouplens.org (128.101.34.235): 56 data bytes
64 bytes from 128.101.34.235: seq=0 ttl=37 time=39.263 ms

--- files.grouplens.org ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 39.263/39.263/39.263 ms
```

Could you provide more details about the environment in which you are running Docker? Of interest would be the output of:

```
cat /etc/resolv.conf
```

Another option is to try explicitly specifying a DNS server, such as those Google makes available, via a command like:

```
docker run --dns 8.8.8.8 -d -p 8080:8080 apache/nifi
```
06-16-2017
04:15 PM
Definitely thought I had the link on there before. Specifically, you can find it here: https://github.com/apache/nifi-minifi/blob/master/minifi-docs/src/main/markdown/System_Admin_Guide.md#flowstatus-script-query
06-15-2017
11:08 PM
Are you receiving errors on the bulletins for HandleHttpRequest or Response? Could you specify how the desired file is being mapped from your request to the Fetch File processor? Configuration of your processors or a template would be most helpful. Thanks!
06-15-2017
10:13 PM
The best option to inquire about the state of the flow would be to make use of the FlowStatus querying functionality, which is roughly analogous to the statistics available in the NiFi UI. Considerations and discussions to build upon this, especially in the context of C2, are definitely important ideas for operational ease and understanding how instances are behaving. Hopefully the FlowStatus querying is helpful in the interim until a more feature-rich mechanism is in place. Manually flushing a queue is not likely to be possible, so mechanisms like backpressure and expiration periods on connections become very important to help mitigate such issues.
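As an illustration, FlowStatus queries can be issued via the MiNiFi script; the processor and connection names below are placeholders for components in your own flow:

```
# Health, stats, and bulletins for a processor named "TailFile" (name is illustrative)
./bin/minifi.sh flowStatus processor:TailFile:health,stats,bulletins

# Queue health and stats for a connection named "TailToS2S" (name is illustrative)
./bin/minifi.sh flowStatus connection:TailToS2S:health,stats
```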
06-05-2017
01:40 PM
3 Kudos
You would create a dynamic attribute to select the key of interest. Consider the following sample document:

```
{
  "key-1": "value 1",
  "key-2": "value 2",
  "key-3": "value 3"
}
```
In this case, if we needed to route based on the value of key-2, we could create a property in the processor of $.key-2 and assign it to some name, such as "routing.value". With that bit of information extracted as an attribute, we can route the flowfiles coming from the EvaluateJsonPath processor. If the locations you mention are file locations, we could potentially use a simple PutFile, defining its Directory property with Expression Language that makes use of routing.value, with something like "/path/to/my/data/${routing.value}". A more powerful and flexible approach would be to send each flowfile to a RouteOnAttribute processor after EvaluateJsonPath. In this case, we could define routes for each of the cases and allow them to be sent on to other NiFi components. For instance, maybe some things go to disk and others go to JMS. We can create relationships for RouteOnAttribute that will then allow us to connect each type to its respective processing path.
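As a rough sketch, those RouteOnAttribute dynamic properties might look like the following, where each property name becomes a relationship and the matching conditions are assumptions based on the sample values above:

```
# RouteOnAttribute dynamic properties (relationship name : Expression Language condition)
to-disk : ${routing.value:equals('value 1')}
to-jms  : ${routing.value:equals('value 2')}
```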
05-17-2017
02:52 PM
@Anthony Murphy This should be fine if you have the appropriate volumes mapping to the specified directories. In this case, you should have three separate Docker volumes mapping your host-based shared location to the three directories in question. This allows the Docker daemon to write the data to the external mappings, freeing it from the container's lifecycle. You would then invoke a new container with these same mappings and it will pick up where things left off. If this is how you are attempting things, please comment with your run command and we can certainly debug why things might be coming up short; running a quick trial on my system, things look to be behaving as anticipated.
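As a sketch of such an invocation, assuming the directories in question are the FlowFile, content, and provenance repositories at the image's default paths (host paths and image name are placeholders):

```
docker run -d -p 8080:8080 \
  -v /host/nifi/flowfile_repository:/opt/nifi/flowfile_repository \
  -v /host/nifi/content_repository:/opt/nifi/content_repository \
  -v /host/nifi/provenance_repository:/opt/nifi/provenance_repository \
  my_nifi_image
```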
05-16-2017
01:44 PM
2 Kudos
There is currently no ability to render a process group at a higher level or in a different view than its nested state. However, with the release of 1.2.0, it is now possible to access a process group (or other component) directly via deep linking. From this standpoint, once I enter a process group, I am provided a URL such as http://localhost:8080/nifi/?processGroupId=11774962-015c-1000-6e20-6d87498f7427&componentIds= which allows me to enter directly into the group with the associated id. There has been some mention in the past within the community about having different workspaces (different groups could share an instance but have an independent canvas from their view), which I believe gets to the experience you are looking for, but there has not been any tangible design or implementation work toward that at this point in time.
03-02-2017
02:47 PM
Hi @omer alvi, You are getting an illegal character in the query, which I am assuming is the | (pipe) character. You may need to URL-encode your URL. Luckily, you can achieve this with NiFi Expression Language; of note is the urlEncode function, with docs available at https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html#urlencode.
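As a small illustration, assuming the offending value has been extracted to a hypothetical attribute named query.param:

```
# URL property using urlEncode to escape characters such as the pipe
http://example.com/search?q=${query.param:urlEncode()}
```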
03-01-2017
02:27 PM
2 Kudos
Take a look at the MergeContent processor: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.MergeContent/index.html.
02-24-2017
03:21 PM
2 Kudos
Templates are owned by a process group (whether that is the root process group or one nested in the canvas). You can upload a template to a particular process group by making use of the '/process-groups/{id}/templates/upload' endpoint.
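As a sketch with curl, assuming an unsecured instance at localhost:8080 and a template file my_template.xml (both illustrative); {id} is the UUID of the target process group:

```
curl -X POST \
  -F template=@my_template.xml \
  http://localhost:8080/nifi-api/process-groups/{id}/templates/upload
```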
01-23-2017
07:19 PM
1 Kudo
Yes, this dynamic will not work in the current scenario. There is some work under way that has been proposed to help in these scenarios; you can read about that here: https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows As mentioned before, the only way to accomplish what you are looking for is to perform a docker commit and use that image as the new point of reference any time you want to capture the current state of the instance, though this would include the totality of the running instance. There are also some alternative storage drivers, such as Flocker, that allow volumes to be shared; these are covered at the previously linked Docker documentation: https://docs.docker.com/engine/tutorials/dockervolumes/#/mount-a-shared-storage-volume-as-a-data-volume
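A quick sketch of that commit-based workflow (container and image names are placeholders):

```
# Snapshot the running container's current state as a new image
docker commit my-nifi-container my-nifi-snapshot:latest
# Later, start a new container from that snapshot
docker run -d -p 8080:8080 my-nifi-snapshot:latest
```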
01-23-2017
03:15 PM
Keep in mind that a Docker image is immutable; when any changes are made, this results in both a new layer and a new image. To save the templates in your current approach, you would need to save the Docker image via a docker commit. This would not be maintainable, however, nor would it be best practice. Instead, you would likely want to make use of Docker volumes (https://docs.docker.com/engine/tutorials/dockervolumes/) such that the data could be persisted on the host. In this case, you could use a folder on the host containing your seed templates that would be mapped into running instances of the image you created. Something to the effect of:

```
docker run -it -v /<host templates dir>/:/opt/nifi/conf/templates/ my_image
```

This will allow you to provide templates to your container and also capture, on the host system, any new templates introduced in the running instance. This set of cached templates on your host will also enable you to mount these templates into multiple instances colocated on that host.
12-21-2016
11:33 PM
3 Kudos
The following is under the assumption that you do not explicitly require using Site-to-Site to transmit the data. While there are facilities that could support this, it is not an extension point and currently does not provide control over how FlowFiles are delivered. The simplest and most naive approach would be to use RouteOnAttribute to route each FlowFile to a given relationship and then use that to feed your transmission processor of choice. In this case, a PostHTTP sending to a ListenHTTP would be one approach that allows transmission formatted as FlowFiles. Depending on the source system (such as if it were clustered), you might additionally need to use Expression Language to mark the destination system and use that to dynamically craft the resultant POST URL for the associated listener. This is fairly static and simple but would cover the use case you are anticipating.
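For instance, the PostHTTP URL might be crafted as follows, where destination.host is a hypothetical attribute set earlier in the flow and 8081 is an assumed ListenHTTP port:

```
# PostHTTP "URL" property (ListenHTTP's default Base Path is "contentListener")
http://${destination.host}:8081/contentListener
```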
12-12-2016
04:34 PM
1 Kudo
There is no such mechanism in place for ExecuteStreamCommand. ExecuteProcess does have the ability to batch, but based on your mention of the incoming event rate, it seems you need to handle incoming FlowFiles, which ExecuteProcess does not provide. Depending on the nature of your parsing process, it may be possible to convert it to use InvokeScriptedProcessor, which could tie said parser/process to the component lifecycle.
07-15-2016
02:09 PM
2 Kudos
Your best strategy for this case would be to leverage Expression Language [1] in the Remote URL field of your InvokeHTTP. You could then use a GetFile to take your set of URLs from a file on the filesystem, or even another InvokeHTTP with a scheduling strategy of CRON, to pull in that list as desired. This list could then be split into separate events using SplitText. From there, we can promote the contents of each of these splits to an attribute (perhaps target.url) using ExtractText. This would then be passed to your InvokeHTTP, which would make use of the target.url attribute by specifying ${target.url} in the "Remote URL" field mentioned above. [1] https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
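A rough sketch of the key properties for that flow; the paths, regex, and attribute name are assumptions for a source file containing one URL per line:

```
# GetFile      -> Input Directory: /data/url-lists        (illustrative path)
# SplitText    -> Line Split Count: 1
# ExtractText  -> dynamic property "target.url" : (.+)    (captures each line)
# InvokeHTTP   -> Remote URL: ${target.url}
```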
07-13-2016
09:15 PM
1 Kudo
With MergeContent, it is possible to specify a Max Bin Age that prevents a data starvation condition in which the latest data is held in limbo. Accordingly, you can make a best effort to get an appropriately sized file to place in HDFS, but not at the cost of data being held indefinitely.
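For example, a configuration along these lines (the values are illustrative, not recommendations):

```
# MergeContent properties
# Merge Strategy:     Bin-Packing Algorithm
# Minimum Group Size: 128 MB    (target an HDFS-friendly file size)
# Max Bin Age:        5 min     (flush an undersized bin rather than hold data)
```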
07-10-2016
06:19 PM
2 Kudos
Repo Description: Provides a Dockerfile and associated scripts for configuring an instance of Apache NiFi to run with certificate authentication.
Repo Info:
- Github Repo URL: https://github.com/apiri/dockerfile-apache-nifi
- Github account name: apiri
- Repo name: dockerfile-apache-nifi
06-23-2016
02:04 PM
1 Kudo
If you are referring to a batch or PowerShell script, this can be accomplished via ExecuteProcess [1] or ExecuteStreamCommand [2]. Both will allow you to execute processes on your host environment and collect their stdout output in a resultant FlowFile for further processing. ExecuteStreamCommand additionally allows you to take an incoming FlowFile and pipe its content across stdin to the associated process.
[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.6.0/org.apache.nifi.processors.standard.ExecuteProcess/
[2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
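As a sketch, an ExecuteProcess configuration for a PowerShell script might look like the following (the script path is a placeholder):

```
# ExecuteProcess properties
# Command:           powershell.exe
# Command Arguments: -ExecutionPolicy Bypass -File C:\scripts\my_script.ps1
```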
05-14-2016
03:40 AM
1 Kudo
NiFi cannot currently do this with the provided *FTP* implementations. However, the commons-net library used for FTP also provides an FTPSClient, so a custom implementation would be quite similar in structure, with the addition of the needed certificates (likely through an SSLContextService) and related properties.
02-21-2016
08:32 PM
1 Kudo
InvokeHTTP makes use of the "mime.type" attribute. Depending on the version you are using, this may or may not be exposed as a property "Content-Type", which defaults to the Expression Language "${mime.type}".
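As an illustration, setting the attribute upstream (e.g., via an UpdateAttribute processor) drives the header:

```
# UpdateAttribute dynamic property
#   mime.type : application/json
# InvokeHTTP's "Content-Type" property defaults to ${mime.type},
# so the request is sent with "Content-Type: application/json".
```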
01-27-2016
05:04 AM
3 Kudos
Clustering is bidirectional: there is a heartbeat mechanism, as you are seeing, across the protocol port you have configured, and the NCM replicates requests and communicates to the nodes via their REST API, which is configured across the same host and port that each node's UI is available on. This information is relayed from each node to the manager as part of the joining process. The primary properties beyond those for the clustering protocol are nifi.web.http.host and nifi.web.http.port, which would need to be accessible from the NCM.

With the request that results in the "... No nodes were able to process this request", there should additionally be a stacktrace on the NCM that outputs the address(es) it anticipates being available; verify connectivity to those sockets from your NCM. If nifi.web.http.host is not explicitly set, it will default to localhost, which will then be interpreted incorrectly by the manager when transmitted with the heartbeat. Beyond that, if this does not turn up any additional paths, sharing your NCM's and one of your nodes' web and clustering properties may help to debug a bit further.
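For reference, a minimal sketch of the node-side entries in nifi.properties (hostname and port are examples):

```
# nifi.properties on each node; the host must resolve and be reachable from the NCM
nifi.web.http.host=node1.example.com
nifi.web.http.port=8080
```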
01-13-2016
11:16 PM
@surender nath reddy kudumula There are several other processors that handle various types of data formats and protocols, with more always in development as we continue to grow the product and community. Some common items include UDP and Syslog listeners. I think there are certainly some additional items we can add to NiFi to facilitate a use case such as streaming HTTP, and I have actually started a ticket to provide some functionality to at least cover the general cases (NIFI-1389). Feel free to add any suggestions toward that functionality. Thanks!
01-13-2016
10:54 PM
1 Kudo
Surender: The default NiFi does not currently have any processors that map directly to streamed HTTP data sources. As a means of getting the data into a NiFi flow, you could also consider the https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteProcess/index.html processor. Configuring an instance of this processor with the properties:

```
Command:           curl
Command Arguments: -H "Auth: suri:dba37513923299cbb5bcbff766bacd3d" https://stream.datasift.com/fb409968ceacb8e588bb82de95c59958
Batch Duration:    1s
```

will provide a mechanism to bring this streamed content into the flow, batched into flowfiles at one-second intervals. This gives a nice proof of concept of how you could interact with the data source before diving into a custom processor. However, this is inexact and, as a result, some results may get truncated depending on time boundaries. A custom processor might be nice to handle the data format and be cognizant of event boundaries.
01-13-2016
12:14 AM
1 Kudo
Is this for the GetHTTP? If so, yes, EL would be the best path forward to create unique files via the Filename property. Alternatively, you can use an UpdateAttribute processor to update the filename attribute to a new name in the flow if there is additional context or knowledge of the file that helps in that process. Regarding the SSL issues, could you provide more information as to what is not working? Would like to ensure we get you on the right track here or address any bugs that may be lurking behind the scenes for that process. Thanks!
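For instance, a Filename property along these lines would generate unique names (the prefix and date format are illustrative):

```
# GetHTTP "Filename" property using Expression Language
data-${now():format('yyyyMMddHHmmssSSS')}.json
```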
01-12-2016
07:51 PM
6 Kudos
You will need to create and configure an SSLContextService for the processor to use so that it can establish trust with the certificate being presented by the DataSift service. curl works because it is tying into the default system truststore for you. To provide a similar experience to curl on the command line, you will need to configure the truststore properties for your SSL Context Service instance as follows:

- Truststore Filename: the cacerts file from your Java installation. If $JAVA_HOME is set on your system, it should help point you in the right direction. If not, the location of cacerts varies by environment, but is approximately the following for each OS:
  - OS X: /Library/Java/JavaVirtualMachines/jdk<version>.jdk/Contents/Home/jre/lib/security/cacerts
  - Windows: C:\Program Files\Java\jdk<version>\jre\lib\security\cacerts
  - Linux: /usr/lib/jvm/java-<version>/jre/lib/security/cacerts (you can additionally use $(readlink -f $(which java)) to locate the installation)
- Truststore Type: JKS
- Truststore Password: the default password of "changeit" if you are using the default Java truststore

When this controller service is created and enabled, the associated GetHTTP will need to be updated to reference it.
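To sanity-check the truststore path and password before wiring them into the service, something like the following should list the bundled CA entries (the path shown is an example):

```
keytool -list -keystore /usr/lib/jvm/java-8-openjdk/jre/lib/security/cacerts -storepass changeit | head
```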
10-13-2015
06:00 PM
They are not currently. We had evaluated this before, but generally available nightly builds and their ilk are not allowed according to the guidelines [1]. Providing these in a strictly developer capacity, with "limited" dissemination via the developer mailing list, seems to be okay, but is not something we had pursued further at the time with Apache infrastructure. [1] http://www.apache.org/dev/release.html
10-12-2015
06:53 PM
1 Kudo
I've worked with NiFi in Azure a bit and there is nothing overly foreign in this environment in comparison to others. Obviously, a Docker image will make this even easier, but even out of the box, it is quite straightforward. With one particular environment, there were a few items to iron out with networking when setting up clustering and site-to-site, but I feel this was likely due to their unique configuration and resources.