Member since: 11-16-2015
Posts: 892
Kudos Received: 649
Solutions: 245

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
| | 5217 | 02-22-2024 12:38 PM |
| | 1337 | 02-02-2023 07:07 AM |
| | 3004 | 12-07-2021 09:19 AM |
| | 4155 | 03-20-2020 12:34 PM |
| | 13952 | 01-27-2020 07:57 AM |
09-01-2016
06:23 PM
1 Kudo
If you click on any of the examples, it should fill in the Input and Spec boxes.
09-01-2016
06:21 PM
1 Kudo
After your FetchSFTP, the bar-delimited content will be in the content of the flow file, not the attributes. Your flow then has an AttributesToJSON processor, which will overwrite the flow file content with a JSON document containing attributes such as sftp.remote.host, sftp.remote.port, etc. (see the documentation for AttributesToJSON). I think you may want a SplitText processor after your FetchSFTP processor, to create one flow file for each line in your file. Then you could have an ExtractText processor that uses a regex (with grouping) to create attributes such as column.1, column.2, etc. Then your ReplaceText can use those attributes.
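For illustration, here's a minimal Python sketch of the kind of grouping regex you could give ExtractText (in a dynamic property named, say, "column"); the sample line and values are made up:

```python
import re

# Illustrative bar-delimited line, as it might look after SplitText
line = "alice|42|engineering"

# A grouping regex like one you could give ExtractText in a dynamic
# property named "column"; each capture group becomes an attribute
# (column.1, column.2, column.3)
pattern = re.compile(r"^([^|]*)\|([^|]*)\|([^|]*)$")

match = pattern.match(line)
if match:
    attributes = {"column.%d" % i: value
                  for i, value in enumerate(match.groups(), start=1)}
    print(attributes)
    # {'column.1': 'alice', 'column.2': '42', 'column.3': 'engineering'}
```

Those attributes are then available to ReplaceText via the Expression Language, e.g. ${column.1}.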
09-01-2016
06:15 PM
1 Kudo
That class is part of the write-ahead log implementation in nifi-commons. I think there was a recent update to its structure. Now that Apache NiFi 1.0.0 has been released, you should use that instead of 1.0.0-BETA (which, I see from your logs, is the version you're on).
09-01-2016
02:06 PM
4 Kudos
The output from one processor to another is a flow file, which consists of a map of attributes (key/value pairs) and a payload of bytes as the flow file content. The content could be raw binary data (an image, for example) or text in any format (JSON, XML, CSV, to name a few).

The content and attributes of a flow file are manipulated by the processors in different ways; the documentation for each processor describes which attributes it reads and/or writes, as well as what operations it may perform on the content. For example, the UpdateAttribute processor allows you to add attributes to (or delete them from) incoming flow files. Another example is SplitJson, which expects incoming flow files to have a JSON object as the flow file content; you configure the processor with a JsonPath expression pointing at an array within the object. The processor then splits the original JSON object into individual JSON objects and sends a flow file for each element of the array to the "split" relationship. It also sends the original incoming flow file to the "original" relationship.

You can add connections between processors for the relationship(s) defined by the source processor. I encourage you to read the Overview and Getting Started guides for more information on the concepts of NiFi (flow files, processors, connections, etc.). If you are looking for working examples, there is a set of templates available on the NiFi Wiki.
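To make the SplitJson example concrete, here's a minimal Python sketch of what the processor conceptually does; the input document and the array name are made up for illustration:

```python
import json

# Illustrative incoming flow file content: a JSON object with an array
original = json.dumps({"items": [{"id": 1}, {"id": 2}, {"id": 3}]})

# SplitJson configured with a JsonPath of $.items conceptually
# evaluates the path against the content...
array = json.loads(original)["items"]

# ...and emits one flow file per array element to the "split"
# relationship (the unmodified input goes to "original")
splits = [json.dumps(element) for element in array]
print(splits)  # ['{"id": 1}', '{"id": 2}', '{"id": 3}']
```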
08-31-2016
12:42 PM
The FetchElasticsearch processor uses the native transport, whose default port is 9300. In NiFi 0.7.0 (and in the upcoming HDF 2.0), there are FetchElasticsearchHttp and PutElasticsearchHttp processors, which use the REST API (whose default port is 9200).
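If you want to confirm which endpoint each processor expects, a quick check against the REST port illustrates the difference; the hostname here is just a placeholder:

```python
import urllib.request

# The REST API (used by FetchElasticsearchHttp / PutElasticsearchHttp)
# answers plain HTTP on port 9200 by default; "localhost" is a placeholder
with urllib.request.urlopen("http://localhost:9200") as response:
    print(response.read().decode())  # cluster info as JSON

# The native transport (used by FetchElasticsearch) listens on 9300 and
# speaks a binary protocol, so a plain HTTP request to that port will fail.
```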
08-27-2016
02:40 PM
1 Kudo
Although the script engine reports its name as "python", it is actually Jython, which can only use pure Python modules, not native (CPython) modules like numpy/scipy. If you need those, consider ExecuteProcess or (if you have incoming flow files) ExecuteStreamCommand, both of which can invoke the command-line python.
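As a sketch of the ExecuteStreamCommand route: the flow file content arrives on the script's stdin, and whatever it writes to stdout becomes the new content. The input format and numpy usage below are made-up examples (numpy standing in for any native module Jython can't load):

```python
#!/usr/bin/env python
# Example script for ExecuteStreamCommand: reads the flow file content
# from stdin, uses numpy (a native module unavailable to Jython), and
# writes the result to stdout as the new flow file content.
import sys

import numpy as np

# Assume (for illustration) the incoming content is one number per line
values = np.array([float(line) for line in sys.stdin if line.strip()])
sys.stdout.write("mean=%f\n" % values.mean())
```

In ExecuteStreamCommand, point the Command Path at your CPython executable and pass the script's path in the Command Arguments.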
08-25-2016
06:44 PM
2 Kudos
To add to Scott's answer, you can use QueryDatabaseTable (for an effectively one-time export: if you choose a "maximum-value column" such as the primary key, subsequent runs won't re-fetch existing rows), feeding into a ConvertAvroToORC processor (available in the 1.0 GA release), then a PutHDFS processor to get the data into Hadoop. If the table has not been created, ConvertAvroToORC generates partial Hive DDL in an attribute (CREATE TABLE IF NOT EXISTS...), so after PutHDFS you could have a ReplaceText processor that puts the DDL statement (along with the file's HDFS location) into the flow file content, then send that to a PutHiveQL processor, which executes the DDL statement, creating the table atop the directory containing your file(s) in HDFS. That might sound a bit complicated, but it is flexible and powerful. I will post a template to the NiFi wiki after 1.0 is released, showing how such a flow would work.
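To sketch the ReplaceText step, here is roughly the statement it would assemble; the column list and HDFS path below are made up for illustration:

```python
# Illustrative reconstruction of what ReplaceText assembles: the
# partial DDL generated by ConvertAvroToORC plus the HDFS directory
# PutHDFS wrote the file to. The names below are made up.
partial_ddl = ("CREATE TABLE IF NOT EXISTS my_table "
               "(id INT, name STRING) STORED AS ORC")
hdfs_directory = "/apps/hive/warehouse/my_table"

# The full statement PutHiveQL would execute, creating the table atop
# the directory containing the ORC file(s)
statement = "%s LOCATION '%s'" % (partial_ddl, hdfs_directory)
print(statement)
```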
08-24-2016
05:32 PM
1 Kudo
The JSON object has to be "flat", meaning all your column fields have to be at the top level of the JSON object. Yours are nested inside a "Table" field. You can use EvaluateJsonPath with a path of $.Table to get the flat JSON object.
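As a small illustration (field names made up), here is conceptually what EvaluateJsonPath does with a path of $.Table and Destination set to flowfile-content:

```python
import json

# Illustrative nested document: the column fields live under "Table"
nested = json.dumps({"Table": {"id": 7, "name": "widget"}})

# EvaluateJsonPath with a path of $.Table pulls the nested object up,
# so it becomes the entire (now flat) flow file content
flat = json.dumps(json.loads(nested)["Table"])
print(flat)  # {"id": 7, "name": "widget"}
```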
08-19-2016
12:37 PM
1 Kudo
Are you running NiFi on Windows? If so, you'll need winutils.exe, with the HADOOP_HOME environment variable pointing at its install location, as described here: http://stackoverflow.com/questions/33048363/issue-with-hivetopology-from-storm-hive
08-18-2016
11:50 PM
1 Kudo
You probably have null values in your table and are encountering this bug: https://issues.apache.org/jira/browse/NIFI-2602. It will be fixed in NiFi 1.0 / HDF 2.0.