Member since 04-29-2016
192 Posts
20 Kudos Received
2 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1693 | 07-14-2017 05:01 PM |
| | 2890 | 06-28-2017 05:20 PM |
10-26-2016
06:26 PM
1 Kudo
You could use the UpdateAttribute processor; it has a "Delete Attributes Expression" property where you can specify the old attribute that you want to delete. Would that work?
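As a rough illustration of what that property does: the UpdateAttribute documentation describes "Delete Attributes Expression" as a regular expression applied to attribute names. The Python sketch below only mimics that behavior on a plain dictionary. The function name `delete_attributes`, the sample attribute names, and the full-match semantics are assumptions for illustration, not NiFi code.

```python
import re

def delete_attributes(attributes, delete_expression):
    """Return a copy of `attributes` without the names that match the regex.

    Assumption: the whole attribute name must match (full match).
    """
    pattern = re.compile(delete_expression)
    return {name: value for name, value in attributes.items()
            if not pattern.fullmatch(name)}

# Hypothetical flowfile attributes: an old name and its renamed copy.
attrs = {"old.name": "42", "new_name": "42", "filename": "data.txt"}
cleaned = delete_attributes(attrs, r"old\..*")
print(cleaned)
```

Because the expression is a pattern rather than a single name, one property value can remove a whole family of old attributes.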
10-20-2016
03:12 AM
@jfrazee Thank you; I'm going the SplitText route for now, and it seems to work. To save the split files for later reference, how do I assign different names to the child (split) flowfiles? I'm thinking of prepending or appending a UUID to the file name. When I looked, all of the child flowfiles get the same name as the parent flowfile, which causes them to overwrite one another.
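One way to think about the naming, sketched in Python rather than as NiFi configuration: build each child name from the parent name plus something unique, such as a per-split index or a UUID. SplitText documents a `fragment.index` attribute on each split; the helper `child_filename` and its naming scheme are hypothetical.

```python
import uuid
from pathlib import PurePath

def child_filename(parent_name, fragment_index=None):
    """Build a distinct name for one split of `parent_name`.

    Uses the split's index when given, otherwise a random UUID.
    """
    path = PurePath(parent_name)
    suffix = str(fragment_index) if fragment_index is not None else str(uuid.uuid4())
    # Keep the extension last so the file type stays recognizable.
    return f"{path.stem}-{suffix}{path.suffix}"

names = [child_filename("messages.txt", i) for i in range(1, 4)]
print(names)
print(child_filename("messages.txt"))
```

In NiFi itself, an UpdateAttribute processor placed after SplitText that sets `filename` to an expression such as `${filename}-${fragment.index}` should have a similar effect; treat that expression as a suggestion to verify against your NiFi version.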
10-19-2016
08:19 PM
Hi All,
I have the following requirement:
Split a single NiFi flowfile into multiple flowfiles, so that the contents of each one (once extracted from the flowfile) can eventually be inserted as a separate row in a Hive table.
Sample input flowfile:
MESSAGE_HEADER | A | B | C
LINE|1 | ABCD | 1234
LINE|2 | DEFG | 5678
LINE|3 | HIJK | 9012
.
.
.
Desired output files:
Flowfile 1:
MESSAGE_HEADER | A | B | C
LINE|1 | ABCD | 1234
Flowfile 2:
MESSAGE_HEADER | A | B | C
LINE|2 | DEFG | 5678
Flowfile 3:
MESSAGE_HEADER | A | B | C
LINE|3 | HIJK | 9012
.
.
.
The number of lines in the flowfile is not known ahead of time.
I would like to know the best way to accomplish this with the available NiFi processors.
The splitting can be done at the flowfile level, or after the contents are extracted from the flowfile but before the Hive insert statements are created.
Thanks.
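The desired output above can be sketched in a few lines of Python: keep the first line as the header and pair it with every following line. This only illustrates the splitting logic; `split_with_header` is a hypothetical helper, and inside NiFi the SplitText properties "Line Split Count" and "Header Line Count" look like the closest built-in equivalent.

```python
def split_with_header(content, header_lines=1):
    """Split text into one chunk per data line, repeating the header in each."""
    lines = content.splitlines()
    header, body = lines[:header_lines], lines[header_lines:]
    # Skip blank lines so a trailing newline does not create an empty chunk.
    return ["\n".join(header + [line]) for line in body if line.strip()]

sample = (
    "MESSAGE_HEADER | A | B | C\n"
    "LINE|1 | ABCD | 1234\n"
    "LINE|2 | DEFG | 5678\n"
    "LINE|3 | HIJK | 9012\n"
)
for number, chunk in enumerate(split_with_header(sample), start=1):
    print(f"Flowfile {number}:")
    print(chunk)
```

Because the function iterates over whatever lines are present, the number of data lines does not need to be known ahead of time.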
Labels: Apache NiFi
05-09-2016
03:46 AM
Thanks guys for confirming what I was thinking (you can only listen to a port on your host machine). @Abdelkrim Hadjidj The source data is streaming patient data in HL7 message format. Thanks for the 2 options.
05-09-2016
03:41 AM
Thanks very much for the detailed answer, ideas, screenshots, and sample code; that's very helpful.
05-08-2016
04:12 AM
1 Kudo
Hi All, Maybe this is a networking question and not specific to NiFi. I have a locally installed NiFi instance on my laptop; we don't have NiFi installed on our Hadoop cluster yet. On a remote server, a TCP/IP port is set up to receive real-time streaming data. Using NiFi's ListenTCP processor on my local machine, is it possible to listen to the port on the remote server and receive the real-time streaming data into my local NiFi instance? Thanks.
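To make the networking distinction concrete: listening means binding to a port on the machine where the program runs, while reading from a port that another server has opened means connecting to it as a client. The Python sketch below assumes the remote side accepts connections and pushes newline-delimited data. A loopback server on an OS-assigned port stands in for the remote host, and the names `fake_remote_feed` and `read_lines` are hypothetical.

```python
import socket
import threading

def fake_remote_feed(server_sock, messages):
    """Stand-in for the remote server: accept one client and send messages."""
    conn, _ = server_sock.accept()
    with conn:
        for message in messages:
            conn.sendall(message.encode("utf-8") + b"\n")

def read_lines(host, port):
    """Connect to host:port as a client and return lines until the peer closes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with sock.makefile("r", encoding="utf-8") as stream:
            return [line.rstrip("\n") for line in stream]

# Bind and listen locally; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

feeder = threading.Thread(target=fake_remote_feed, args=(server, ["msg-1", "msg-2"]))
feeder.start()
received = read_lines("127.0.0.1", port)
feeder.join()
server.close()
print(received)
```

Note that `bind` only ever takes a local address, which is why a listener cannot be pointed at a port on someone else's machine.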
Labels: Apache NiFi
05-08-2016
03:53 AM
1 Kudo
Hi All, Here is some background on what I want to do: I'm reading HL7 message files and want to create an HBase table with that data. The ExtractHL7Attributes processor parses HL7 messages and creates attributes in this fashion: <Segment Name> <dot> <Field Index>. If the segment is repeating, the naming is <Segment Name> <underscore> <Segment Index> <dot> <Field Index>. Examples of the attributes it creates are "MSH.12" with a value of "2.1" and "OBX_11.3" with a value of "93000^CPT4". I want to write these attributes and their values to an HBase table, where the attribute names (MSH.12) become column names and the values become the cells. This works fine, but column names such as MSH.12 and OBX_11.3 are not very meaningful for an HBase table. So, using an UpdateAttribute processor, I'm able to rename the columns as follows:
OBX_1.1 --> OBX_1_Set ID
OBX_1.2 --> OBX_1_ValueType
.
.
.
OBX_8.1 --> OBX_8_Set ID
OBX_8.2 --> OBX_8_ValueType
But for this to work, you need to know at design time how many attributes (or OBX segments) will be in the flowfile. Since the OBX segment can repeat many times in a message, and we can't anticipate the maximum number of OBX segments a message could have at design time, I am unable to rename all attributes up front. Any suggestions on how to make this work? Is creating a custom processor the route to take? Thanks in advance.
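One way to avoid enumerating every OBX segment at design time is to derive the new name from the old one with a pattern, so the segment index is carried over whatever its value. The Python sketch below shows only that renaming logic. The field-name table contains just the two fields mentioned above, the sample values are made up, and `rename_obx_attributes` is a hypothetical helper, not part of NiFi.

```python
import re

# Field index -> readable name; only the two fields named in the post.
OBX_FIELD_NAMES = {"1": "Set ID", "2": "ValueType"}

OBX_PATTERN = re.compile(r"^OBX_(\d+)\.(\d+)$")

def rename_obx_attributes(attributes):
    """Rename OBX_<segment>.<field> keys to OBX_<segment>_<FieldName>.

    Keys that do not match, or whose field has no known name, are kept as-is.
    """
    renamed = {}
    for name, value in attributes.items():
        match = OBX_PATTERN.match(name)
        if match and match.group(2) in OBX_FIELD_NAMES:
            segment, field = match.groups()
            name = f"OBX_{segment}_{OBX_FIELD_NAMES[field]}"
        renamed[name] = value
    return renamed

# Hypothetical attributes; the segment index 57 was never listed anywhere.
attrs = {"OBX_1.1": "1", "OBX_11.3": "93000^CPT4", "OBX_57.2": "ST", "filename": "msg.hl7"}
print(rename_obx_attributes(attrs))
```

The same logic could live in a scripted processor such as ExecuteScript or in a custom processor; which fits better depends on the flow.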
Labels: Apache Hive, Apache NiFi
05-03-2016
03:39 PM
I am reading flat files from a Unix directory. There are about 8 million files, and they are small (1 KB to 2 KB), but that may have been the problem. Thanks for the suggestion; I will look into back pressure and scheduling. Thanks, Pierre.
05-02-2016
12:46 AM
When I experienced this issue, all NiFi processors were stopped; the only thing left was one connection that was queued up. Trying to empty the queue did not work (a progress bar would show up, but it stayed at 0% and never progressed). I checked the nifi-app.log file, and "java.lang.OutOfMemoryError: GC overhead limit exceeded" appears in the log many times. Since I'm running NiFi on my local laptop, I unzipped/installed NiFi again in another folder and exported/imported the dataflow as a template from the old folder to the new one; that seems to have resolved the issue I was facing.