Member since: 04-29-2021
Posts: 35
Kudos Received: 0
Solutions: 0
09-02-2021
09:18 AM
@Justee The first thing I would do (this step is optional) is add a new attribute to my FlowFile that specifies the year I'd be searching for in the lines of that FlowFile's content. For example, add an attribute "year" with a value of "1995".

In the RouteText processor, I'd then be able to use NiFi Expression Language (NEL) in my Java regular expression, as supported by this processor component:

^\|(.*?)\|(.*?)\|${year}\|(.*?)$

The above Java regular expression matches lines that begin with a pipe "|" followed by a non-greedy wildcard match of zero or more characters up to the next pipe "|", then the same again for field two. For field three I used NEL, which resolves to "1995", and finally I match the remainder of the line via wildcard. Of course, you could simply put "1995" in place of "${year}" in the above regex.

The RouteText processor component configuration would look like this:

The result would be two FlowFiles. One FlowFile would be routed to the relationship "1995" (based on the property name used) and would contain only the lines with "1995". The second FlowFile would route to the "unmatched" relationship and would contain all the non-matching lines (you may choose to just auto-terminate this relationship if you don't care about these lines).

If you found these responses addressed your query, please take a moment to log in and click on "Accept as Solution" below each response that helped you.

Thank you,
Matt
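To experiment with the pattern outside NiFi, here is a minimal Python sketch of the same routing idea. It is not how RouteText is implemented: the function name `route_lines`, the sample lines, and the plain string substitution standing in for Expression Language are all assumptions made for illustration.

```python
import re

def route_lines(content: str, year: str):
    """Split content into (matched, unmatched) lines using the regex from the post."""
    # Stand-in for NiFi resolving ${year}; re.escape guards against regex metacharacters.
    pattern = re.compile(r"^\|(.*?)\|(.*?)\|" + re.escape(year) + r"\|(.*?)$")
    matched, unmatched = [], []
    for line in content.splitlines():
        # A line goes to exactly one bucket, like the "1995" and "unmatched" relationships.
        (matched if pattern.match(line) else unmatched).append(line)
    return matched, unmatched

# Hypothetical sample content: three pipe-delimited lines.
sample = "|a|b|1995|x|\n|c|d|1996|y|\n|e|f|1995|z|"
matched, unmatched = route_lines(sample, "1995")
```

One thing this makes easy to check: because `.*?` can stretch across pipes when that is the only way to match, a line such as `|a|b|c|1995|d` (year in the fourth field) also matches. If the year must sit strictly in field three, replacing each `(.*?)` before the year with `([^|]*)` would be one way to tighten it.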
08-19-2021
11:46 AM
Hi @Justee You can check out the link that was offered in response to a similar question that was asked and answered on the Cloudera Community earlier this year here: Re: ASN.1 BER Parser
07-16-2021
07:12 AM
Hi @stevenmatison, I don't see any data in the Cassandra database, and I don't see any errors either. Here are the screenshots.
06-21-2021
11:03 PM
Hi @ChethanYM, I just use Hadoop. I followed this tutorial, but I only did the Hadoop installation steps: https://www.youtube.com/watch?v=71EQblrUPRM&t=1375s
06-17-2021
02:02 PM
1 Kudo
Hi @Justee It would be helpful to community members inclined to answer your question if you included which version of the HDP Sandbox you're using, and which OS and virtualization system (e.g., Docker, VirtualBox or VMware) you installed it on.
06-14-2021
10:31 AM
@Justee The ExecuteProcess [1] processor by design does not allow an inbound connection. It is designed to create an output FlowFile and does not accept a FlowFile as input. You can connect GetFile to the ExecuteStreamCommand [2] processor instead.

[1] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.13.2/org.apache.nifi.processors.standard.ExecuteProcess/index.html
[2] https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.13.2/org.apache.nifi.processors.standard.ExecuteStreamCommand/index.html
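As a rough illustration of the kind of command ExecuteStreamCommand can run: the processor passes the incoming FlowFile's content to the command's standard input and takes the command's standard output as the resulting content. The script below is a hypothetical sketch only; `transform`, its upper-casing logic, and the in-memory dry run are placeholders, not anything from this thread.

```python
import io
import sys

def transform(text: str) -> str:
    """Placeholder transformation (hypothetical): upper-case every line."""
    return "".join(line.upper() for line in text.splitlines(keepends=True))

def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # Read the whole input stream, transform it, and write the result out.
    stdout.write(transform(stdin.read()))

# Local dry run with in-memory streams instead of real stdin/stdout.
fake_in, fake_out = io.StringIO("a\nb\n"), io.StringIO()
main(fake_in, fake_out)
result = fake_out.getvalue()
```

When saved as a script for the processor to call, the dry-run lines would be replaced by a plain `main()` call so that the real standard streams are used.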
06-10-2021
11:55 PM
If you have multiple data sources, you can create a process group for each and have them aggregate in a central processor or group. In the picture below we have 6 different data sources (APIs, FTP servers, etc.), each in its own process group. Within their respective groups they reformat the data so that it is uniform and send it on its way. I had to blank out some text, but this should give you the idea.
06-10-2021
05:05 AM
Try this. Open your cmd and change directory with:

cd C:\kafka_2.11-2.0.0

If you change directory, run:

C:\kafka_2.11-2.0.0> bin\zookeeper-server-start.bat config\zookeeper.properties
C:\kafka_2.11-2.0.0> bin\kafka-server-start.bat config\server.properties

If you don't change directory, run:

> C:\kafka_2.11-2.0.0\bin\zookeeper-server-start.bat C:\kafka_2.11-2.0.0\config\zookeeper.properties
> C:\kafka_2.11-2.0.0\bin\kafka-server-start.bat C:\kafka_2.11-2.0.0\config\server.properties
06-01-2021
02:29 PM
1 Kudo
CountText will count lines (\r\n). QueryRecord will count the number of records, even if there are two records on one line.
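The difference is easy to see with a small Python sketch. This does not use NiFi at all: it assumes JSON as the record format and uses made-up sample data, purely to show how a line count and a record count can disagree.

```python
import json

def count_lines(text: str) -> int:
    """Count lines, which is the kind of number a line counter reports."""
    return len(text.splitlines())

def count_json_records(text: str) -> int:
    """Count concatenated JSON objects regardless of where the line breaks fall."""
    decoder = json.JSONDecoder()
    pos, count = 0, 0
    while pos < len(text):
        if text[pos].isspace():
            # raw_decode does not skip leading whitespace, so step over it here.
            pos += 1
            continue
        _, pos = decoder.raw_decode(text, pos)  # returns (object, end index)
        count += 1
    return count

# Hypothetical content: two records share the first line, one sits on the second.
sample = '{"id": 1} {"id": 2}\n{"id": 3}\n'
lines = count_lines(sample)
records = count_json_records(sample)
```

Here `lines` is 2 while `records` is 3, which is the situation described above.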