NiFi to HDFS filename leaving off minutes

Rising Star

We have a recurring issue we haven't been able to solve.

Take a look at our flow. We're taking log files and extracting date and time attributes to use in the filename when writing them to HDFS.

It mostly works, but we intermittently get filenames that are missing the minutes field.

2017_05_16_13_55_fozziesplunkr.log

2017_05_16_13_56_fozziesplunkr.log

2017_05_16_13_57_fozziesplunkr.log

2017_05_16_13__fozziesplunkr.log <---this is the file. It contains entries from minutes 55, 56 and 57. It feels like a "catch-all" file.
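One possible explanation, not confirmed by anything in this thread: MergeContent's default Attribute Strategy, "Keep Only Common Attributes", drops any attribute whose value differs across the FlowFiles in a bin, and a missing attribute renders as an empty string in Expression Language. The Python sketch below simulates that behavior outside NiFi. The attribute names (`year`, `month`, `day`, `hour`, `minute`) and the helper functions are hypothetical, since the actual names are only visible in the screenshots.

```python
def merge_common_attributes(flowfiles):
    """Keep only attributes whose value is identical on every input."""
    first, *rest = flowfiles
    return {k: v for k, v in first.items()
            if all(other.get(k) == v for other in rest)}

def build_filename(attrs):
    # A missing attribute is rendered as an empty string.
    parts = [attrs.get(k, "") for k in ("year", "month", "day", "hour", "minute")]
    return "_".join(parts) + "_fozziesplunkr.log"

def flowfile(minute):
    # Hypothetical attribute names; only the minute varies here.
    return {"year": "2017", "month": "05", "day": "16", "hour": "13", "minute": minute}

same_minute = merge_common_attributes([flowfile("55"), flowfile("55")])
mixed_minutes = merge_common_attributes([flowfile("55"), flowfile("56"), flowfile("57")])

print(build_filename(same_minute))    # minute survives the merge
print(build_filename(mixed_minutes))  # minute was dropped, leaving "__"
```

If this is what is happening, a bin that happens to hold entries from several minutes would produce exactly the "catch-all" filename above.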

Our setup: 6 hosts sending to 3 topics, 2 hosts per topic. It generates about 10k messages per second, out of 1 million events in total. This is just our test data.

On our sending side, we are doing TailFile -> ControlRate (1 MB) -> PublishKafka (this seems to work well).

For our receiving side, the screenshots below show every processor and its tabs for one of our topics.

We used ConsumeKafka -> ExtractText -> UpdateAttribute (Regex for timestamp from log) -> MergeContent -> UpdateAttribute (Create filename) -> PutHDFS
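As a rough illustration of one way to keep each merged file to a single minute, the sketch below bins records by their full timestamp key before merging. In NiFi this would roughly correspond to building one key attribute before MergeContent and naming it in the Correlation Attribute Name property, but that mapping is an assumption about this flow, and the attribute names and helpers here are hypothetical.

```python
from collections import defaultdict

KEY_FIELDS = ("year", "month", "day", "hour", "minute")  # hypothetical attribute names

def correlation_key(attrs):
    """Join the timestamp attributes into one value, e.g. '2017_05_16_13_55'."""
    return "_".join(attrs[name] for name in KEY_FIELDS)

def merge_by_key(flowfiles, suffix="fozziesplunkr.log"):
    """Bin (attributes, content) pairs by key and merge each bin separately."""
    bins = defaultdict(list)
    for attrs, content in flowfiles:
        bins[correlation_key(attrs)].append(content)
    return {f"{key}_{suffix}": "\n".join(lines) for key, lines in bins.items()}

def event(minute, text):
    attrs = {"year": "2017", "month": "05", "day": "16", "hour": "13", "minute": minute}
    return attrs, text

merged = merge_by_key([event("55", "a"), event("56", "b"), event("55", "c")])
for name in sorted(merged):
    print(name)
```

Because every record in a bin shares the same key, the minute is never ambiguous when the filename is built.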

All of these have screenshots, as shown below. If anyone has had this problem and has any idea on a solution, that would be welcome. We've tried all kinds of performance tweaks without success. The NiFi logs show no warnings or errors.

Thanks.

Flow Overview and Odd missing minute in Filenames

15443-overviewfiles.jpg

ConsumeKafka Processor and Update Attribute (Create Filename) Processor

15444-consumekafkacreatefilename.jpg

ExtractText (Extract from Syslog - Regex) Processor and Update Attribute (assign to attributes) processor

15445-extracttext.jpg

PutHDFS & Merge Content

15446-puthdfsmergecontent.jpg

3 REPLIES

Rising Star

Sorry, I was having problems uploading. It kept timing out, so I guess all three ended up posting. How do I close it? The delete option is not available (and I think it's because this question has a comment on it, this comment here...).

15537-screen-shot-2017-05-18-at-91405-am.png

Rising Star

I reported it to get it removed.