About elloyd

elloyd · ‎10-13-2017

Thanks, that was a great answer.

elloyd · ‎10-12-2017

Hello, we are seeing some behavior that seems to indicate something, but I want to verify it with someone who knows how the TailFile processor works when tailing multiple files. Here's our setup.

We have a two-node NiFi cluster. We are tailing a specific log, call it foo.log, that exists in several version folders under a versions directory. To illustrate, we are tailing these files:

/var/foobar/versions/123.1/foo.log
/var/foobar/versions/234.2/foo.log

On the initial run of the TailFile processor, the foo.log in 123.1 is no longer receiving data, because data is now going to 234.2, the newer version. What we see is that the tailed data comes only from 234.2. That is awesome, and what we want to happen: we feared it would read in the foo.log from 123.1, even though it is no longer receiving data, as well as the incoming data from 234.2.

Is it TailFile's intended behavior to tail only the files that are receiving data and skip the ones that aren't? If so, when we release another version and /var/foobar/versions/332.22/foo.log appears while data stops going into 234.2, it should stop tailing 234.2 (makes sense) and start pulling data from 332.22. Testing this is proving rather difficult, so I was hoping to get verification from someone who knows the functionality better.

PS: We have managed to use a regex to grab foo.log from any folder under versions whose name is composed of digits and decimal points, so that seems to be working.
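The post does not show the regex that was used, so the following is only a minimal sketch of what such a pattern could look like. The pattern itself, and the idea that paths are matched relative to a base directory such as /var/foobar, are assumptions for illustration. NiFi evaluates its patterns with Java regex, but this simple pattern behaves the same way in Python.

```python
import re

# Hypothetical pattern (not taken from the post): a folder under "versions"
# whose name consists only of digits and dots, followed by foo.log.
VERSIONED_LOG = re.compile(r"versions/[\d.]+/foo\.log")

# Candidate paths, written relative to an assumed base directory /var/foobar.
paths = [
    "versions/123.1/foo.log",
    "versions/234.2/foo.log",
    "versions/332.22/foo.log",   # a future version folder
    "versions/current/foo.log",  # not a numeric version folder
    "versions/234.2/bar.log",    # wrong file name
]

# fullmatch requires the whole path to match, not just a prefix.
matched = [p for p in paths if VERSIONED_LOG.fullmatch(p)]

for p in matched:
    print(p)
```

A pattern like this would also pick up a version folder that does not exist yet, such as 332.22, which is the case the question is about. Whether TailFile actually starts tailing it depends on the processor's lookup settings, which this sketch does not model.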

elloyd · ‎10-12-2017

Thanks for the answer, but maybe I should have clarified. This solution works for my testing environment, but my final environment is actually monitoring the versions directory, so it won't always be test1 and test2 but rather ../versions/123.1 .../versions/234.2, and then a new version will be added with an unknown number. That is why I was hoping to use a wildcard like *. Can I use a wildcard? It seems you are using expression language, maybe with /versions/[*]

elloyd · ‎10-11-2017

I am attempting to use the TailFile processor to retrieve files that have the same name from multiple directories. For example:

/var/foo/test1/test.log
/var/foo/test2/test.log

I want to get both of the test.log files. I have tried using Multiple File mode with /var/foo/*/test.log, but that seems to return an error. Is there a way to do this? Thanks
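The post does not include the error text, so the cause is not certain. One possible explanation, assuming the multiple-files setting is interpreted as a regular expression rather than a shell glob, is that `*` does not mean "any directory name" in a regex. The sketch below compares the two readings in Python; the regex alternative shown is an illustration, not a confirmed NiFi configuration.

```python
import fnmatch
import re

paths = ["/var/foo/test1/test.log", "/var/foo/test2/test.log"]

glob_style = "/var/foo/*/test.log"          # the pattern from the post
regex_style = r"/var/foo/[^/]+/test\.log"   # hypothetical regex equivalent

# Read as a glob, "*" stands for any run of characters, so both paths match.
as_glob = [p for p in paths if fnmatch.fnmatchcase(p, glob_style)]

# Read as a regex, "/*" means "zero or more slashes", so neither path matches.
glob_as_regex = [p for p in paths if re.fullmatch(glob_style, p)]

# "[^/]+" is the regex way to say "one directory name".
as_regex = [p for p in paths if re.fullmatch(regex_style, p)]

print(len(as_glob), len(glob_as_regex), len(as_regex))
```

Under that assumption, the glob-style pattern silently matches nothing, while the `[^/]+` form matches both test directories.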

elloyd · ‎09-20-2017

Update: I had 4 flows running at the same time using these Zookeeper instances. When I reduced them from 4 to 2, Zookeeper no longer crashes.

elloyd · ‎09-18-2017

We have a cluster with 3 Zookeepers running on, say, servers s1, s2, and s3. Zookeeper on s3 keeps refusing connections, but not right away after I start it. It takes about 3-4 hours, and then Ambari reports the error:

Connection failed: [Errno 111] Connection refused to s3.foo.com:2181

In the Zookeeper log, I found the output pasted below. I have changed the amount of space on the s3 device and there is plenty left. Any ideas?

2017-09-18 19:48:16,333 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1308ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:18,731 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1467ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:20,164 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1431ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:21,524 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1359ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:23,696 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2170ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:25,416 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1576ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:28,116 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2699ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:30,369 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 2249ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:31,564 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1195ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:33,884 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1533ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:39,154 - WARN[SyncThread:3:FileTxnLog@334] - fsync-ing the write ahead log in SyncThread:3 took 1516ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2017-09-18 19:48:41,043 - ERROR [SyncThread:3:SyncRequestProcessor@183] - Severe unrecoverable error, exiting
java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:322)
    at org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:322)
    at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:491)
    at org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:196)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:131)
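When reading a log like this, it can help to pull the fsync durations out of the warnings and to check free space on the volume that holds the transaction log. The following is a rough sketch, not a diagnosis: the helper names are made up for illustration, the sample lines are copied from the log above, and the directory passed to `free_bytes` is a placeholder, since the post does not say where the Zookeeper data lives.

```python
import re
import shutil

# Matches the fsync warning format seen in the log above.
FSYNC = re.compile(r"fsync-ing the write ahead log in \S+ took (\d+)ms")


def fsync_durations_ms(log_text):
    """Return every fsync duration (in ms) reported in the log text."""
    return [int(ms) for ms in FSYNC.findall(log_text)]


def free_bytes(path):
    """Return the free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


sample = (
    "2017-09-18 19:48:16,333 - WARN[SyncThread:3:FileTxnLog@334] - "
    "fsync-ing the write ahead log in SyncThread:3 took 1308ms which will "
    "adversely effect operation latency. See the ZooKeeper troubleshooting guide\n"
    "2017-09-18 19:48:18,731 - WARN[SyncThread:3:FileTxnLog@334] - "
    "fsync-ing the write ahead log in SyncThread:3 took 1467ms which will "
    "adversely effect operation latency. See the ZooKeeper troubleshooting guide\n"
    "2017-09-18 19:48:41,043 - ERROR [SyncThread:3:SyncRequestProcessor@183] - "
    "Severe unrecoverable error, exiting\n"
    "java.io.IOException: No space left on device\n"
)

durations = fsync_durations_ms(sample)
has_disk_full = "No space left on device" in sample

print("slowest fsync (ms):", max(durations))
print("disk-full error present:", has_disk_full)
# "." is a placeholder; point this at the directory holding the transaction log.
print("free bytes:", free_bytes("."))
```

Running the free-space check periodically on the affected server, rather than once after a restart, would show whether the volume fills up over the 3-4 hours before the crash.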

elloyd · ‎07-13-2017

Currently, we have a setup where we group events and split them based on timestamp, in order to keep stack trace errors, which have newlines in them, together. We are using SplitContent to split on: "(newline) 20"

This is the format of the logs (this example is supposed to appear single spaced, I hope it does when posted):

2017-07-13 01:00:00,123 Log data here
2017-07-13 01:00:00,124 Log data here
2017-07-13 01:00:00,125 Stack trace error here...
Stack trace error ....
Stack trace error....
Stack trace error.....
2017-07-13 01:00:00,126 Log data here

Using "(newline) 20" allows us to keep everything between the timestamps as one event, including the stack trace. Oddly enough, the events it produces in HDFS have an extra blank line between each event. (Not usually a big deal, but with 2-3 GB files we are seeing 100+ MB of space used just for blank lines.) Our current solution is a ReplaceText processor that removes all the blank lines, but it's obviously not optimal. Any suggestions are welcome. Please see screenshots.
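The grouping described here can be sketched outside NiFi to make the intended result concrete. This is an illustration in Python, not the SplitContent configuration from the post: the function name and the timestamp regex are assumptions based on the sample log format, and blank lines are dropped while grouping instead of in a separate cleanup step.

```python
import re

# Timestamp prefix as it appears in the sample log, e.g. "2017-07-13 01:00:00,123 ".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def group_events(lines):
    """Group lines into events; a line starting with a timestamp opens a new event."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines instead of removing them afterwards
        if TIMESTAMP.match(line) or not events:
            events.append(line)
        else:
            # Continuation line (for example, part of a stack trace).
            events[-1] += "\n" + line
    return events


log = """2017-07-13 01:00:00,123 Log data here
2017-07-13 01:00:00,124 Log data here
2017-07-13 01:00:00,125 Stack trace error here...
Stack trace error ....
Stack trace error....

2017-07-13 01:00:00,126 Log data here
"""

events = group_events(log.splitlines())
for event in events:
    print(repr(event))
```

Each stack trace stays attached to the timestamped line that precedes it, and no blank lines survive into the output.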

elloyd · ‎07-11-2017

Wow, the solution to remove the extra lines was creating a giant bottleneck... Hmm

elloyd · ‎07-11-2017

@Matt Clarke It does seem to retain the first line. You made a valid point, and in reality your solution would ensure it ran forever; even if there were a (newline) 20 in the code somehow (not sure I see how it'd happen), it would still be okay. Thanks for your help.

elloyd · ‎07-11-2017

Thanks, that's a solution if the newlines become a problem. You know what else worked for me? I want to put this out there, Matt, and see if you see any errors in this logic. Instead of the ReplaceText and all that, I simply added a SplitContent, and since I know each event starts with the year, I split on: "(newline) 20"

I realize this solution wouldn't work forever, but it would work for the remainder of the century, I think. I guess an issue might arise if somehow, in some way, a log produces a newline and then a 20... I am just concerned ReplaceText might be an extraneous, time-consuming add-on.
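The concern about a log line that happens to start with "20" can be shown with a small made-up sample. The sketch below assumes "(newline) 20" means a newline character followed directly by "20", and compares that delimiter with a stricter split that requires a full timestamp after the newline. The sample text and the regex are illustrative, not taken from the actual flow.

```python
import re

# Made-up sample: the second line is a continuation that starts with "20".
text = (
    "2017-07-11 01:00:00,123 Retrying request\n"
    "20 attempts left\n"
    "2017-07-11 01:00:00,124 Log data here\n"
)

# Naive delimiter: a newline followed by "20". It also consumes the "20".
naive = text.split("\n20")

# Stricter delimiter: split on a newline only when a full timestamp follows.
# The lookahead leaves the timestamp in place at the start of the next event.
strict = re.split(
    r"\n(?=\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} )",
    text.rstrip("\n"),
)

print(len(naive), "pieces with the naive delimiter")
print(len(strict), "pieces with the timestamp-aware delimiter")
```

With the naive delimiter, the continuation line is split off as its own piece, while the timestamp-aware version keeps it with the event it belongs to.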

Last Visited	‎03-14-2018 05:14 PM

Member Since	‎01-05-2017 02:25 PM
Posts	153
Kudos received	10

Cloudera Community

Re: TailFile cannot find directory/file which exis...

Re: Unusual data placement on file rollover in Nif...

Re: How TailFile works with multiple files

How TailFile works with multiple files

Re: In Nifi Tailing multiple directories with the ...

In Nifi Tailing multiple directories with the same...

Re: Zookeeper keeps refusing to connect

Zookeeper keeps refusing to connect

SplitContent is adding a newline in Nifi -why?

Re: Logging a Stack Trace event with Nifi

Re: Logging a Stack Trace event with Nifi

Re: Logging a Stack Trace event with Nifi