TailFile with rotated log files

Rising Star

Hi,

I have another problem with WebLogic log files on Linux. I have tested several cases (see below), and I don't know how to ingest new data without stopping the processor, clearing its state and restarting the NiFi agent.

First question: is there another way to do this?

Second question: is it possible to script stopping the processor and clearing its state?

Thanks

The TailFile processor properties:

File to Tail: /home/wls.log

Rolling Filename Pattern: (no value set)

State File: No value set

Initial Start Position: Beginning of File

File Location: Local

Test 1 :

cat ‘record01’ >> wls.log

The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.

mv wls.log wls.log.01

cat ‘record02’ >> wls.log

The NiFi Flow Data Provenance is not OK. The line ‘record02’ doesn’t appear.

The line ‘record02’ appears only after I stop the TailFile processor, clear its state and restart NiFi.

Test 2 :

cat ‘record01’ >> wls.log

The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.

cp wls.log wls.log.01

rm wls.log

cat ‘record02’ >> wls.log

The NiFi Flow Data Provenance is not OK. The line ‘record02’ doesn’t appear.

The line ‘record02’ appears only after I stop the TailFile processor, clear its state and restart NiFi.

Test 3 :

cat ‘record01’ >> wls.log

The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.

touch empty.txt

cp empty.txt wls.log (the inode of wls.log doesn’t change)

cat ‘record02’ >> wls.log

The NiFi Flow Data Provenance is not OK. The line ‘record02’ doesn’t appear.

The line ‘record02’ appears only after I stop the TailFile processor, clear its state and restart NiFi.
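The three tests rotate the file in different ways, and each one treats the inode of wls.log differently. The sketch below only illustrates that filesystem behavior in a temporary directory; it does not involve NiFi, and the helper name inode_of is hypothetical.

```shell
#!/bin/sh
# Sketch only: shows what each rotation style does to the inode of wls.log.
# It runs in a throwaway temp directory and does not involve NiFi.

inode_of() {
    # The first field printed by `ls -i` is the inode number.
    ls -i "$1" | awk '{print $1}'
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Test 1 style: rename the file, then create a new one with the old name.
echo 'record01' >> wls.log
t1_before=$(inode_of wls.log)
mv wls.log wls.log.01
t1_rolled=$(inode_of wls.log.01)   # the rename keeps the inode
echo 'record02' >> wls.log
t1_after=$(inode_of wls.log)       # a new file, hence a different inode
echo "test 1: before=$t1_before rolled=$t1_rolled after=$t1_after"
rm -f wls.log wls.log.01

# Test 2 style: copy, remove, recreate.
echo 'record01' >> wls.log
t2_before=$(inode_of wls.log)
cp wls.log wls.log.01
rm wls.log
echo 'record02' >> wls.log
t2_after=$(inode_of wls.log)       # may or may not reuse the freed inode number
echo "test 2: before=$t2_before after=$t2_after"
rm -f wls.log wls.log.01

# Test 3 style: overwrite in place with an empty file.
echo 'record01' >> wls.log
t3_before=$(inode_of wls.log)
touch empty.txt
cp empty.txt wls.log               # truncates the existing file, same inode
t3_after=$(inode_of wls.log)
t3_size=$(wc -c < wls.log | tr -d ' ')
echo "test 3: before=$t3_before after=$t3_after size=$t3_size"
```

In the Test 3 style the file name and inode stay the same while the size drops to 0, so a reader that tracks only a byte position has no rename or new file to notice.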

1 ACCEPTED SOLUTION

Hi @Thierry Vernhet,

I just had a look and I believe this is because the "Rolling Filename Pattern" property is not set. In that case, the processor does not detect that the file has changed and does not reset its state.

I have raised a JIRA to track this issue and improve the behavior in this situation: https://issues.apache.org/jira/browse/NIFI-1959

12 REPLIES

Master Guru

Can you retry all these tests and, for the second cat, use something longer than "cat 'record02'", such as "cat 'record123456789'"? I'd like to see whether tracking the file size is the issue, because record01 and record02 have the same size.

Rising Star

Hi @Bryan Bende

I'm going to test this and give you feedback. Could you please also have a look at Pierre's answer?

Rising Star

Hi @Bryan Bende

So, with the second command changed to echo "record0123456789" >> ... (I made a mistake in my original question: it is an echo command, not cat), the results are:

Test 1: the same

Test 2: the same

Test 3: not the same result

cp empty.txt wls.log

-rw-r--r-- 1 itve6530 userdsi 0 Jun 3 08:44 wls.log

echo "record33333333333xxxx0123456789" >> wls.log

NiFi Flow Data Provenance shows: 06/03/2016 08:44:24.759 CEST RECEIVE 389968cd-a1b6-497d-9e0a-c0e93ffe90f6 4 bytes

And the 4 bytes are "789" plus the trailing newline, the end of the echoed record, which is 32 bytes long ... funny, isn't it?

So I wrote another different record

echo "record33333333333xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0123456789" >> wls.log

NiFi Flow Data Provenance shows: 06/03/2016 08:45:32.218 CEST RECEIVE fb6d2013-03ab-4576-989e-02e2bd9981f4 64 bytes, with all 64 bytes of the echoed record ... progress!

So I wrote another different record

echo "record44" >> wls.log

OK in NiFi Flow Data Provenance:

06/03/2016 08:57:58.926 CEST RECEIVE e253f961-3a8a-43a2-97b1-25d54526a894 9 bytes

echo "record55" >> wls.log

Not OK in NiFi Flow Data Provenance (same size as the previous record)

echo "record5566" >> wls.log

OK in NiFi Flow Data Provenance (different size). Now it shows the last record and the one before it.

06/03/2016 08:58:28.013 CEST RECEIVE 3c646f8c-84c8-4353-af66-e7e06fc76560 11 bytes

06/03/2016 08:58:10.944 CEST RECEIVE e158b0ce-a856-44f9-b7cf-4769ce8b0f69 9 bytes

Very interesting ...
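The byte counts in the provenance events can be checked against the echoed records, remembering that echo appends a newline. This is a small sketch with a hypothetical helper, record_size; it prints sizes rather than asserting the 32-byte figure reported above.

```shell
#!/bin/sh
# Sketch: byte size of a record as written by `echo` (text plus one newline).
record_size() {
    printf '%s\n' "$1" | wc -c | tr -d ' '
}

# Sizes of the records used in the thread.
echo "long record: $(record_size 'record33333333333xxxx0123456789') bytes"
echo "record44:    $(record_size 'record44') bytes"
echo "record55:    $(record_size 'record55') bytes"
echo "record5566:  $(record_size 'record5566') bytes"

# The last 4 bytes of the long record: three digits and the newline.
printf '%s\n' 'record33333333333xxxx0123456789' | tail -c 4
```

record44 and record55 come out at the same size, which is the situation Bryan wanted to rule out.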

Rising Star

Hi @Pierre Villard

I tried this in a similar situation (see my earlier question "Nifi : How avoid ingesting an old rolling file in TailFileProcessor ?").

There, I had a problem with the old rolled file, which was always ingested again when a new record was written to the new file. But the setup was slightly different, so I'll test in this situation and give you my feedback.

Hi @Thierry Vernhet,

I've tried to reproduce the issue, but when I set the rolling filename pattern, it works as expected. Here are the steps I followed:

echo "test" >> /tmp/test.log

echo "test" >> /tmp/test.log

mv /tmp/test.log /tmp/test.log.01

echo "test" >> /tmp/test.log

With the rolling filename pattern set to test.log.*

The JIRA I raised concerns the case where the property is not set, and I submitted a PR for that. But when the property is set, at this point I am not able to reproduce the error. Is there anything specific about your environment? I suggest turning logging up to DEBUG level and checking whether any interesting messages appear...
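To see which file names a wildcard pattern like test.log.* selects, here is a small sketch using the shell's own glob matching. It is assumed to be close to, but not necessarily identical to, how TailFile interprets "Rolling Filename Pattern"; the function matches_pattern and the file names other than test.log and test.log.01 are hypothetical.

```shell
#!/bin/sh
# Sketch: test a file name against a wildcard pattern with shell globbing.
# This is the shell's matching, not NiFi's own implementation.
matches_pattern() {
    # $1 = pattern, $2 = file name
    case "$2" in
        $1) return 0 ;;   # unquoted so that * and ? act as wildcards
        *)  return 1 ;;
    esac
}

for name in test.log test.log.01 test.log.02 other.log; do
    if matches_pattern 'test.log.*' "$name"; then
        echo "$name: matches"
    else
        echo "$name: no match"
    fi
done
```

With this pattern the rolled file test.log.01 is selected while the live file test.log is not, since the pattern requires something after the second dot.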

Rising Star

Hi @Pierre Villard

Which value did you use in your test for the "Initial Start Position" property?

Is the NiFi release important? We use "0.5.1".

Thanks

I'm using the same as you: Beginning of File.

I think the version should not matter for this processor: it hasn't changed in 5 months. However, if you are able to try with the PR mentioned above, it may be worth it.

Rising Star

Hi @Pierre Villard and @Bryan Bende

I also tested with the Rolling Filename Pattern property set to wls.log and Initial Start Position set to Beginning of File.

Test 1: the same result

Test 2: the same result

Test 3: the same result as without the filename pattern property (see my previous comment to Bryan)

NiFi Flow Data Provenance shows: 06/03/2016 08:44:24.759 CEST RECEIVE 389968cd-a1b6-497d-9e0a-c0e93ffe90f6 4 bytes

But I want to add to my comment.

The 4 bytes received (out of 32), "789" and the newline, correspond to the difference between the size of the new record and the size of the wls.log file before it was reset with "cp empty.txt wls.log".

So if wls.log is 10 MB (for example) before the reset, new records won't be ingested until the new file grows beyond 10 MB.
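The observation above is consistent with a reader that remembers only a byte offset. The sketch below is a simplified model of such a reader, not NiFi's actual TailFile implementation; the 28-byte starting content, the record text and the names tail_once, position, read_bytes and read_text are all assumptions chosen to reproduce a 4-byte read.

```shell
#!/bin/sh
# Simplified model of a tailer that only remembers a byte offset.
# Illustration only: this is NOT how TailFile is implemented.

workdir=$(mktemp -d)
log="$workdir/wls.log"
position=0                      # bytes already consumed (hypothetical state)

tail_once() {
    size=$(wc -c < "$log" | tr -d ' ')
    if [ "$size" -gt "$position" ]; then
        # Read only the bytes beyond the remembered offset.
        read_bytes=$((size - position))
        read_text=$(tail -c +"$((position + 1))" "$log")
        position=$size
    else
        # File is not larger than the offset: nothing is read.
        read_bytes=0
        read_text=''
    fi
    echo "read $read_bytes bytes: '$read_text'"
}

# 28 bytes of initial content (assumed size), fully consumed.
printf 'record01\nrecord02\nrecord003\n' > "$log"
tail_once
first_bytes=$read_bytes

# Reset the file in place (same effect as cp empty.txt wls.log),
# then append a 32-byte record: only the bytes past offset 28 are read.
: > "$log"
echo 'record02-padding-xxxx0123456789' >> "$log"
tail_once
second_bytes=$read_bytes
second_text=$read_text

# Reset again and append a 9-byte record: it stays below offset 32.
: > "$log"
echo 'record44' >> "$log"
tail_once
third_bytes=$read_bytes

rm -rf "$workdir"
```

In this model the short record is never read after the reset, which matches the 10 MB example: nothing new is picked up until the file grows past the remembered offset.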

I think there's a bug...

So do you think it's possible to contact Hortonworks support to get NiFi improved for this case?

Thanks for your feedback.