Created 06-02-2016 11:24 AM
Hi,
I've got another problem with WebLogic log files on Linux. I've tested several cases (see below), and I don't know how to ingest new data without stopping the processor, clearing its state and restarting the NiFi agent.
First question: is there another way to do this?
Second question: is it possible to script stopping the processor and clearing its state?
Thanks
The TailFile processor properties:
File to Tail: /home/wls.log
Rolling Filename Pattern: (empty)
State File: No value set
Initial Start Position: Beginning of File
File Location: Local
Test 1 :
cat ‘record01’ >> wls.log
The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.
mv wls.log wls.log.01
cat ‘record02’ >> wls.log
The NiFi Flow Data Provenance is KO. The line ‘record02’ doesn’t appear.
The line ‘record02’ appears only when I stop the TailFile processor, clear its state and restart NiFi.
Test 2 :
cat ‘record01’ >> wls.log
The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.
cp wls.log wls.log.01
rm wls.log
cat ‘record02’ >> wls.log
The NiFi Flow Data Provenance is KO. The line ‘record02’ doesn’t appear.
The line ‘record02’ appears only when I stop the TailFile processor, clear its state and restart NiFi.
Test 3 :
cat ‘record01’ >> wls.log
The NiFi Flow Data Provenance is OK. The line ‘record01’ appears.
touch empty.txt
cp empty.txt wls.log (the inode of wls.log doesn’t change)
cat ‘record02’ >> wls.log
The NiFi Flow Data Provenance is KO. The line ‘record02’ doesn’t appear.
The line ‘record02’ appears only when I stop the TailFile processor, clear its state and restart NiFi.
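For anyone who wants to check how each rotation style affects the inode of wls.log, here is a minimal sketch. It works in a throwaway temp directory rather than /home, and the function names (inode_of, rotation_changes_inode) are made up for illustration. It only reports what the filesystem does; it says nothing about how TailFile itself decides that a file has rolled.

```shell
#!/bin/sh
# Sketch: does each rotation style from Tests 1-3 change the inode of wls.log?
# Helper names are hypothetical; everything runs in a temporary directory.

inode_of() {
    # `ls -i` prints "<inode> <name>"; keep only the inode number.
    ls -i "$1" | awk '{print $1}'
}

rotation_changes_inode() {
    # $1 = rotation style: mv (Test 1), cp_rm (Test 2), overwrite (Test 3)
    dir=$(mktemp -d) || return 2
    log="$dir/wls.log"
    echo "record01" >> "$log"
    before=$(inode_of "$log")
    case "$1" in
        mv)        mv "$log" "$log.01" ;;
        cp_rm)     cp "$log" "$log.01"; rm "$log" ;;
        overwrite) : > "$dir/empty.txt"; cp "$dir/empty.txt" "$log" ;;
    esac
    echo "record02" >> "$log"
    after=$(inode_of "$log")
    rm -rf "$dir"
    if [ "$before" != "$after" ]; then echo changed; else echo same; fi
}

for style in mv cp_rm overwrite; do
    printf '%-10s %s\n' "$style" "$(rotation_changes_inode "$style")"
done
```

With mv the old inode stays attached to wls.log.01, so the new wls.log gets a different one. With cp over an existing file the inode is kept, as noted in Test 3. For cp followed by rm, the filesystem is free to hand the released inode number back to the new file, so the result can differ from one system to another.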
Created 06-02-2016 09:12 PM
Hi @Thierry Vernhet,
I just had a look and I believe this is because the property "Rolling filename pattern" is not set. In that case, the processor does not detect that the file has changed and does not reset its state.
I have raised a JIRA to track this issue to improve things in such a situation: https://issues.apache.org/jira/browse/NIFI-1959
Created 06-02-2016 03:18 PM
Can you retry all these tests and, during the second cat, use something longer such as "cat 'record123456789'" instead of "cat 'record02'"? I'd like to see whether tracking the file size is the issue, because record01 and record02 would produce the same file size.
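To make the size point concrete, here is a small sketch that measures a fresh file holding a single echoed record. The helper name size_after_echo is made up, and it uses echo rather than cat so that the record text actually lands in the file.

```shell
#!/bin/sh
# Sketch: size in bytes of a new file that contains one echoed record.
# size_after_echo is a hypothetical helper, not part of any tool.

size_after_echo() {
    f=$(mktemp) || return 2
    echo "$1" >> "$f"            # echo appends a trailing newline
    wc -c < "$f" | tr -d ' '     # strip the padding some wc builds add
    rm -f "$f"
}

for rec in record01 record02 record123456789; do
    printf '%-16s %s bytes\n' "$rec" "$(size_after_echo "$rec")"
done
```

record01 and record02 both come out at 9 bytes (8 characters plus the newline), while record123456789 gives 16 bytes, so only the longer record changes the file size after rotation.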
Created 06-03-2016 05:53 AM
Hi @Bryan Bende
I'm going to test this and give you feedback. Could you please also have a look at Pierre's answer?
Created 06-03-2016 07:05 AM
Hi @Bryan Bende
So, with the second command changed to echo "record0123456789" >> ... (I made a mistake in my original question: it is an echo, not a cat command), the results are:
Test1 the same
Test2 the same
Test3 .... not the same result
cp empty.txt wls.log
-rw-r--r-- 1 itve6530 userdsi 0 Jun 3 08:44 wls.log
echo "record33333333333xxxx0123456789" >> wls.log
Nifi Flow Data Provenance shows : 06/03/2016 08:44:24.759 CEST RECEIVE 389968cd-a1b6-497d-9e0a-c0e93ffe90f6 4 bytes
And the 4 bytes are "789", the end of the output of the echo command, which contains 32 bytes... funny, isn't it?
So I wrote another different record
echo "record33333333333xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0123456789" >> wls.log
Nifi Flow Data Provenance shows :06/03/2016 08:45:32.218 CEST RECEIVE fb6d2013-03ab-4576-989e-02e2bd9981f4 64 bytes with all the 64 bytes of the echo command... in progress !
So I wrote another different record
echo "record44" >> wls.log
OK in Nifi Flow Data Provenance :
06/03/2016 08:57:58.926 CEST RECEIVE e253f961-3a8a-43a2-97b1-25d54526a894 9 bytes
echo "record55" >> wls.log
KO in Nifi Flow Data Provenance (same size as the previous record)
echo "record5566" >> wls.log
OK in Nifi Flow Data Provenance (different size). Now it shows the last record and the one before it.
06/03/2016 08:58:28.013 CEST RECEIVE 3c646f8c-84c8-4353-af66-e7e06fc76560 11 bytes
06/03/2016 08:58:10.944 CEST RECEIVE e158b0ce-a856-44f9-b7cf-4769ce8b0f69 9 bytes
Very interesting ...
Created 06-03-2016 05:57 AM
I've tried this in a similar situation (see the question I posted: "Nifi : How avoid ingesting an old rolling file in TailFileProcessor ?").
And I had a problem with the old rolled file, which was always ingested when a new record was written to the new file. But there was a small difference, so I'll test in this situation and give you my feedback.
Created 06-06-2016 09:27 AM
Hi @Thierry Vernhet,
I've tried to reproduce the issue but when I set the rolling filename pattern, it works as expected. Here are the steps I did:
echo "test" >> /tmp/test.log
echo "test" >> /tmp/test.log
mv /tmp/test.log /tmp/test.log.01
echo "test" >> /tmp/test.log
With the rolling filename pattern set to test.log.*
The JIRA I raised concerns the case where the property is not set, and I have submitted a PR for that. But in case the property is not set, at this point I am not able to reproduce the error. Are there any specifics about your environment? I would suggest turning logging up to DEBUG level and checking whether any interesting messages appear.
Created 06-06-2016 01:08 PM
Created 06-06-2016 01:20 PM
I'm using the same as you: Beginning of file.
I think the version should not matter for this processor: it hasn't changed in 5 months. However, if you are able to try with the PR mentioned above, it may be worth it.
Created 06-03-2016 09:08 AM
Hi @Pierre Villard and @Bryan Bende
I also tested with the rolling filename pattern property = wls.log and initial start position = Beginning of File
Test1 the same result
Test2 the same result
Test3 .... the same result as without the rolling filename pattern property (see my previous comment to Bryan)
Nifi Flow Data Provenance shows : 06/03/2016 08:44:24.759 CEST RECEIVE 389968cd-a1b6-497d-9e0a-c0e93ffe90f6 4 bytes
.
But I want to complete my comment.
The 4 bytes (of the 32), "789", correspond to the difference between the size of the new record and the size of the wls.log file before it was reset with "cp empty.txt wls.log".
So if wls.log was 10 MB before the reset (for example), new records won't be ingested until the file grows past 10 MB again.
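Here is a rough sketch of the behaviour I suspect: a tailer that remembers only a byte offset and never notices that the file was reset. This is not NiFi's code. The poll and simulate functions, the state file and the pre-reset sizes are all assumptions made up for illustration.

```shell
#!/bin/sh
# Sketch of a position-only tailer (hypothetical, NOT NiFi's implementation).

poll() {
    # $1 = file to tail, $2 = state file holding the number of bytes consumed
    pos=$(cat "$2" 2>/dev/null || echo 0)
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -gt "$pos" ]; then
        tail -c +"$((pos + 1))" "$1"   # emit only the bytes past the offset
        echo "$size" > "$2"
    fi
}

simulate() {
    # $1 = size of the file before the reset (assumed), $2 = record written after
    dir=$(mktemp -d) || return 2
    log="$dir/wls.log"; state="$dir/state"
    dd if=/dev/zero of="$log" bs=1 count="$1" 2>/dev/null
    poll "$log" "$state" > /dev/null   # tailer consumes the old content
    : > "$log"                         # reset in place, the inode is kept
    echo "$2" >> "$log"
    poll "$log" "$state"               # what the tailer sees after the reset
    rm -rf "$dir"
}

rec="record33333333333xxxx0123456789"
# Assume the old file was 4 bytes smaller than the new record plus its newline.
simulate "$(( ${#rec} - 3 ))" "$rec"   # prints 789
# Assume a much larger old file: the short record stays below the stored offset.
simulate 1000 "record02"               # prints nothing
```

Under these assumptions only the tail of the new record that lies past the stored offset comes out, and a short record written after a large file was reset produces no output at all, which matches what I see in Data Provenance.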
.
I think there's a bug...
So do you think it's possible to contact Hortonworks support to improve Nifi product in this case ?
.
Thanks for your feedback.