Created 05-06-2021 04:20 AM
I have found that some of my Ranger logs arrive in gz format (but without the extension). Is there a way to extract the log data within the NiFi flow?
Created 05-06-2021 01:37 PM
The "CompressContent" [1] processor can be used to decompress gz files.
Since only some of the log files are compressed, my suggestion is to pass the FlowFiles through an "IdentifyMimeType" [2] processor first. It writes a mime.type attribute to each FlowFile. Then use a "RouteOnAttribute" [3] processor with a dynamic property that matches mime.type = application/gzip (each new dynamic property becomes a new outbound relationship). Route that relationship to the "CompressContent" processor, and route the "unmatched" relationship (which carries every non-gz file) on through your flow, bypassing "CompressContent".
[1] http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.13.2/org.apache...
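To illustrate the routing idea outside NiFi, here is a minimal Python sketch. It is not NiFi code: the function names (is_gzip, extract_log) and the sample log line are made up for illustration. It assumes compressed content can be recognised by the two-byte gzip magic number at the start of the data, so the check depends on the content rather than on a file extension. Compressed payloads are decompressed and everything else passes through untouched, which mirrors the matched and "unmatched" routes described above.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(data: bytes) -> bool:
    """Return True if the payload starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def extract_log(data: bytes) -> bytes:
    """Decompress gzip payloads; return anything else unchanged."""
    if is_gzip(data):
        return gzip.decompress(data)
    return data


if __name__ == "__main__":
    # Hypothetical sample content standing in for a Ranger log line.
    plain = b"2021-05-06 04:20:00 INFO sample audit line\n"
    packed = gzip.compress(plain)

    for payload in (plain, packed):
        # Both payloads should come out as the same plain-text log line.
        print(is_gzip(payload), extract_log(payload) == plain)
```

The check looks only at the first bytes of the data, so a file name without .gz does not affect it.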
If you found this helpful with your question, please take a moment to log in and click accept on this solution.
Thanks,
Matt
Created 05-07-2021 06:28 AM
Thanks. I tried something just like this yesterday, and it would work, except that these logs don't have the .gz file extension that would be needed. I think I have an issue somewhere else, though, so I will carry on trying to fix the overall problem.