
Issue with a regular expression in NiFi


I am consuming messages from IBM MQ using the ConsumeJMS processor. When you inspect the queue with "List queue", the messages show the following type and format:




I then added an ExtractText processor (version 1.16.1) to search the content for certain text using the following regular expression: (ACT.A.EMS..ST...AIR|ACT.X.EMS..ST...AIR)



Sometimes the results of the regular expression are valid, and sometimes only half of my condition is satisfied:

1) if nothing matches the whole condition, it matches only part of my condition;

2) or if half of my condition is valid, it still reports a match.

Am I doing something wrong in my regular expression, or is it something else? Or do I need to change the type, format, or content type (application/octet-stream) to text so that I can search correctly, instead of the content being viewed as hex?


I hope my explanation was fine and clear for everyone.



Super Mentor

You do not need to change the content type, because the ExtractText processor reads the content using the character set you specify, which defaults to "UTF-8".  Changing the content type or the filename (for example, adding ".txt" to the end of the filename) would only allow the built-in content viewer in NiFi to display text instead of hex.

When it comes to your matching issue, it is difficult for me to say what is happening without a working and a non-working sample to look at.  In Java regular expressions, "." matches any character; however, looking at your hex output screenshot, it looks like you really want to match a literal ".".  If you want your Java regular expression to match a literal ".", add a "\" (backslash) before each ".".
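To illustrate the difference, here is a minimal Java sketch comparing the expression as posted with a version in which every "." is escaped. The two sample strings are made up for illustration and are not taken from your messages, and the class and method names are hypothetical. It only shows how "." and "\." behave in java.util.regex; it does not reproduce ExtractText itself.

```java
import java.util.regex.Pattern;

public class DotEscapeDemo {
    // Expression as posted: each "." matches ANY single character.
    static final Pattern UNESCAPED =
        Pattern.compile("(ACT.A.EMS..ST...AIR|ACT.X.EMS..ST...AIR)");

    // Same expression with each "." escaped so it matches only a literal dot.
    // The backslashes are doubled here only because this is a Java string literal.
    static final Pattern ESCAPED = Pattern.compile(
        "(ACT\\.A\\.EMS\\.\\.ST\\.\\.\\.AIR|ACT\\.X\\.EMS\\.\\.ST\\.\\.\\.AIR)");

    // Returns true if the pattern is found anywhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs (assumptions, not real queue content).
        String literalDots = "ACT.A.EMS..ST...AIR";
        String otherChars  = "ACT-A-EMS--ST---AIR";

        System.out.println(found(UNESCAPED, literalDots)); // true
        System.out.println(found(UNESCAPED, otherChars));  // true: "." accepted "-"
        System.out.println(found(ESCAPED, literalDots));   // true
        System.out.println(found(ESCAPED, otherChars));    // false: only literal dots match
    }
}
```

In the ExtractText property itself you would type a single backslash before each dot, for example ACT\.A\.EMS\.\.ST\.\.\.AIR, since the doubling above is only needed inside Java source code.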

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,