Support Questions

ahmedalsaidi · ‎04-09-2023

I am consuming messages from IBM MQ using the ConsumeJMS processor. When you inspect the queue with "List queue", you see the following type and format:

I then added an ExtractText processor (version 1.16.1) to search the content for some text using the following regular expression: (ACT.A.EMS..ST...AIR|ACT.X.EMS..ST...AIR)

However, sometimes the results of the regular expression are valid, and sometimes only half of my condition is satisfied:

1) when nothing matches the whole condition, it matches only part of it;

2) or when only half of my condition is valid, it still matches!

Am I doing something wrong in my regular expression, or is it something else? Or do I need to change the type, format, or content type (application/octet-stream) to text so that I can search correctly, instead of the content being viewed as hex?

I hope my explanation was fine and clear for everyone.

Thanks!

MattWho · ‎04-10-2023

@ahmedalsaidi
You do not need to change the content type, because the ExtractText processor uses the character set specified in its configuration, which defaults to "UTF-8". If you change the content type or the filename, for example by adding ".txt" to the end of the filename, the built-in content viewer in NiFi would be able to display text instead of hex.

As for your matching issue, it is difficult for me to say what is happening here without a working and a non-working sample to look at. In Java regular expressions, "." matches any character; however, judging by your hex output screenshot, it looks like you really want to match a literal ".". If so, add a "\" (backslash) before each "." in your Java regular expression.
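To illustrate the difference between "." and "\.", here is a minimal Java sketch. The class name, the helper method, and both sample strings are made up for illustration, since the real message content is not shown in this thread. It also assumes the expression is searched for anywhere in the content (find-style matching). Note that the doubled backslashes are only needed inside a Java string literal; in the ExtractText property you would type a single "\" before each ".".

```java
import java.util.regex.Pattern;

public class RegexDotDemo {
    // Expression from the question: each "." matches ANY single character.
    static final Pattern UNESCAPED =
        Pattern.compile("(ACT.A.EMS..ST...AIR|ACT.X.EMS..ST...AIR)");

    // Same expression with each "." escaped, so it matches only a literal ".".
    static final Pattern ESCAPED = Pattern.compile(
        "(ACT\\.A\\.EMS\\.\\.ST\\.\\.\\.AIR|ACT\\.X\\.EMS\\.\\.ST\\.\\.\\.AIR)");

    // Hypothetical helper: true if the pattern occurs anywhere in the input.
    static boolean found(Pattern pattern, String input) {
        return pattern.matcher(input).find();
    }

    public static void main(String[] args) {
        // Hypothetical samples, not taken from the real MQ messages.
        String literalDots = "HDR ACT.A.EMS..ST...AIR TRL";
        String otherChars  = "HDR ACT1A2EMS34ST567AIR TRL";

        System.out.println(found(UNESCAPED, literalDots)); // true
        System.out.println(found(ESCAPED, literalDots));   // true
        System.out.println(found(UNESCAPED, otherChars));  // true: "." accepted the digits
        System.out.println(found(ESCAPED, otherChars));    // false: literal "." required
    }
}
```

If the escaped version stops matching content that the unescaped one matched, the characters shown as "." in the hex view may not be literal dots, which is why a working and a non-working sample would help.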

If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click Accept as Solution below each response that helped.

Thank you,

Matt

Cloudera Community

Issue in Regular expression in nifi