Matching an input field with multiple regular expressions

Hi...

 

Is there a compact way to define multiple regular expressions and try them against the same field until one of them matches? The "grok" command requires a different field name for each expression (e.g., "message1", "message2", and "message3" for three regular expressions). I would rather use a single field, "message", and try each expression against it until one of the three matches or all have been tried.

 

Of course, one can use tryRules with a different grok command per expression, e.g.:

 

          {
            tryRules {
              rules: [
                {
                  commands: [
                    {
                      grok {
                        dictionaryResources: [...]
                        expressions: {
                          message: """<expression1>"""
                        }
                      }
                    }
                  ]
                }
                {
                  commands: [
                    {
                      grok {
                        dictionaryResources: [conf/etl/grok-dictionaries/patterns]
                        expressions: {
                          message: """<expression2>"""
                        }
                      }
                    }
                  ]
                }

                ...
                {
                  commands: [
                    {
                      grok {
                        dictionaryResources: [...]
                        expressions: {
                          message: """<expressionN>"""
                        }
                      }
                    }
                  ]
                }
                {
                  commands: [
                    # No expression matched
                    {
                      dropRecord {}
                    }
                  ]
                }
              ]
            }
          }

 

The above works, but it takes roughly five times as many lines as should be necessary.

 

Any ideas for a more compact notation using only existing morphline commands?

 

Thanks.
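
One possibility, offered as an untested sketch rather than a confirmed answer: since grok expressions are regular expressions, the alternatives might be folded into a single expression with alternation, so that one grok command tries each pattern against "message" in turn. The fragment below assumes that all expressions can share one set of dictionaries and that no two alternatives define the same capture name (Java regular expressions do not allow duplicate named groups, so that case would likely fail). <expression1> through <expressionN> are the placeholders from the configuration above, and the elided dictionaryResources value is left as it was.

```
{
  tryRules {
    rules: [
      {
        commands: [
          {
            grok {
              dictionaryResources: [...]
              expressions: {
                # Assumption: each alternative is wrapped in a non-capturing
                # group so that "|" separates whole expressions.
                message: """(?:<expression1>)|(?:<expression2>)|(?:<expressionN>)"""
              }
            }
          }
        ]
      }
      {
        commands: [
          # No alternative matched
          {
            dropRecord {}
          }
        ]
      }
    ]
  }
}
```

The trade-off is readability: one long expression is harder to maintain than N short ones, and which alternative matched is no longer visible from the rule structure.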

 

 
