How to process text files with semi structured data and convert to JSON

Explorer

Attachments: sample-data.txt, updateattributes-properties.png

I am trying to build a flow that takes syslog input from a Proofpoint (email) source and converts the messages to JSON. I am using the ListenSyslog processor to get the data in and the UpdateAttribute processor to get the regular syslog properties. The problem is that UpdateAttribute doesn't "recognize" any of the regular syslog properties except syslog.port, syslog.protocol and syslog.sender. The messages are not uniformly constructed: the first 3 fields (delimited by whitespace) have the same format, while the rest of the fields have various formats and lengths. The number of fields can also vary from line to line.

Is there a way to dynamically build a JSON object without knowing the incoming format and the number of attributes?
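
To make the question concrete, here is a minimal sketch in plain Python (outside NiFi) of what "dynamically building JSON" could look like for one line. It rests on assumptions not confirmed by the sample data: that the three leading fields are whitespace-delimited, and that the remaining fields are either `key=value` tokens or loose tokens. The function name `line_to_json` and the placeholder names `field1`..`field3` and `extra` are hypothetical.

```python
import json


def line_to_json(line, fixed_names=("field1", "field2", "field3")):
    """Convert one whitespace-delimited line into a JSON string.

    Assumption: the leading fields are positional, and the remaining
    tokens are either key=value pairs or loose tokens of unknown meaning.
    """
    tokens = line.split()
    # Map the leading positional fields onto the placeholder names.
    record = dict(zip(fixed_names, tokens[:len(fixed_names)]))

    extras = []
    for token in tokens[len(fixed_names):]:
        key, sep, value = token.partition("=")
        if sep and key:
            # Looks like key=value, so use the key as the JSON field name.
            record[key] = value
        else:
            # Unknown shape: keep the token rather than drop it.
            extras.append(token)

    if extras:
        record["extra"] = extras
    return json.dumps(record)
```

Because nothing is known up front about the trailing fields, the sketch never fails on an unexpected token; it just collects it under `extra`. Values that contain spaces would be split across tokens, so real data may need a smarter tokenizer.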

8 REPLIES

Explorer

Yes, I did try it, with the same attributes I put into UpdateAttribute, and got the same result.

@Alex M I am not sure I understand your use case well. From my understanding, you are receiving syslog messages that you want to convert to JSON. If so, you can use ParseSyslog to get the attributes and then use AttributesToJSON to convert them.

Attached is a test I did. Can you import the template, test it, and tell me if this is what you are looking for?

testhcc.xml

Explorer

I could not upload your template; I get a "The specified template is not in a valid format" error.

Maybe that's because I am on NiFi 1.4. Can you try ListenSyslog -> ParseSyslog -> AttributesToJSON?

Explorer

The processor doesn't recognize the incoming message as a valid syslog format.

@Alex M do you mean ParseSyslog? I tested with the data you provided in your question and everything works fine for me. Can you describe what you are doing in more detail, with a few screenshots? I am not sure I understand what you need to achieve.

Explorer

Yes. The file I attached contains a very small subset of what's coming in. Here is a picture of my test flow, with the error message below:

2017-11-17 11:08:50,535 ERROR [Timer-Driven Process Thread-9] o.a.nifi.processors.standard.ParseSyslog ParseSyslog[id=cae90699-015f-1000-5758-7ddd330c061d] Failed to parse StandardFlowFileRecord[uuid=3ed5f084-c4c2-4fcc-9682-f6203195f5b1,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1510938527608-3484, container=default, section=412], offset=941379, length=141207],offset=0,name=1472246469105863,size=141207] as a Syslog message: it does not conform to any of the RFC formats supported; routing to failure
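
One way to narrow down an error like this is to check, outside NiFi, which lines of the captured content even start with a syslog priority header. Both RFC 3164 and RFC 5424 messages begin with `<PRI>`, where PRI is a number from 0 to 191. The sketch below tests only that prefix; it is a necessary condition, not a reimplementation of ParseSyslog, and the helper names `has_valid_pri` and `split_by_pri` are hypothetical.

```python
import re

# Leading priority header: "<" + 1 to 3 digits + ">"
PRI_RE = re.compile(r"^<(\d{1,3})>")


def has_valid_pri(line):
    """Return True if the line starts with <PRI> and PRI is in 0..191."""
    match = PRI_RE.match(line)
    return bool(match) and 0 <= int(match.group(1)) <= 191


def split_by_pri(text):
    """Split text into (lines with a valid PRI header, lines without)."""
    ok, bad = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        (ok if has_valid_pri(line) else bad).append(line)
    return ok, bad
```

Running `split_by_pri` over the content of a failing flowfile shows whether every line lacks the header or only some do. A line that passes can still be rejected by the parser for other reasons, such as an unexpected timestamp format.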

flow.png