<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Nifi MergeRecord - can you merge on 2 different attributes. in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeRecord-can-you-merge-on-2-different-attributes/m-p/305389#M222426</link>
    <pubDate>Tue, 21 Apr 2026 11:23:32 GMT</pubDate>
    <dc:creator>Griggsy</dc:creator>
    <dc:date>2026-04-21T11:23:32Z</dc:date>
    <item>
      <title>Nifi MergeRecord - can you merge on 2 different attributes.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeRecord-can-you-merge-on-2-different-attributes/m-p/305389#M222426</link>
      <description>&lt;P&gt;Good afternoon&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to improve the way we build our NiFi flows by implementing Record processing.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Historically we:&lt;/P&gt;&lt;P&gt;split text &amp;gt; get timestamps using regex &amp;gt; merge on 'correlation_id' (an attribute derived from the timestamp, format yyyy-MM-dd-HH) &amp;gt; extract sourcetype using regex (the flow splits at this point for each sourcetype) &amp;gt; merge again on 'correlation_id' &amp;gt; out to HDFS/Splunk etc.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The flow is structured like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Old  Split/Merge Flow" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29341i27570007E0ADEB51/image-size/medium?v=v2&amp;amp;px=400" role="button" title="SplitMergeProcessorFlow.PNG" alt="Old  Split/Merge Flow" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Old  Split/Merge Flow&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;(each spine is a different sourcetype)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I recently watched this series of videos:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.youtube.com/channel/UCcNsK56MZRXzPbRfe8sODWw" target="_blank" rel="noopener"&gt;https://www.youtube.com/channel/UCcNsK56MZRXzPbRfe8sODWw&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I was particularly interested in the first one, where Record processing is used in place of split/merge/regex etc.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As a result I have started to re-design 
the flow above. The logs come to us in CEF format from a syslogListener (TCP).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My progress so far, by processor:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;GenerateFlowFile - using some data copied from the original flow.&lt;/P&gt;&lt;P&gt;SplitText - to split the text/plain content into single logs (I have not yet figured out a way to avoid splitting)&lt;/P&gt;&lt;P&gt;ParseCEF - parses CEF into JSON format&lt;/P&gt;&lt;P&gt;JoltTransformJSON - 'default' operation - to add some info to the JSON 'extension' content (feedname etc.)&amp;nbsp;&lt;/P&gt;&lt;P&gt;JoltTransformJSON - 'shift' operation - to strip out unwanted lines, keeping only the lines I later need as attributes plus the 'raw_content' line (I haven't figured out how to achieve these two steps using one Jolt processor. Is there a way?)&lt;/P&gt;&lt;P&gt;EvaluateJsonPath - pulls the above attributes from the content.&lt;/P&gt;&lt;P&gt;MergeRecord - currently merging on 'sourcetype'.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That's as far as I have got. The flow uses far fewer processors and achieves (nearly) the same results. I have also compared the FlowFile lineage duration: my new flow takes around 5 seconds and the old style takes around 35 seconds (although it could be tweaked).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="WIP Process Record Flow" style="width: 332px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/29342iD0E5B2CF2F9CFC05/image-size/medium?v=v2&amp;amp;px=400" role="button" title="RecordProcessorFlow.PNG" alt="WIP Process Record Flow" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;WIP Process Record Flow&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question is, is there a way to merge on two attributes? 
'sourcetype' and 'correlation_id' in my case. I don't want different sourcetypes ("deviceVendor" in JSON format) merged together in HDFS, and I also want to merge/group by timestamp (correlation_id) in one-hour bins.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to get away from having 11 different spines, one for each sourcetype. Is MergeRecord clever enough to group by common content and correlate on my time (correlation_id) attribute, or is this asking too much?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this makes sense. Any assistance would be greatly appreciated.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Apr 2026 11:23:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeRecord-can-you-merge-on-2-different-attributes/m-p/305389#M222426</guid>
      <dc:creator>Griggsy</dc:creator>
      <dc:date>2026-04-21T11:23:32Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi MergeRecord - can you merge on 2 different attributes.</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeRecord-can-you-merge-on-2-different-attributes/m-p/307522#M223237</link>
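      <description>&lt;P&gt;You can make a new attribute with UpdateAttribute that merges those two together, for example ${sourcetype}_${correlation_id}, and use that attribute as the Correlation Attribute Name in MergeRecord.&lt;/P&gt;</description>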
      <pubDate>Fri, 11 Dec 2020 15:30:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-MergeRecord-can-you-merge-on-2-different-attributes/m-p/307522#M223237</guid>
      <dc:creator>TimothySpann</dc:creator>
      <dc:date>2020-12-11T15:30:04Z</dc:date>
    </item>
  </channel>
</rss>

