<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Merge json events based on property in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170379#M37339</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/1952/suri-1415.html" nodeid="1952"&gt;@BigDataRocks&lt;/A&gt; - I believe that Bryan's answer above is accurate, so this is not intended to answer your question directly, but I wanted to mention that your directory structure above can be simplified to just:&lt;P&gt;eventsink/${service_type}/${event_name}/${now():format('yyyy/MM/dd/HHmmssSSS')}.${filename}.json&lt;/P&gt;&lt;P&gt;As you have it above, you are calling "now()" multiple times, which could cause some weirdness if the hour rolls over between invocations, etc. Doing it all with a single call to now() addresses this and simplifies the configuration as well.&lt;/P&gt;</description>
    <pubDate>Wed, 10 Aug 2016 02:45:15 GMT</pubDate>
    <dc:creator>mpayne</dc:creator>
    <dc:date>2016-08-10T02:45:15Z</dc:date>
    <item>
      <title>Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170377#M37337</link>
      <description>&lt;P&gt;The current workflow is exporting each event.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;We are looking to merge all JSON events based on service/eventname and concatenate time and export them to S3.
Our requirement is to merge them using Expression Language at runtime&lt;/STRONG&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Aug 2016 23:03:29 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170377#M37337</guid>
      <dc:creator>bigspark</dc:creator>
      <dc:date>2016-08-09T23:03:29Z</dc:date>
    </item>
    <item>
      <title>Re: Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170378#M37338</link>
      <description>&lt;P&gt;The MergeContent processor can be used to merge JSON together. It has a property called "Correlation Attribute Name" which, when specified, merges together flow files that have the same value for the specified attribute.&lt;/P&gt;&lt;P&gt;In your scenario you first need to use EvaluateJsonPath (with Destination set to flowfile-attribute) to extract "service" and "eventName" from the JSON document into attributes. Based on your sample JSON it seems like they are at the root level of the document, so I believe something like:&lt;/P&gt;&lt;PRE&gt;service = $.service
eventName = $.eventName&lt;/PRE&gt;&lt;P&gt;Then you need to get these two values into a single attribute, so you can use UpdateAttribute with something like:&lt;/P&gt;&lt;PRE&gt;serviceEventName = ${service}/${eventName}
&lt;/PRE&gt;&lt;P&gt;Then in MergeContent set the "Correlation Attribute Name" to "serviceEventName". You can also specify the Minimum Group Size and Max Bin Age so that you can merge together either 100 MB or 1 hour worth of data.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 00:23:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170378#M37338</guid>
      <dc:creator>bbende</dc:creator>
      <dc:date>2016-08-10T00:23:39Z</dc:date>
    </item>
    <item>
      <title>Re: Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170379#M37339</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/1952/suri-1415.html" nodeid="1952"&gt;@BigDataRocks&lt;/A&gt; - I believe that Bryan's answer above is accurate, so this is not intended to answer your question directly, but I wanted to mention that your directory structure above can be simplified to just:&lt;P&gt;eventsink/${service_type}/${event_name}/${now():format('yyyy/MM/dd/HHmmssSSS')}.${filename}.json&lt;/P&gt;&lt;P&gt;As you have it above, you are calling "now()" multiple times, which could cause some weirdness if the hour rolls over between invocations, etc. Doing it all with a single call to now() addresses this and simplifies the configuration as well.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 02:45:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170379#M37339</guid>
      <dc:creator>mpayne</dc:creator>
      <dc:date>2016-08-10T02:45:15Z</dc:date>
    </item>
    <item>
      <title>Re: Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170380#M37340</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/367/mpayne.html" nodeid="367"&gt;@mpayne&lt;/A&gt; thanks for pointing it out &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 14:55:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170380#M37340</guid>
      <dc:creator>bigspark</dc:creator>
      <dc:date>2016-08-10T14:55:45Z</dc:date>
    </item>
    <item>
      <title>Re: Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170381#M37341</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/363/bbende.html" nodeid="363"&gt;@Bryan Bende&lt;/A&gt; Thanks for the answer, it worked for me. There is just one small configuration detail I am looking for. Currently, when I merge my JSON events and export them to S3, I get the concatenated JSON events on a single line, delimited by a space. How can I get the JSON events delimited by a newline (\n) instead? Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 15:03:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170381#M37341</guid>
      <dc:creator>bigspark</dc:creator>
      <dc:date>2016-08-10T15:03:11Z</dc:date>
    </item>
    <item>
      <title>Re: Merge json events based on property</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170382#M37342</link>
      <description>&lt;P&gt;In MergeContent there is a Delimiter Strategy property. Choose "Text", which means it uses the values typed into Header, Demarcator, and Footer. The Demarcator is what gets put between each FlowFile that is merged together. You can enter a new line with Shift+Enter.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Aug 2016 20:19:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Merge-json-events-based-on-property/m-p/170382#M37342</guid>
      <dc:creator>bbende</dc:creator>
      <dc:date>2016-08-10T20:19:08Z</dc:date>
    </item>
  </channel>
</rss>

