<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to repeat a json to maintain uniqueness for PutHbaseJson processor in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189969#M78171</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; you are awesome. Thanks for the help; it resolved my problem. I wanted to build the rowkey from a combination of JSON attributes. To achieve that, I used an UpdateAttribute processor and declared the key there as a combination of resource id, metric id, and timestamp, then included this rowkey in AttributesToJSON.&lt;/P&gt;</description>
    <pubDate>Fri, 11 May 2018 18:32:31 GMT</pubDate>
    <dc:creator>contactvivekjai</dc:creator>
    <dc:date>2018-05-11T18:32:31Z</dc:date>
    <item>
      <title>How to repeat a json to maintain uniqueness for PutHbaseJson processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189967#M78169</link>
      <description>&lt;P&gt;
	I have flattened a JSON document using Jolt. Because the input contains a JSON array, the flattened keys hold lists of values. But to use the PutHBaseJSON processor I need scalar values to define the rowkey. Is there a way to repeat the whole JSON once per array element, keeping one value at a time, so that uniqueness is maintained?&lt;/P&gt;&lt;P&gt;
	Below are my input, transformation and output&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Input&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{
  "resource": {
    "id": "1234",
    "name": "Resourse_Name"
  },
  "data": [
    {
      "measurement": {
        "key": "5678",
        "value": "status"
      },
      "timestamp": 1517784040000,
      "value": 1
    },
    {
      "measurement": {
        "key": "91011",
        "value": "location"
      },
      "timestamp": 1519984070000,
      "value": 0
    }
  ]
}
&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Transformation&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;[
  {
    "operation": "shift",
    "spec": {
      "resource": {
        "id": "resource_id",
        "name": "resource_name"
      },
      //
      // Turn every entry of the "data" array into prefixed keys
      // like "measurement_key" : "5678"
      "data": {
        "*": {
          // the "&amp;amp;" in "measurement_&amp;amp;" means go up the tree 0 levels,
          // grab what is there and substitute it in
          "measurement": {
            "*": "measurement_&amp;amp;"
          },
          "timestamp": "measurement_timestamp",
          "value": "value"
        }
      }
    }
  }
]

&lt;/PRE&gt;
&lt;P&gt;&lt;STRONG&gt;Output&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{
  "resource_id" : "1234",
  "resource_name" : "Resourse_Name",
  "measurement_key" : [ "5678", "91011" ],
  "measurement_value" : [ "status", "location" ],
  "measurement_timestamp" : [ 1517784040000, 1519984070000 ],
  "value" : [ 1, 0 ]
}
&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;Expected output&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{
  "resource_id" : "1234",
  "resource_name" : "Resourse_Name",
  "measurement_key" : "5678",
  "measurement_value" : "status",
  "measurement_timestamp" : 1517784040000,
  "value" : 1
},
{
  "resource_id" : "1234",
  "resource_name" : "Resourse_Name",
  "measurement_key" : "91011",
  "measurement_value" : "location",
  "measurement_timestamp" : 1519984070000,
  "value" : 0
}
&lt;/PRE&gt;</description>
      <pubDate>Fri, 11 May 2018 06:32:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189967#M78169</guid>
      <dc:creator>contactvivekjai</dc:creator>
      <dc:date>2018-05-11T06:32:55Z</dc:date>
    </item>
    <item>
      <title>Re: How to repeat a json to maintain uniqueness for PutHbaseJson processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189968#M78170</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/73231/contactvivekjain.html" nodeid="73231" target="_blank"&gt;@vivek jain&lt;/A&gt;&lt;P&gt;Without using a JOLT transform, you can achieve the same expected output with the EvaluateJsonPath and SplitJson processors.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Example:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72741-flow.png" style="width: 1560px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18535iFA074FF5BDFE0078/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72741-flow.png" alt="72741-flow.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I used your input JSON in a GenerateFlowFile processor to test this flow.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;EvaluateJsonPath Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Change the following property value:&lt;BR /&gt;Destination
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;flowfile-attribute&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;Add new properties to the processor&lt;BR /&gt;&lt;STRONG&gt;resource_id
&lt;/STRONG&gt;&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.resource.id&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;&lt;STRONG&gt;resource_name
&lt;/STRONG&gt;&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.resource.name&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;This extracts the id and name values and assigns them to the resource_id and resource_name attributes.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72743-evaljson1.png" style="width: 1703px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18536iE40EB9FBC1B95CEB/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72743-evaljson1.png" alt="72743-evaljson1.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;SplitJson Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Because the input has a data array, split that array with the SplitJson processor.&lt;/P&gt;&lt;P&gt;Configure the processor as below:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;JsonPath Expression
&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;$.data&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;EvaluateJsonPath Configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Once the data array is split, we have 2 flowfiles, both carrying the same resource_id and resource_name attributes.&lt;/P&gt;&lt;P&gt;Change the following property value:&lt;BR /&gt;Destination
&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;flowfile-attribute&lt;/PRE&gt;Add new &lt;STRONG&gt;properties as&lt;/STRONG&gt;&lt;P&gt;measurement_key
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.measurement.key&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;measurement_timestamp
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.timestamp&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;measurement_value
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.measurement.value&lt;/PRE&gt;

&lt;/DIV&gt;&lt;P&gt;value
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;$.value&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;This extracts all the values and keeps them as attributes of the flowfile.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="72742-evaljson.png" style="width: 1716px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18537i3C842EE2560DA6DB/image-size/medium?v=v2&amp;amp;px=400" role="button" title="72742-evaljson.png" alt="72742-evaljson.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;AttributesToJSON processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Use this processor to prepare the required message.&lt;/P&gt;&lt;P&gt;Attributes List
&lt;/P&gt;&lt;PRE&gt;resource_id,resource_name,measurement_key,measurement_value,measurement_timestamp,value&lt;/PRE&gt;&lt;P&gt;This yields 2 flowfiles that you can feed to the PutHBaseJSON processor, which expects one JSON message at a time. Use a unique value (a combination of attributes, ${UUID()}, etc.) as the rowkey so that you do not overwrite existing data in HBase.&lt;BR /&gt;If you want to merge them into one flowfile, use the MergeContent processor with Defragment as the merge strategy and build a valid JSON array of the two messages, so that you can use the PutHBaseRecord processor to process chunks of messages at a time.&lt;/P&gt;&lt;P&gt;I have attached my sample flow.xml; save and upload it, then change it as needed.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/72744-flatten-json-191073.xml" target="_blank"&gt;flatten-json-191073.xml&lt;/A&gt;&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;If the answer addressed your question, &lt;STRONG&gt;click the Accept button below to accept it.&lt;/STRONG&gt; That helps community users find solutions to this kind of issue quickly.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 07:40:26 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189968#M78170</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T07:40:26Z</dc:date>
    </item>
    <item>
      <title>Re: How to repeat a json to maintain uniqueness for PutHbaseJson processor</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189969#M78171</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; you are awesome. Thanks for the help; it resolved my problem. I wanted to build the rowkey from a combination of JSON attributes. To achieve that, I used an UpdateAttribute processor and declared the key there as a combination of resource id, metric id, and timestamp, then included this rowkey in AttributesToJSON.&lt;/P&gt;</description>
      <pubDate>Fri, 11 May 2018 18:32:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-repeat-a-json-to-maintain-uniqueness-for-PutHbaseJson/m-p/189969#M78171</guid>
      <dc:creator>contactvivekjai</dc:creator>
      <dc:date>2018-05-11T18:32:31Z</dc:date>
    </item>
  </channel>
</rss>

