<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Removing text before and after [ ] characters in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194210#M156270</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/71153/hookokching.html" nodeid="71153" target="_blank"&gt;@Kok Ching Hoo&lt;/A&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;For Method1:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;In the SplitJson processor, use a JsonPath Expression like&lt;/P&gt;&lt;PRE&gt;$.['XMLFile_2234.DAILY'].dataset_12232.entry&lt;/PRE&gt;&lt;P&gt;The bracket notation quotes the key, so the period in XMLFile_2234.DAILY is read as part of the name instead of as a path separator.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;For Method2:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Increase the values of the properties below in your ExtractText processor to suit your flowfile size and capture group length.&lt;/P&gt;&lt;P&gt;Maximum Buffer Size
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;1 MB&lt;/PRE&gt;
&lt;/DIV&gt;&lt;P&gt;Maximum Capture Group Length
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;1024&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64683-extracttext.png" style="width: 1459px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18120iB8C2A70919D5050C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64683-extracttext.png" alt="64683-extracttext.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 18 Aug 2019 06:50:42 GMT</pubDate>
    <dc:creator>Shu_ashu</dc:creator>
    <dc:date>2019-08-18T06:50:42Z</dc:date>
    <item>
      <title>Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194206#M156266</link>
      <description>&lt;P&gt;I have a JSON file and would like to keep only what is within [ ] so that I can send the flowfile into the SplitJson processor and subsequently into Elasticsearch. A sample of the file content is shown below.&lt;/P&gt;&lt;PRE&gt;{
  "XMLfile_2234": {
    "xsi:schemaLocation": "http://xml.mscibarra.com/random.xsd",
    "dataset_12232": {
      "entry": [
        {
          "record_date": "2017-03-01",
          "country": "USA",
          "funds": "100"
        },
        {
          "record_date": "2018-03-01",
          "country": "Chile",
          "funds": "10000"
        }
      ]
    }
  }
}&lt;/PRE&gt;&lt;P&gt;How do I remove all text and characters before and after the [ ]? I want to keep the square brackets too. I'm a complete noob with regard to regex and ReplaceText.&lt;/P&gt;&lt;P&gt;Thanks for the help.&lt;/P&gt;</description>
      <pubDate>Sat, 17 Mar 2018 13:55:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194206#M156266</guid>
      <dc:creator>hookokching</dc:creator>
      <dc:date>2018-03-17T13:55:10Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194207#M156267</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/71153/hookokching.html" nodeid="71153" target="_blank"&gt;@Kok Ching Hoo&lt;/A&gt;&lt;/P&gt;&lt;P&gt;If you want to split the JSON content on the entry array, you don't need regex at all.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Method1:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Use the SplitJson processor with the &lt;STRONG&gt;configs&lt;/STRONG&gt; below:-&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64663-splitjson.png" style="width: 1659px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18127iE7BAA082C7F303AF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64663-splitjson.png" alt="64663-splitjson.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;JsonPath Expression
&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;$.XMLfile_2234.dataset_12232.entry&lt;/PRE&gt;&lt;P&gt;Then &lt;STRONG&gt;use the split relation&lt;/STRONG&gt; of the SplitJson processor to connect to the next processor.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input Json content:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;The input flowfile content fed into the SplitJson processor:&lt;/P&gt;&lt;PRE&gt;{"XMLfile_2234": {"xsi:schemaLocation": "http://xml.mscibarra.com/random.xsd","dataset_12232": {"entry": [{"record_date": "2017-03-01","country": "USA","funds": "100"},{"record_date": "2018-03-01","country": "Chile","funds": "10000"}]}}}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output from SplitJson processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;flowfile1:-&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{"record_date":"2017-03-01","country":"USA","funds":"100"}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;flowfile2:-&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{"record_date":"2018-03-01","country":"Chile","funds":"10000"}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;(or)&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Method2:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;You can use the &lt;STRONG&gt;ExtractText&lt;/STRONG&gt; processor to &lt;STRONG&gt;extract the entry array&lt;/STRONG&gt; and keep it as an attribute, then use the &lt;STRONG&gt;ReplaceText processor&lt;/STRONG&gt; to overwrite the &lt;STRONG&gt;existing content of the flowfile&lt;/STRONG&gt; with the new array attribute value, and then use the &lt;STRONG&gt;SplitJson processor to split the array&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;ExtractText configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Add a new property to the ExtractText processor by clicking the + sign at the top right corner, then add the property below.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;array
&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;"entry": (.*])&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64664-extracttext.png" style="width: 1824px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18128i5638B50FDB7A4C22/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64664-extracttext.png" alt="64664-extracttext.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;This extracts the whole entry array and stores it in the array attribute of the flowfile.&lt;/P&gt;&lt;P&gt;Then use the ReplaceText processor to replace the contents of the flowfile with the value of the array attribute.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;ReplaceText configs:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Replacement Value&lt;/P&gt;&lt;PRE&gt;${array}&lt;/PRE&gt;&lt;P&gt;Replacement Strategy&lt;/P&gt;&lt;PRE&gt;Always Replace&lt;/PRE&gt;&lt;P&gt;Change the above property values in the ReplaceText processor; this processor writes the array attribute value as the content of the flowfile.&lt;/P&gt;&lt;P&gt;The flowfile content is now the entry array, so we can use the SplitJson processor to split the array into individual messages.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Input flowfile content:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;{"XMLfile_2234": {"xsi:schemaLocation": "http://xml.mscibarra.com/random.xsd","dataset_12232": {"entry": [{"record_date": "2017-03-01","country": "USA","funds": "100"},{"record_date": "2018-03-01","country": "Chile","funds": "10000"}]}}}&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Output flowfile content:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;[{"record_date": "2017-03-01","country": "USA","funds": "100"},{"record_date": "2018-03-01","country": "Chile","funds": "10000"}]&lt;/PRE&gt;&lt;P&gt;As you can see, this processor has changed the flowfile content.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;SplitJson 
processor:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;JsonPath Expression
&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;$.*&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;Flow:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;PRE&gt;1.Extract text processor
2.Replace Text processor
3.SplitJson processor&lt;/PRE&gt;&lt;P&gt;Both methods produce the same output, but Method 1 is the easier way to complete this task.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:51:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194207#M156267</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T06:51:34Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194208#M156268</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929" target="_blank"&gt;@Shu&lt;/A&gt;,&lt;/P&gt;&lt;P&gt;Thanks for your prompt response. Your reply has given me new insights on how to handle my problems. The data I provided initially was very simplified, as I assumed it was just a regex question. Here's a better representation of the data. There are around 50 fields in each record and around 1,500 records in each JSON file.&lt;/P&gt;&lt;PRE&gt;{
  "XMLFile_2234.DAILY": {
    "xsi:schemaLocation": "http://xml.mscibarra.com/random.xsd",
    "dataset_12232": {
      "entry": [
        {
          "record_date": "2017-03-01",
          "code": "233432",
          "country": "USA",
          "inter_com_value": ".STRATE",
          "country_code": "US",
          "One_code": "1",
          "Two_code": "0",
          "Three_code": "1",
          "value_code": "0",
          "big_code": "1",
          "small_code": "0",
          "mid_code": "0",
          "exist_code": "1",
          "restricted_code": "0",
          "base_flag": "0",
          "emply_count": "225",
          "unadj_reference_value": "5465.546456",
          "ref_date": "2016-05-31",
          "old_date": "2013-05-31",
          "new_date": "2014-05-31",
          "value_type": "EMTE",
          "estval_old": "2321.123543",
          "estval_new": "2354.585674",
          "world_code_type": "MTEE",
          "world_code_old": "1232.163564",
          "world_code_new": "1432.67565",
          "region_code_type": "TMRQ",
          "region_code_old": "2343.476576",
          "region_code_new": "6546.678576",
          "mkt_based_adj": "76856.325425",
          "total_sale_value_weighted": "23423.565434",
          "total_sale_value_raw": "23423.453535",
          "normalised_value_one": "1000.000000",
          "normalised_value_two": "1000.000000",
          "normalised_value_three": "1000.000000",
          "moving_value_one": "98456754.363246300000000",
          "moving_value_one_nd": "98456754.363246300000000",
          "moving_value_two": "98456754.363246300000000",
          "moving_value_two_nd": "98456754.363246300000000",
          "moving_value_three": "98456754.363246300000000",
          "moving_value_three_nd": "98456754.363246300000000",
          "moving_indice_pt_one": "0.000000000000000",
          "moving_indice_pt_one_p": "0.46789870755657",
          "moving_indice_pt_two": "0.000000000000000",
          "moving_indice_pt_two_p": "0.46789870755657",
          "moving_indice_pt_three": "0.000000000000000",
          "moving_indice_pt_three_p": "0.46789870755657",
          "moving_indice_pt_four": "0.000000000000000",
          "moving_indice_pt_four_p": "0.46789870755657"
        },
        {
          "record_date": "2017-03-01",
          "code": "236453",
          "country": "VEN",
          "inter_com_value": ".STRATE",
          "country_code": "VE",
          "One_code": "1",
          "Two_code": "0",
          "Three_code": "1",
          "value_code": "0",
          "big_code": "1",
          "small_code": "0",
          "mid_code": "0",
          "exist_code": "1",
          "restricted_code": "0",
          "base_flag": "0",
          "emply_count": "244",
          "unadj_reference_value": "5465.546456",
          "ref_date": "2016-05-31",
          "old_date": "2013-05-31",
          "new_date": "2014-05-31",
          "value_type": "EMTE",
          "estval_old": "2321.123543",
          "estval_new": "2354.585674",
          "world_code_type": "MTEE",
          "world_code_old": "1232.163564",
          "world_code_new": "1432.67565",
          "region_code_type": "TMRQ",
          "region_code_old": "2343.476576",
          "region_code_new": "6546.678576",
          "mkt_based_adj": "76856.325425",
          "total_sale_value_weighted": "23423.565434",
          "total_sale_value_raw": "23423.453535",
          "normalised_value_one": "1000.000000",
          "normalised_value_two": "1000.000000",
          "normalised_value_three": "1000.000000",
          "moving_value_one": "98456754.363246300000000",
          "moving_value_one_nd": "98456754.363246300000000",
          "moving_value_two": "98456754.363246300000000",
          "moving_value_two_nd": "98456754.363246300000000",
          "moving_value_three": "98456754.363246300000000",
          "moving_value_three_nd": "98456754.363246300000000",
          "moving_indice_pt_one": "0.000000000000000",
          "moving_indice_pt_one_p": "0.46789870755657",
          "moving_indice_pt_two": "0.000000000000000",
          "moving_indice_pt_two_p": "0.46789870755657",
          "moving_indice_pt_three": "0.000000000000000",
          "moving_indice_pt_three_p": "0.46789870755657",
          "moving_indice_pt_four": "0.000000000000000",
          "moving_indice_pt_four_p": "0.46789870755657"
        }
      ]
    }
  }
}&lt;/PRE&gt;&lt;P&gt;I've tried both of your suggested methods. For Method 1, I can't construct a JsonPath expression that includes the entire string "XMLFile_2234.DAILY", mainly because of the period in the string. Also, if the header value changes from file to file, can I assume that this method may not be suitable?&lt;/P&gt;&lt;P&gt;For Method 2, the ExtractText processor does not seem to extract any value with the following.&lt;/P&gt;&lt;PRE&gt;"entry":(.*])&lt;/PRE&gt;&lt;P&gt;Instead I tried the following expression.&lt;/P&gt;&lt;PRE&gt;\[([^]]+)\]&lt;/PRE&gt;&lt;P&gt;And I got the following values in the attributes.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64678-screenshot-from-2018-03-18-10-51-55.png" style="width: 767px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18125i8FB5074B24602131/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64678-screenshot-from-2018-03-18-10-51-55.png" alt="64678-screenshot-from-2018-03-18-10-51-55.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64677-screenshot-from-2018-03-18-10-47-54.png" style="width: 764px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18126iA6360AC0A32949A1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64677-screenshot-from-2018-03-18-10-47-54.png" alt="64677-screenshot-from-2018-03-18-10-47-54.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;The values seem to be truncated and the 2nd record is not picked up. Also, if I have around 1,500 records in each JSON file that need to be split, will this method of using attributes have any limitations?&lt;/P&gt;&lt;P&gt;Thanks &lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:51:20 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194208#M156268</guid>
      <dc:creator>hookokching</dc:creator>
      <dc:date>2019-08-18T06:51:20Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194209#M156269</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/71153/hookokching.html" nodeid="71153" target="_blank"&gt;@Kok Ching Hoo&lt;/A&gt;&lt;P&gt;First things first! Treat JSON as JSON and not as plain text. Stop extracting text! &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Now let's talk about the solution!&lt;/P&gt;&lt;P&gt;You have a JSON document whose structure looks like this.&lt;/P&gt;&lt;PRE&gt;{
"XMLfile_2234":{
 "xsi:schemaLocation":"http://xml.mscibarra.com/random.xsd",
 "dataset_12232":{
  "entry":[]
  }
 }
}&lt;/PRE&gt;&lt;P&gt;You want to pick out the entry column, which is an array, and then split its individual elements into separate docs so that you can ultimately push them to Elasticsearch.&lt;/P&gt;&lt;P&gt;So here is the step-by-step solution!&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;1.&lt;/STRONG&gt; I am assuming that the values after XMLfile_ and dataset_ may differ between documents. The solution works even if they don't, but since this happens in a lot of cases, I am taking it into consideration. First of all, read the JSON document and cherry-pick only the entry column from it. How to do that? &lt;STRONG&gt;JoltTransformJSON&lt;/STRONG&gt; is the best processor in NiFi for JSON operations. The details of your JoltTransformJSON processor configuration follow.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64680-screen-shot-2018-03-18-at-21300-am.png" style="width: 1558px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18121i7F8D441C9EB84F7E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64680-screen-shot-2018-03-18-at-21300-am.png" alt="64680-screen-shot-2018-03-18-at-21300-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Your complete Jolt specification&lt;/P&gt;&lt;PRE&gt;[ { 
 "operation": "shift", 
 "spec": { 
  "XMLFile_*.DAILY": { 
   "*": { 
    "entry": "entry" 
   } 
  } 
 } 
} ]
&lt;BR /&gt;&lt;/PRE&gt;&lt;P&gt;This will give you only the entry column from your data.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2.&lt;/STRONG&gt; Now that you have just the entry column, use the &lt;STRONG&gt;SplitJSON&lt;/STRONG&gt; processor to split the entry array into individual documents. A snapshot of the SplitJSON processor configuration follows.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64679-screen-shot-2018-03-18-at-21558-am.png" style="width: 1588px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18122i3A9AC9FF626DBD48/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64679-screen-shot-2018-03-18-at-21558-am.png" alt="64679-screen-shot-2018-03-18-at-21558-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;3.&lt;/STRONG&gt; The &lt;STRONG&gt;split relation&lt;/STRONG&gt; will have your individual array elements as separate flow files. Below is a sample snapshot taken after the data you provided in your reply went through the &lt;STRONG&gt;SplitJSON&lt;/STRONG&gt; processor.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64681-screen-shot-2018-03-18-at-21834-am.png" style="width: 2982px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18123i315278AC03E42978/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64681-screen-shot-2018-03-18-at-21834-am.png" alt="64681-screen-shot-2018-03-18-at-21834-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;An individual array element in the data. 
Now a flow file.&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64682-screen-shot-2018-03-18-at-22024-am.png" style="width: 1250px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18124i6020C84363FB7081/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64682-screen-shot-2018-03-18-at-22024-am.png" alt="64682-screen-shot-2018-03-18-at-22024-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;EM&gt;Once you have these steps in your flow, you can route the data coming out of the SplitJSON processor, specifically its split relation, further as per your use case.&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Hope that helps!&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:51:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194209#M156269</guid>
      <dc:creator>RahulSoni</dc:creator>
      <dc:date>2019-08-18T06:51:07Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194210#M156270</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/71153/hookokching.html" nodeid="71153" target="_blank"&gt;@Kok Ching Hoo&lt;/A&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;For Method1:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;In the SplitJson processor, use a JsonPath Expression like&lt;/P&gt;&lt;PRE&gt;$.['XMLFile_2234.DAILY'].dataset_12232.entry&lt;/PRE&gt;&lt;P&gt;The bracket notation quotes the key, so the period in XMLFile_2234.DAILY is read as part of the name instead of as a path separator.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;U&gt;For Method2:-&lt;/U&gt;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Increase the values of the properties below in your ExtractText processor to suit your flowfile size and capture group length.&lt;/P&gt;&lt;P&gt;Maximum Buffer Size
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;1 MB&lt;/PRE&gt;
&lt;/DIV&gt;&lt;P&gt;Maximum Capture Group Length
&lt;/P&gt;&lt;DIV&gt;&lt;PRE&gt;1024&lt;/PRE&gt;&lt;/DIV&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64683-extracttext.png" style="width: 1459px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18120iB8C2A70919D5050C/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64683-extracttext.png" alt="64683-extracttext.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:50:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194210#M156270</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T06:50:42Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194211#M156271</link>
      <description>&lt;P&gt;Thanks &lt;A rel="user" href="https://community.cloudera.com/users/66220/rsoni.html" nodeid="66220"&gt;@Rahul Soni&lt;/A&gt;, the JoltTransformJson processor works for me.  &lt;/P&gt;&lt;P&gt;Also thanks to &lt;A rel="user" href="https://community.cloudera.com/users/18929/yaswanthmuppireddy.html" nodeid="18929"&gt;@Shu&lt;/A&gt; for explaining everything.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Mar 2018 17:36:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194211#M156271</guid>
      <dc:creator>hookokching</dc:creator>
      <dc:date>2018-03-18T17:36:00Z</dc:date>
    </item>
    <item>
      <title>Re: Removing text before and after [ ] characters</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194212#M156272</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/71153/hookokching.html" nodeid="71153" target="_blank"&gt;@Kok Ching Hoo&lt;/A&gt;&lt;P&gt;You &lt;STRONG&gt;don't even need the Jolt transform processor&lt;/STRONG&gt; to get only the entry array as the flowfile content.&lt;/P&gt;&lt;P&gt;We can achieve the same result more easily with the &lt;STRONG&gt;SplitJson processor&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Configure the &lt;STRONG&gt;SplitJson processor&lt;/STRONG&gt; as&lt;/P&gt;&lt;P&gt;JsonPath Expression &lt;/P&gt;&lt;PRE&gt;$.*.*.*.*&lt;/PRE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="64699-splitjson.png" style="width: 1245px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/18119i3D9E3CB6551CF08A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="64699-splitjson.png" alt="64699-splitjson.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;With the above JsonPath expression it &lt;STRONG&gt;doesn't matter&lt;/STRONG&gt; if the &lt;STRONG&gt;header value changes&lt;/STRONG&gt; or the &lt;STRONG&gt;entry array is renamed&lt;/STRONG&gt; to &lt;STRONG&gt;entry1, exit&lt;/STRONG&gt;, etc. As long as the JSON message keeps the same structure (the same dependency &lt;STRONG&gt;applies when using Jolt&lt;/STRONG&gt;), this method will work. We split the array itself and then use the split relation to connect to the next processors.&lt;/P&gt;&lt;P&gt;If you want to do this dynamically, without depending on the attribute names defined in the incoming JSON message/object, then go with this approach.&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 06:50:34 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Removing-text-before-and-after-characters/m-p/194212#M156272</guid>
      <dc:creator>Shu_ashu</dc:creator>
      <dc:date>2019-08-18T06:50:34Z</dc:date>
    </item>
  </channel>
</rss>

