Created on 06-12-2020 02:00 AM - edited 06-12-2020 02:08 AM
Hi.
I'm using NiFi 1.11.2.
I have a set of JSON files that contain a base64-encoded field (the JsonPath to this field is $.data.content). Decoding the field works as expected, and I'm able to write the decoded content to a file in an S3 bucket. What I would like to do instead is replace the encoded value of this field with the decoded value, rather than writing the decoded value to a separate file, and then write the updated JSON file to a new S3 bucket.
I'm thinking of taking the output from RouteOnAttribute and ConvertRecord in the figure below and perhaps using UpdateRecord or ReplaceText, but I'm not sure how to do this.
What is the best way to "re-connect" the data flow from these two processors?
Any tips?
Thanks
bjornmy
Created 07-07-2020 05:36 AM
Just a short update.
I solved this by using the ExecuteGroovyScript processor with a short Groovy script that takes care of the transformation.
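The Groovy script itself was not posted. As a rough illustration of the transformation described in this thread (decode $.data.content, convert the XML to JSON, put the result back in place of the encoded value), here is a standalone sketch in Python rather than Groovy. It runs outside NiFi; the function names and the tiny sample document are hypothetical, and unlike ConvertRecord it does no type inference, so all leaf values stay strings. It also nests the converted document as a JSON object, which is an assumption about the desired result.

```python
import base64
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an XML element to a dict (if it has children) or to its text."""
    children = list(elem)
    if not children:
        return elem.text
    result = {}
    for child in children:
        # Drop the "{namespace}" prefix that ElementTree puts on tag names.
        name = child.tag.split("}", 1)[-1]
        value = element_to_value(child)
        if name in result:
            # Repeated tags are collected into a JSON array.
            if not isinstance(result[name], list):
                result[name] = [result[name]]
            result[name].append(value)
        else:
            result[name] = value
    return result


def replace_encoded_content(document_text):
    """Return the JSON document with data.content decoded and converted."""
    document = json.loads(document_text)
    xml_bytes = base64.b64decode(document["data"]["content"])
    root = ET.fromstring(xml_bytes)
    # Everything outside data.content is left untouched.
    document["data"]["content"] = element_to_value(root)
    return json.dumps(document, indent=2)


# Hypothetical miniature of the documents discussed in this thread.
sample_xml = b'<a xmlns="urn:x"><b>1</b><c><d>x</d></c><e>1</e><e>2</e></a>'
sample = json.dumps({
    "header": {"id": 1},
    "data": {"content": base64.b64encode(sample_xml).decode("ascii")},
})
print(replace_encoded_content(sample))
```

Inside NiFi the same steps would have to read from and write to the flowfile content, which this sketch does not attempt to show.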
Created 06-23-2020 03:28 AM
Any input would be appreciated, guys.
The output from ConvertRecord is a JSON document, and I would like to replace the value of $.data.content in the original file with this JSON document.
How can I do this?
Created 06-23-2020 05:02 AM
@bjornmy The solution you are looking for is to use UpdateAttribute to operate on the attribute you want to modify, using the NiFi Expression Language functions base64Encode/base64Decode. This operates on a flowfile attribute, in contrast to the Base64EncodeContent processor, which acts on the flowfile content.
Assuming the encoded value at $.data.content has been extracted to an attribute named content by EvaluateJsonPath, the UpdateAttribute value for content will look like:
${content:base64Decode()}
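To see what the decode step yields for this data, here is a small Python sketch run outside NiFi. It decodes only the first 76 characters of the data.content value from the test file posted later in this thread; the variable names are hypothetical.

```python
import base64

# First 76 characters of the data.content value from the thread's test file.
encoded_prefix = (
    "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgi"
    "IHN0YW5kYWxvbmU9InllcyI/Pgo8"
)

# Base64 decodes to bytes; the XML declares UTF-8, so decode the bytes as UTF-8.
decoded = base64.b64decode(encoded_prefix).decode("utf-8")
print(decoded)
```

The decoded text starts with an XML declaration, which confirms that the attribute holds an XML document after decoding, not JSON.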
You may need to evaluate all the attributes you need to use. Once you have the attribute(s) formatted the way you want, use AttributesToJSON (with Destination set to flowfile-content) to rebuild the JSON object, so that it becomes the flowfile content sent downstream to the final S3 bucket.
I am sorry I can't be more specific on the last parts, as I did not create a test sample, and I cannot see exactly what you are doing with ReplaceText->Base64EncodeContent->ConvertRecord.
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
Steven @ DFHZ
Created on 06-24-2020 04:13 AM - edited 06-24-2020 05:24 AM
Thank you for your input @stevenmatison
Below you will find more info, and I have also enclosed a test json-file.
The main source here is JSON documents, and as mentioned above, the value of $.data.content in these files is a base64-encoded XML document. That is the reason for the ConvertRecord (see below). Had it not been for this, I suspect your suggestion would work fine. The flow I have made so far extracts the value from $.data.content, decodes it, converts the XML to JSON, and finally stores the resulting JSON in S3. What I would like is to replace the encoded data in $.data.content in the main document with the decoded/converted JSON document, and then store the updated version of the main file in S3.
The EvaluateJsonPath looks like this:
The ReplaceText looks like this:
The Base64EncodeContent looks like this:
and finally the ConvertRecord looks like this:
This is what a test json-file would look like. Here you can see the $.data.content encoded value I need to replace with the corresponding decoded/converted json-value, before writing the updated content to S3:
{
"header": {
"dokumentidentifikator": null,
"dokumentidentifikatorV2": "dcff985b-c652-4085-b8f1-45a2f4b6d150",
"revisjonsnummer": 1,
"dokumentnavn": "Engangsavgiftfastsettelse:1122334455:44BIL1:2017-10-20",
"dokumenttype": "SKATTEMELDING_ENGANGSAVGIFT",
"dokumenttilstand": "OPPRETTET",
"gyldig": true,
"gjelderInntektsaar": 2017,
"gjelderPeriode": "2017_10",
"gjelderPart": {
"partsnummer": 5544332211,
"identifiseringstype": "MASKINELL",
"identifikator": null
},
"opphavspart": {
"partsnummer": 5544332211,
"identifikator": null
},
"kildereferanse": {
"kildesystem": "ENGANGSAVGIFTFASTSETTELSE",
"gruppe": "",
"referanse": "aef147fb-8ce8-43ef-833b-7aa3bac1ece0",
"tidspunkt": "2018-01-16T13:28:02.49+01:00"
}
},
"data": {
"metadata": {
"format": "motorvogn:motorvognavgift:v1",
"bytes": 4420,
"mimeType": "application/xml",
"sha1": "c0AowOsTdNdo6VufeSsZqTphc0Y="
},
"content": "PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiIHN0YW5kYWxvbmU9InllcyI/Pgo8bW90b3J2b2duYXZnaWZ0IHhtbG5zPSJza2U6ZmFzdHNldHRpbmc6bW90b3J2b2duOm1vdG9ydm9nbmF2Z2lmdDp2MSI+CiAgICA8YXZnaWZ0c2xpbmplPgogICAgICAgIDxhdmdpZnRzYmVsb2VwPjU0Mjg5Ni4wMDwvYXZnaWZ0c2JlbG9lcD4KICAgICAgICA8YXZnaWZ0c29wcGx5c25pbmc+CiAgICAgICAgICAgIDxzYWVyYXZnaWZ0VHlwZWtvZGU+QkI8L3NhZXJhdmdpZnRUeXBla29kZT4KICAgICAgICAgICAgPHNhZXJhdmdpZnRHcnVwcGVrb2RlPlg8L3NhZXJhdmdpZnRHcnVwcGVrb2RlPgogICAgICAgIDwvYXZnaWZ0c29wcGx5c25pbmc+CiAgICAgICAgPGF2Z2lmdHNkYXRvPjIwMTctMTAtMjA8L2F2Z2lmdHNkYXRvPgogICAgPC9hdmdpZnRzbGluamU+CiAgICA8YmV0YWxpbmdzaW5mb3JtYXNqb24+CiAgICAgICAgPGtpZG51bW1lcj4xMDEwMTAxMDEwMTA8L2tpZG51bW1lcj4KICAgICAgICA8Zm9yZmFsbHNkYXRvPjIwMTctMTAtMjA8L2ZvcmZhbGxzZGF0bz4KICAgICAgICA8ZmFrdHVyYWRhdG8+MjAxNy0xMC0yMDwvZmFrdHVyYWRhdG8+CiAgICAgICAgPHRvdGFsQXZnaWZ0c2JlbG9lcD41NDI4OTYuMDA8L3RvdGFsQXZnaWZ0c2JlbG9lcD4KICAgIDwvYmV0YWxpbmdzaW5mb3JtYXNqb24+CiAgICA8bW90b3J2b2duYXZnaWZ0c3R5cGU+ZW5nYW5nc2F2Z2lmdDwvbW90b3J2b2duYXZnaWZ0c3R5cGU+CiAgICA8dGlkc3N0ZW1wZWw+MjAxOC0wMS0xNiswMTowMDwvdGlkc3N0ZW1wZWw+CiAgICA8Z3J1bm5sYWdGb3JNb3RvcnZvZ25hdmdpZnQ+CiAgICAgICAgPGtqb2VyaW5nZW5zQXJ0PjEwPC9ram9lcmluZ2Vuc0FydD4KICAgICAgICA8a2pvZXJldG9leT4KICAgICAgICAgICAgPGVpZXJza2FwUmVnaXN0cmVydD4yMDE3LTEwLTIwPC9laWVyc2thcFJlZ2lzdHJlcnQ+CiAgICAgICAgICAgIDxmb2Vyc3RlUmVnaXN0cmVyaW5nc2Fhcj4yMDE3PC9mb2Vyc3RlUmVnaXN0cmVyaW5nc2Fhcj4KICAgICAgICAgICAgPGZvZXJzdGVSZWdpc3RyZXJpbmdzZGF0b0lOb3JnZT4yMDE3LTEwLTIwPC9mb2Vyc3RlUmVnaXN0cmVyaW5nc2RhdG9JTm9yZ2U+CiAgICAgICAgICAgIDxram9lcmV0b2V5Z3J1cHBlPjEwMTwva2pvZXJldG9leWdydXBwZT4KICAgICAgICAgICAgPGxlbmdkZT4zOTY0PC9sZW5nZGU+CiAgICAgICAgICAgIDxtb3RvcmVmZmVrdD45NjwvbW90b3JlZmZla3Q+CiAgICAgICAgICAgIDxzbGFndm9sdW0+MTQzPC9zbGFndm9sdW0+CiAgICAgICAgICAgIDxkcml2c3RvZmY+QkVOU0lOPC9kcml2c3RvZmY+CiAgICAgICAgICAgIDxlZ2VudmVrdD4xNTE5PC9lZ2VudmVrdD4KICAgICAgICAgICAgPGVpZXI+CiAgICAgICAgICAgICAgICA8Zm9lZHNlbHNFbGxlckRudW1tZXI+MTEyMjMzNDQ1NTwvZm9lZHNlbHNFbGxlckRudW1tZXI+CiAgICAgICAgICAgICAgICA8cGFydHNudW1tZXI+NTU0NDMzMjIx
MTwvcGFydHNudW1tZXI+CiAgICAgICAgICAgICAgICA8bmF2bj5LTEFSQSBLVTwvbmF2bj4KICAgICAgICAgICAgPC9laWVyPgogICAgICAgICAgICA8dGlsbGF0dFRvdGFsdmVrdD4yMTY0PC90aWxsYXR0VG90YWx2ZWt0PgogICAgICAgICAgICA8aHlicmlkPm5laTwvaHlicmlkPgogICAgICAgICAgICA8Y28ydXRzbGlwcD4yNjg8L2NvMnV0c2xpcHA+CiAgICAgICAgICAgIDxub3h1dHNsaXBwPjU5LjQ8L25veHV0c2xpcHA+CiAgICAgICAgICAgIDxram9lcmV0b2V5aWRlbnRpZmlrYXRvcj4KICAgICAgICAgICAgICAgIDxram9lcmV0b2V5VW5pa0lkZW50aWZpa2F0b3I+QUJDREVGR0hJSjwva2pvZXJldG9leVVuaWtJZGVudGlmaWthdG9yPgogICAgICAgICAgICAgICAgPGtqZW5uZW1lcmtlPjQ0QklMMTwva2plbm5lbWVya2U+CiAgICAgICAgICAgICAgICA8dW5kZXJzdGVsbHNudW1tZXI+VU5ERVJTVEVMTDQ0PC91bmRlcnN0ZWxsc251bW1lcj4KICAgICAgICAgICAgPC9ram9lcmV0b2V5aWRlbnRpZmlrYXRvcj4KICAgICAgICA8L2tqb2VyZXRvZXk+CiAgICA8L2dydW5ubGFnRm9yTW90b3J2b2duYXZnaWZ0PgogICAgPGF2Z2lmdHNwbGlrdGlnPgogICAgICAgIDxmb2Vkc2Vsc0VsbGVyRG51bW1lcj4xMTIyMzM0NDU1PC9mb2Vkc2Vsc0VsbGVyRG51bW1lcj4KICAgICAgICA8cGFydHNudW1tZXI+NTU0NDMzMjIxMTwvcGFydHNudW1tZXI+CiAgICA8L2F2Z2lmdHNwbGlrdGlnPgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5DbzI8L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjQ3NTMzNy4yMDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5FZ2VudmVrdDwva29tcG9uZW50PgogICAgICAgIDxiZWxvZXA+NjA5NDUuNjQ8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+TW90b3JlZmZla3Q8L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjAuMDA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+Tk94PC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD40MjEzLjI0PC9iZWxvZXA+CiAgICA8L2F2Z2lmdHNrb21wb25lbnQ+CiAgICA8YXZnaWZ0c2tvbXBvbmVudD4KICAgICAgICA8a29tcG9uZW50PlNsYWd2b2x1bTwva29tcG9uZW50PgogICAgICAgIDxiZWxvZXA+MC4wMDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5DbzIgRnJhdHJla2s8L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjAuMDA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+RWdlbnZla3QgRnJh
dHJla2s8L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjAuMDA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+TW90b3JlZmZla3QgRnJhdHJla2s8L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjAuMDA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+Tk94IEZyYXRyZWtrPC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD4wLjAwPC9iZWxvZXA+CiAgICA8L2F2Z2lmdHNrb21wb25lbnQ+CiAgICA8YXZnaWZ0c2tvbXBvbmVudD4KICAgICAgICA8a29tcG9uZW50PlNsYWd2b2x1bSBGcmF0cmVrazwva29tcG9uZW50PgogICAgICAgIDxiZWxvZXA+MC4wMDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5DbzIgU3VtPC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD40NzUzMzcuMjA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+RWdlbnZla3QgU3VtPC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD42MDk0NS42NDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5Nb3RvcmVmZmVrdCBTdW08L2tvbXBvbmVudD4KICAgICAgICA8YmVsb2VwPjAuMDA8L2JlbG9lcD4KICAgIDwvYXZnaWZ0c2tvbXBvbmVudD4KICAgIDxhdmdpZnRza29tcG9uZW50PgogICAgICAgIDxrb21wb25lbnQ+Tk94IFN1bTwva29tcG9uZW50PgogICAgICAgIDxiZWxvZXA+NDIxMy4yNDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGF2Z2lmdHNrb21wb25lbnQ+CiAgICAgICAgPGtvbXBvbmVudD5TbGFndm9sdW0gU3VtPC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD4wLjAwPC9iZWxvZXA+CiAgICA8L2F2Z2lmdHNrb21wb25lbnQ+CiAgICA8YXZnaWZ0c2tvbXBvbmVudD4KICAgICAgICA8a29tcG9uZW50PlZyYWtwYW50PC9rb21wb25lbnQ+CiAgICAgICAgPGJlbG9lcD4yNDAwPC9iZWxvZXA+CiAgICA8L2F2Z2lmdHNrb21wb25lbnQ+CiAgICA8YXZnaWZ0c2tvbXBvbmVudD4KICAgICAgICA8a29tcG9uZW50PkJydWtzZnJhZHJhZyAwJTwva29tcG9uZW50PgogICAgICAgIDxiZWxvZXA+MC4wMDwvYmVsb2VwPgogICAgPC9hdmdpZnRza29tcG9uZW50PgogICAgPGZvcmhhYW5kc2JlcmVnbmluZz5mYWxzZTwvZm9yaGFhbmRzYmVyZWduaW5nPgo8L21vdG9ydm9nbmF2Z2lmdD4K"
},
"extension": null,
"skjemaversjon": "v3_0"
}
Thanks,
bjornmy
Created on 06-24-2020 05:11 AM - edited 06-24-2020 05:25 AM
@bjornmy thanks for the updated info.
It appears that you are using ReplaceText to replace the content of the flowfile with just ${innhold}. This overwrites the existing flowfile content (the entire original JSON object) with just the encoded data.content value, which you then decode and convert to JSON.
What I have suggested is that you do this:
EvaluateJsonPath -> UpdateAttribute (decode here) -> AttributesToJSON (Destination: flowfile-content) -> PutS3Object
This will eliminate the ReplaceText and Base64EncodeContent processors and still give you a JSON object whose content field holds the decoded XML. Next, you can decide to repeat the same process for the rest of the original JSON object. For example, if you want the header or metadata from the original JSON in the S3 object, add these values to EvaluateJsonPath to get the data into attributes, then have AttributesToJSON output those original attributes together with the decoded attribute. If you do this for all the data values, you end up with the same JSON object, with the decoded XML in place of the encoded value.
The way you have it now and the way I suggest are just two ways to do this. Neither is right or wrong; mine just gives you the ability to break down the entire object and rebuild the original JSON with your modification.
You may also want to look into JoltTransformJSON and/or UpdateRecord. I do not have much experience with Jolt, but you may find it can accomplish a similar process and act only on data.content, rather than parsing a large XML document into an attribute.
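The attribute-based flow suggested above can be mimicked in plain Python to show what it produces. This is only a sketch of the data movement, not NiFi code: the dictionary standing in for flowfile attributes, the miniature document, and the assumption that AttributesToJSON is configured with an attribute list naming only content are all hypothetical.

```python
import base64
import json

# Hypothetical miniature of the original document.
original = {
    "header": {"revisjonsnummer": 1},
    "data": {
        "metadata": {"mimeType": "application/xml"},
        "content": base64.b64encode(b"<a><b>1</b></a>").decode("ascii"),
    },
}

# EvaluateJsonPath: copy $.data.content into a flowfile attribute.
attributes = {"innhold": original["data"]["content"]}

# UpdateAttribute: content = ${innhold:base64Decode()}
attributes["content"] = base64.b64decode(attributes["innhold"]).decode("utf-8")

# AttributesToJSON: build a flat JSON object from the listed attributes only.
listed = ["content"]
rebuilt = {name: attributes[name] for name in listed}
print(json.dumps(rebuilt))  # {"content": "<a><b>1</b></a>"}
```

Only the listed attributes reach the output, as a flat object, so header and data.metadata are lost unless they are also extracted and listed, and the decoded value is still XML text rather than JSON.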
Created on 06-24-2020 06:29 AM - edited 06-24-2020 06:31 AM
Thanks for the quick response @stevenmatison .
I am not getting the expected result (yet): the XML is still in my resulting file.
I left the EvaluateJson as above.
In the UpdateAttribute I set content to:
${innhold:base64Decode()}
and in the AttributesToJson I did this:
The result is a JSON file with one attribute, "content", and this attribute contains the whole XML file.
Where did I go wrong here?
So the flow now looks like this:
Thanks,
bjornmy
Created 06-24-2020 06:40 AM
That is what I would have expected based on the explanation.
What was your expected outcome? That would help me make a further suggestion.
Created on 06-24-2020 06:47 AM - edited 06-24-2020 06:53 AM
Ah, ok.
My aim is to convert the XML to JSON and insert the converted JSON back into the original main JSON document in place of the encoded value for the content.
If you take a look at the test JSON file in my previous comment, you will see the base64-encoded value there.
I am trying to end up with a similar document, but instead of the base64-encoded value I want to insert the JSON equivalent of the XML document.
In other words, I would like to end up with this:
{
"header": {
"dokumentidentifikator": null,
"dokumentidentifikatorV2": "dcff985b-c652-4085-b8f1-45a2f4b6d150",
"revisjonsnummer": 1,
"dokumentnavn": "Engangsavgiftfastsettelse:1122334455:44BIL1:2017-10-20",
"dokumenttype": "SKATTEMELDING_ENGANGSAVGIFT",
"dokumenttilstand": "OPPRETTET",
"gyldig": true,
"gjelderInntektsaar": 2017,
"gjelderPeriode": "2017_10",
"gjelderPart": {
"partsnummer": 5544332211,
"identifiseringstype": "MASKINELL",
"identifikator": null
},
"opphavspart": {
"partsnummer": 5544332211,
"identifikator": null
},
"kildereferanse": {
"kildesystem": "ENGANGSAVGIFTFASTSETTELSE",
"gruppe": "",
"referanse": "aef147fb-8ce8-43ef-833b-7aa3bac1ece0",
"tidspunkt": "2018-01-16T13:28:02.49+01:00"
}
},
"data": {
"metadata": {
"format": "motorvogn:motorvognavgift:v1",
"bytes": 4420,
"mimeType": "application/xml",
"sha1": "c0AowOsTdNdo6VufeSsZqTphc0Y="
},
"content": "{
"avgiftslinje": {
"avgiftsbeloep": 542896.0,
"avgiftsopplysning": {
"saeravgiftTypekode": "BB",
"saeravgiftGruppekode": "X"
},
"avgiftsdato": "2017-10-20"
},
"betalingsinformasjon": {
"kidnummer": 101010101010,
"forfallsdato": "2017-10-20",
"fakturadato": "2017-10-20",
"totalAvgiftsbeloep": 542896.0
},
"motorvognavgiftstype": "engangsavgift",
"tidsstempel": "2018-01-16+01:00",
"grunnlagForMotorvognavgift": {
"kjoeringensArt": 10,
"kjoeretoey": {
"eierskapRegistrert": "2017-10-20",
"foersteRegistreringsaar": 2017,
"foersteRegistreringsdatoINorge": "2017-10-20",
"kjoeretoeygruppe": 101,
"lengde": 3964,
"motoreffekt": 96,
"slagvolum": 143,
"drivstoff": "BENSIN",
"egenvekt": 1519,
"eier": {
"foedselsEllerDnummer": 1122334455,
"partsnummer": 5544332211,
"navn": "KLARA KU"
},
"tillattTotalvekt": 2164,
"hybrid": "nei",
"co2utslipp": 268,
"noxutslipp": 59.4,
"kjoeretoeyidentifikator": {
"kjoeretoeyUnikIdentifikator": "ABCDEFGHIJ",
"kjennemerke": "44BIL1",
"understellsnummer": "UNDERSTELL44"
}
}
},
"avgiftspliktig": {
"foedselsEllerDnummer": 1122334455,
"partsnummer": 5544332211
},
"avgiftskomponent": [
{
"komponent": "Co2",
"beloep": 475337.2
},
{
"komponent": "Egenvekt",
"beloep": 60945.64
},
{
"komponent": "Motoreffekt",
"beloep": 0.0
},
{
"komponent": "NOx",
"beloep": 4213.24
},
{
"komponent": "Slagvolum",
"beloep": 0.0
},
{
"komponent": "Co2 Fratrekk",
"beloep": 0.0
},
{
"komponent": "Egenvekt Fratrekk",
"beloep": 0.0
},
{
"komponent": "Motoreffekt Fratrekk",
"beloep": 0.0
},
{
"komponent": "NOx Fratrekk",
"beloep": 0.0
},
{
"komponent": "Slagvolum Fratrekk",
"beloep": 0.0
},
{
"komponent": "Co2 Sum",
"beloep": 475337.2
},
{
"komponent": "Egenvekt Sum",
"beloep": 60945.64
},
{
"komponent": "Motoreffekt Sum",
"beloep": 0.0
},
{
"komponent": "NOx Sum",
"beloep": 4213.24
},
{
"komponent": "Slagvolum Sum",
"beloep": 0.0
},
{
"komponent": "Vrakpant",
"beloep": 2400
},
{
"komponent": "Bruksfradrag 0%",
"beloep": 0.0
}
],
"forhaandsberegning": false
}"
},
"extension": null,
"skjemaversjon": "v3_0"
}
Created 06-24-2020 06:55 AM
@bjornmy OK, I understand. In this case, you have to extract all of the JSON object's values to attributes and then convert them back to JSON, not just content. I was using only that one as an example, because in the original flow you were only sending content.
I think it's best at this point to investigate the other two methods (JoltTransformJSON or UpdateRecord), as they should require less work handling all the other values.
Created 06-24-2020 07:04 AM
Ok, thanks. I will start looking at these.
My first thought when I started this was to isolate the decoding/conversion. Then I was hoping to find a solution where I could simply replace the "content" value from the original file with the decoded/converted value. But I never got further than decoding/converting.