Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Help with UpdateRecord or QueryRecord

avatar
Contributor

Hi to all,

hope someone can point me on the right direction.

I have something like that:

[ {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490010",
    "shipment_number" : ""
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490011",
    "shipment_number" : ""
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490012",
    "shipment_number" : ""
} ]

I need to update only a specific record with a met condition.

Speaking on a SQL language it should be something like that:

UPDATE FLOWFILE SET shipment_number = '${Shipment_Number_Attribute} ' WHERE barcode = '${Barcode_Attribute}'

Unfortunately this statement in QueryRecord doesn't work, maybe UPDATE not supported?

Moreover is there any way to achieve the same with UpdateRecord?

 

Many thanks for any help.

Appreciate.

 

2 ACCEPTED SOLUTIONS

avatar
Super Guru

You either have to select each column one by one to have the same column name you are trying to update in the case statement, or select * but create different column for the case statement and then use jolt transformation to write the value back to the original column.

View solution in original post

avatar
Super Guru

You have to create a separate case statement for each column you are trying to update similar to what  is done for the shipment number.

View solution in original post

13 REPLIES 13

avatar
Super Collaborator

Where is the barcode attribute coming from?

avatar
Contributor

Barcode is an attribute coming from each fragment...different on any fragment.

avatar

@Ray82 Yes, you can achieve this with UpdateRecord. You will need to provide record reader/writer with schema of your upstream and downstream.  Then in UpdateRecord you explicitly add properties (+) for each record value you want to update versus using a SQL statement like QueryRecord.  

 

Here are some useful community posts on this topic:

 

https://community.cloudera.com/t5/Community-Articles/Update-the-Contents-of-FlowFile-by-using-Update...

 

https://community.cloudera.com/t5/Support-Questions/NiFi-UpdateRecord-processor-is-not-updating-JSON...

 

avatar
Contributor

Sure I Kwon this....the question is how to add a where clause in Updare Record or QueryRecord .

 

In QueryRecord I ve tried an Update statement with no luck.

 

Will giving me always an error.

 

 

avatar
Super Guru

Hi,

 

I dont think what you are trying to do can be achieved with UpdateRecord, the reason being that the UpdateRecord you have to define a record path where the specific record\attribute have to be static and cannot take any variable like a flowfile attribute. For example lets say you want to update record with barcode "C337287V28490011" , then record path property would be defined as the following which would work :

 

/*[/barcode='C337287V28490012']/shipment_number

 

 

However, you cannot define the record path like this:

 

/*[/barcode='${Barcode_Attribute}']/shipment_number

 

 

The only way I can think of where you can accomplish this in one processor is to use the JoltTransformationJSON processor. Jolt is a transformation language for json and its very powerful. In this processor you set the "Jolt Specification" property with the following spec:

 

 

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "barcode": {
          "${Barcode_Attribute}": {
            "$": "[&3].barcode",
            "@(2,date)": "[&3].date",
            "#${Shipment_Number_Attribute}": "[&3].shipment_number",
            "@(2,reference)": "[&3].reference"
          },
          "*": {
            "$": "[&3].barcode",
            "@(2,date)": "[&3].date",
            "@(2,shipment_number)": "[&3].shipment_number",
            "@(2,reference)": "[&3].reference"
          }
        }
      }
    }
  }
]

 

 

Notice how you can use Expression Language in the Jolt Spec where you can specify flowfile attributes as well. For more information on Jolt Spec you can refer to the following:

https://jolt-demo.appspot.com/#inception

https://intercom.help/godigibee/en/articles/4044359-transformer-getting-to-know-jolt

 

If that helps please accept solution,

Thanks

 

 

 

avatar
Super Guru

Update:

Actually, I just found out that you can achieve what you need using QueryRecord processor by defining  the query Relation Property with the following query:

 

select reference,barcode,"date",
case when barcode = '${Barcode_Attribute}' then '${Shipment_Number_Attribute}'
else shipment_number end as shipment_number
from flowfile

The Json Jolt transformation above still works but its more complicated and probably less efficient.

Hope that helps

 

 

avatar
Contributor

Well, i'm not sure this query is exactly what i'm searching for.

Speaking in a human way this is what i need to do:

Search the element with barcode=${barcode_attribute} and if found update corresponding "shipment_number" value with ${shipment_number_attribute}, otherwise leave it as it is.

So this:

 

[ {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490010",
    "shipment_number" : ""
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490011",
    "shipment_number" : ""
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490012",
    "shipment_number" : ""
} ]

would becomes as below:

 

[ {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490010",
    "shipment_number" : "0000023/SH"
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490011",
    "shipment_number" : ""
}, {
    "reference" : "PB",
    "date" : "20230613",
    "barcode" : "C337287V28490012",
    "shipment_number" : ""
} ]

Trying your query seems that does not update shipment_number field.

Hope this can explain better the need.

 

Many Thanks

avatar
Contributor

@Ray82 - I tested out @SAMSAL solution quickly and it worked for me. 

 

Make sure in QueryRecord you are referencing the right attribute names.

avatar
Super Guru

Hi @Ray82 

Its hard to predict how this is going to work without understanding the full data flow you have, so I had to come up with simple flow to test the query which worked for me as expected. Please see the my flow below and each processor configuration to get the expected result:

 

Data Flow:

 

SAMSAL_0-1686923883329.png

 

1) Generate Input Josn: GenerateFlowFile processor to produce the Input json as you provided and provide the query attributes.

 

SAMSAL_1-1686924018255.png

 

2) QueryRecord: Run the Query provided above to generate expected result:

 

SAMSAL_2-1686924189484.png

 

Notice that you need to connect the QueryRelationship "QueryRel" to downstream processor to get the expected output and not  the original relationship:

 

SAMSAL_3-1686924439882.png

 

After running the DataFlow above and getting the output in the QueryRel  relationship queue , the output looks like as follows:

 

SAMSAL_4-1686924553215.png

 

Notice to get the out put as an array the JsonRecordWrite in the QueryRecord RecordWriter is configured as the following:

 

SAMSAL_5-1686924659704.png

Like I said in the beginning everything depends at the end on your overall dataflow and how you are getting the data , attributes and in what format. The details above hopefully can guide you in case you are missing something or figure out what is making this not working in your case. Also be aware that Im using version 1.20.0 so also depending what version you are using the behavior can be different.

 

Hope that helps.