Getting a referencing error on Execute Script.

Contributor

My goal is to pass the incoming JSON data to my Python script and convert its date fields to UNIX time. However, the script fails with an error saying that "json_data" is referenced before assignment at line 63. This is at the session.write call near the bottom.


The processors being used are ConvertRecord -> ExecuteScript.


The message being passed looks like this:

json_data =

 

{"Year":2018,"DOY":12,"Hour":20,"HGI_Lat_of_the_S/C":7.0,"IMF_B_scalar_nT":1.03,"SW_Plasma_Speed_KMs":441.0}

 

and should come out like this:

 

{"HGI_Lat_of_the_S/C": 7.0, "IMF_B_scalar_nT": 1.03, "SW_Plasma_Speed_KMs": 441.0, "Unix_time": 1542074400.0}

 

Error message:

 

ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] Failed to process session due to javax.script.ScriptException: UnboundLocalError: local variable 'json_data' referenced before assignment in <script> at line number 63: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: UnboundLocalError: local variable 'json_data' referenced before assignment in <script> at line number 63

 

Python code:

 

import json
from datetime import datetime
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback

class PyStreamCallback(StreamCallback):
    def __init__(self):
        self.year=[]
        self.day=[]
        self.hour=[]
        self.month=[]
        self.key1_year = 'Year'
        self.key4_month = 'month'
        self.key2_day = 'DOY'
        self.key3_hour = 'Hour'

    # Write bytes that are utf-8 encoded chine word.
    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        #obj = json.loads(text)
        #json_data = json.dumps(text)
        json_data = json.loads(json_data)
        try:
            self.year.append(json_data[self.key1_year])
        except KeyError:
            self.year.append(2000)
        try:
            self.month.append(json_data[self.key4_month])
        except KeyError:
            self.month.append(11)
        try:
            self.day.append(json_data[self.key2_day])
        except KeyError:
            self.day.append(11)
        try:
            self.hour.append(json_data[self.key3_hour])
        except KeyError:
            self.hour.append(11)

        new = year+month+day+hour
        # Considering date is in mm/dd/yyyy format
        #converting the appendd list to strings instead of ints
        b=[str(x) for x in new]
        #joining all the data without adding
        b = '/'.join(b)
        #convert to unix
        dt_object2 = datetime.strptime(b, "%Y/%m/%d/%H") 
        timestamp = datetime.timestamp(dt_object2)
        json_data.update({'Unix_time':timestamp})

        #deleting unwanted data from the dict
        for func in [self.key1_year,self.key4_month,self.key2_day,self.key3_hour]:
            try: 
                del json_data[func]
            except KeyError as e:
                pass
        outputStream.write(bytearray(json.dumps(json_data).encode('utf-8')))

flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, PyStreamCallback())
    session.transfer(flowFile, REL_SUCCESS)

 

1 ACCEPTED SOLUTION

Master Collaborator

This is the offending code:

new = year+month+day+hour
# Considering date is in mm/dd/yyyy format
#converting the appendd list to strings instead of ints
b=[str(x) for x in new]
#joining all the data without adding
b = '/'.join(b)
#convert to unix
dt_object2 = datetime.strptime(b, "%Y/%m/%d/%H") 

It looks like at some point the values of year, month, day, hour are set to the strings "Year", "month", "DOY", "Hour". Then when new = year+month+day+hour is executed, the strings get concatenated into "YearmonthDOYHour". The list comprehension then iterates over that string one character at a time, and the join puts a '/' between each character, which is why you see 'Y/e/a/r/...' in the Python error message. I'll leave it to you to debug this, as I've lost track of all the code changes at this point.
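To illustrate the point above, here is a minimal sketch that reproduces the string from the error message. The variable values are hypothetical (I am assuming the names ended up holding the key strings rather than the one-element lists built in __init__); the second half shows what the same lines produce when the lists are used, for example via self.year, self.month, self.day and self.hour.

```python
# Hypothetical values: each name holds a key string instead of a list of numbers.
year, month, day, hour = "Year", "month", "DOY", "Hour"
new = year + month + day + hour          # string concatenation
# Iterating over a string yields single characters, so join puts '/' between them.
bad_joined = "/".join([str(x) for x in new])

# With one-element lists, + concatenates the lists instead.
year, month, day, hour = [2018], [11], [12], [20]
new = year + month + day + hour          # [2018, 11, 12, 20]
good_joined = "/".join([str(x) for x in new])

print(bad_joined)
print(good_joined)
```

Only the second form gives datetime.strptime something that can match "%Y/%m/%d/%H".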

 

Also note that the incoming data may be providing you with Day of Year (DOY) instead of day of month, which is what %d expects. You may need to use %j, which parses the day of the year as a zero-padded number (see the strptime format codes in the Python datetime documentation).
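A rough sketch of the %j suggestion, under a few assumptions that are mine and not from the original script: add_unix_time is a hypothetical helper name, DOY really is day of year, and the timestamp is interpreted as UTC. The fallback values (2000, 11, 11) are the ones used in the question. ExecuteScript's python engine is Jython (Python 2.7), where datetime.timestamp is not available, so the sketch uses calendar.timegm instead.

```python
import calendar
import json
from datetime import datetime


def add_unix_time(text):
    """Replace Year/DOY/Hour in a JSON message with a Unix_time field."""
    record = json.loads(text)
    # pop() reads the value and removes the key in one step.
    year = record.pop("Year", 2000)
    day_of_year = record.pop("DOY", 11)
    hour = record.pop("Hour", 11)
    # %j expects a zero-padded day of year, e.g. 012.
    stamp = "%04d/%03d/%02d" % (year, day_of_year, hour)
    parsed = datetime.strptime(stamp, "%Y/%j/%H")
    # Assumption: the parsed time is UTC; timegm does no local-time conversion.
    record["Unix_time"] = float(calendar.timegm(parsed.timetuple()))
    return json.dumps(record)


message = ('{"Year":2018,"DOY":12,"Hour":20,"HGI_Lat_of_the_S/C":7.0,'
           '"IMF_B_scalar_nT":1.03,"SW_Plasma_Speed_KMs":441.0}')
result = json.loads(add_unix_time(message))
print(result["Unix_time"])
```

Inside process() the result could then be written with outputStream.write(bytearray(add_unix_time(text).encode('utf-8'))). Note that the value will not match the 1542074400.0 shown in the question, which appears to come from treating 12 as a day of month 11 in a local time zone.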

 

If this is helpful, don't forget to give kudos or accept solution. 


5 REPLIES

Master Collaborator

Python is most likely complaining about this line:

#json_data = json.dumps(text)
json_data = json.loads(json_data)

Why is the initial assignment commented out? Without it, json_data appears on the right-hand side of its own first assignment, so it is read before it has a value and Python raises the UnboundLocalError.
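A small stand-alone sketch of that behaviour, outside NiFi. The function names process_broken and process_fixed are made up for illustration, and the fixed version assumes the intent was to parse the text read from the stream.

```python
import json


def process_broken(text):
    # json_data is assigned in this function, so Python treats it as local.
    # The right-hand side runs first and reads the local before it has a value.
    json_data = json.loads(json_data)
    return json_data


def process_fixed(text):
    # Parse the text that was actually read from the stream.
    json_data = json.loads(text)
    return json_data


sample = '{"Year": 2018, "DOY": 12, "Hour": 20}'

try:
    process_broken(sample)
    error_name = None
except UnboundLocalError as err:
    error_name = type(err).__name__

parsed = process_fixed(sample)
print(error_name)
print(parsed["Year"])
```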

Contributor

With that updated, I now just get this:

 

TypeError: unicode indices must be integers in <script> at line number 63; rolling back session: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException:

 

I then changed the code to something like this:

 

try:
    self.month.append(json_data[1])
except KeyError:
    self.month.append(11)

 

The errors continue.

Master Collaborator

Once you call IOUtils.toString, the text variable contains your message(s). It is then appropriate to call json.loads on that text variable, as that is the function that converts JSON text into a Python object (here, a dict). It should be something like this:

text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
json_data = json.loads(text)

After this you should be able to access the elements of the json with:

json_data['Year']

Let me know if that works.
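For what it's worth, here is a short sketch of why the earlier json.dumps version led to the "indices must be integers" error. It is plain Python with a shortened sample message. Under Jython the string type is reported as unicode, under Python 3 as str.

```python
import json

text = '{"Year":2018,"DOY":12,"Hour":20}'

# json.dumps on a string produces a quoted JSON string, so loading it back
# returns the same text again, not a dict.
round_tripped = json.loads(json.dumps(text))

try:
    round_tripped["Year"]        # indexing a string with a key
    index_error = None
except TypeError as err:
    index_error = type(err).__name__

# Loading the original text directly gives a dict that can be indexed by key.
json_data = json.loads(text)
print(index_error)
print(json_data["Year"])
```

This also explains why json_data[1] did not raise: on a string it simply returns the second character.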

Contributor

Thanks for spotting the mistake I made with the 'text' variable.

 

After making your suggested corrections:

text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
json_data = json.loads(text)
try:
    self.year.append(json_data['Year'])
except KeyError:
    self.year.append(2000)

 

I'm still getting the following error:

ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] failed to process due to javax.script.ScriptException: ValueError: time data 'Y/e/a/r/m/o/n/t/h/D/O/Y/H/o/u/r' does not match format '%Y/%m/%d/%H' in <script> at line number 63; rolling back session: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: ValueError: time data 'Y/e/a/r/m/o/n/t/h/D/O/Y/H/o/u/r' does not match format '%Y/%m/%d/%H' in <script> at line number 63

 
