Created 04-14-2020 03:34 PM
My goal is to pass incoming JSON data to my Python script and replace the date fields with a UNIX timestamp. However, NiFi reports that "json_data" is referenced before assignment at line 63, which is the session write near the bottom of the script.
The processors in use are ConvertRecord -> ExecuteScript.
The message being passed looks like this:
json_data =
{"Year":2018,"DOY":12,"Hour":20,"HGI_Lat_of_the_S/C":7.0,"IMF_B_scalar_nT":1.03,"SW_Plasma_Speed_KMs":441.0}
and should come out like this:
{"HGI_Lat_of_the_S/C": 7.0, "IMF_B_scalar_nT": 1.03, "SW_Plasma_Speed_KMs": 441.0, "Unix_time": 1542074400.0}
Error message:
ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] Failed to process session due to javax.script.ScriptException: UnboundLocalError: local variable 'json_data' referenced before assignment in <script> at line number 63: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: UnboundLocalError: local variable 'json_data' referenced before assignment in <script> at line number 63
Python code:
import json
from datetime import datetime
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
class PyStreamCallback(StreamCallback):
    def __init__(self):
        self.year=[]
        self.day=[]
        self.hour=[]
        self.month=[]
        self.key1_year = 'Year'
        self.key4_month = 'month'
        self.key2_day = 'DOY'
        self.key3_hour = 'Hour'
    # Write bytes that are utf-8 encoded chine word.
    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        #obj = json.loads(text)
        #json_data = json.dumps(text)
        json_data = json.loads(json_data)
        try:
            self.year.append(json_data[self.key1_year])
        except KeyError:
            self.year.append(2000)
        try:
            self.month.append(json_data[self.key4_month])
        except KeyError:
            self.month.append(11)
        try:
            self.day.append(json_data[self.key2_day])
        except KeyError:
            self.day.append(11)
        try:
            self.hour.append(json_data[self.key3_hour])
        except KeyError:
            self.hour.append(11)
        new = year+month+day+hour
        # Considering date is in mm/dd/yyyy format
        #converting the appendd list to strings instead of ints
        b=[str(x) for x in new]
        #joining all the data without adding
        b = '/'.join(b)
        #convert to unix
        dt_object2 = datetime.strptime(b, "%Y/%m/%d/%H")
        timestamp = datetime.timestamp(dt_object2)
        json_data.update({'Unix_time':timestamp})
        #deleting unwanted data from the dict
        for func in [self.key1_year,self.key4_month,self.key2_day,self.key3_hour]:
            try:
                del json_data[func]
            except KeyError as e:
                pass
        outputStream.write(bytearray(json.dumps(json_data).encode('utf-8')))
flowFile = session.get()
if (flowFile != None):
    flowFile = session.write(flowFile, PyStreamCallback())
    session.transfer(flowFile, REL_SUCCESS)
Created 04-14-2020 04:32 PM
Python is most likely complaining about these lines:
#json_data = json.dumps(text)
json_data = json.loads(json_data)
Why is the initial assignment commented out? Without it, json_data appears on the right-hand side of its own first assignment, so Python has no value to read and raises the error.
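To illustrate the point, here is a minimal standalone sketch (plain Python, outside NiFi) of why that line fails. The function names broken and fixed and the shortened sample message are made up for this example; only the failing statement comes from the script in the question.

```python
import json


def broken(text):
    # json_data is assigned inside this function, so Python treats it as a
    # local name. Reading it on the right-hand side before it has a value
    # raises UnboundLocalError.
    json_data = json.loads(json_data)
    return json_data


def fixed(text):
    # Parse the text that was actually read from the flow file instead.
    json_data = json.loads(text)
    return json_data


# Shortened, hypothetical sample message.
sample = '{"Year": 2018, "DOY": 12, "Hour": 20}'

try:
    broken(sample)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

parsed = fixed(sample)
print(error_name)
print(parsed["Year"])
```

The same rule applies inside the process method of the callback: the name on the right-hand side has to exist before the assignment runs.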
Created on 04-15-2020 06:43 AM - edited 04-15-2020 07:08 AM
With that line restored, I now get this:
TypeError: unicode indices must be integers in <script> at line number 63; rolling back session: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException:
I then changed the code to something like this:
try:
    self.month.append(json_data[1])
except KeyError:
    self.month.append(11)
but the errors continue.
Created 04-15-2020 07:37 AM
Once you call IOUtils.toString, the text variable contains your message(s). It is then appropriate to call json.loads on that text variable, as that is the function that converts JSON text into a Python object (a dict, in this case). It should be something like this:
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
json_data = json.loads(text)
After this you should be able to access the elements of the JSON with:
json_data['Year']
Let me know if that works.
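For reference, a small standalone sketch of the difference between the two json calls, using the sample message from the question as a plain string (in NiFi it would come from IOUtils.toString). The variable names as_string and dumps_error are only for illustration.

```python
import json

# Sample message copied from the question.
text = ('{"Year":2018,"DOY":12,"Hour":20,"HGI_Lat_of_the_S/C":7.0,'
        '"IMF_B_scalar_nT":1.03,"SW_Plasma_Speed_KMs":441.0}')

# json.loads turns JSON text into a dict, so keys can be used as indices.
json_data = json.loads(text)
year = json_data['Year']
doy = json_data['DOY']

# json.dumps goes the other way and returns a string. Indexing a string
# with a key raises TypeError, which matches the "indices must be
# integers" error reported earlier in the thread.
as_string = json.dumps(text)
try:
    as_string['Year']
    dumps_error = None
except TypeError as exc:
    dumps_error = type(exc).__name__

print(year, doy, dumps_error)
```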
Created 04-15-2020 07:54 AM
Thanks for spotting the mistake I made with the 'text' variable.
After making your suggested corrections:
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
json_data = json.loads(text)
try:
    self.year.append(json_data['Year'])
except KeyError:
    self.year.append(2000)
I'm still getting the following error:
ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] ExecuteScript[id=01711026-9fd7-1476-8c67-e730c921e164] failed to process due to javax.script.ScriptException: ValueError: time data 'Y/e/a/r/m/o/n/t/h/D/O/Y/H/o/u/r' does not match format '%Y/%m/%d/%H' in <script> at line number 63; rolling back session: org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: ValueError: time data 'Y/e/a/r/m/o/n/t/h/D/O/Y/H/o/u/r' does not match format '%Y/%m/%d/%H' in <script> at line number 63
Created 04-15-2020 08:10 AM
This is the offending code:
new = year+month+day+hour
# Considering date is in mm/dd/yyyy format
#converting the appendd list to strings instead of ints
b=[str(x) for x in new]
#joining all the data without adding
b = '/'.join(b)
#convert to unix
dt_object2 = datetime.strptime(b, "%Y/%m/%d/%H")
It looks like at some point the values of year, month, day and hour were set to the strings "Year", "month", "DOY" and "Hour". When new = year+month+day+hour runs, those strings are concatenated into "YearmonthDOYHour". The list comprehension then iterates over that string one character at a time, and the join puts a '/' between each character, which is what you see in the Python error message. I'll leave it to you to debug this, as I've lost track of all the code changes at this point.
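A quick standalone way to see this, assuming the four variables hold either the key names or one-element lists of values (the numbers below are only illustrative, with 11 taken from the fallback month in the question's script):

```python
# Case 1: the variables hold the key names as strings.
year, month, day, hour = 'Year', 'month', 'DOY', 'Hour'
new = year + month + day + hour              # one long string
joined_from_strings = '/'.join([str(x) for x in new])

# Case 2: the variables hold one-element lists of values.
year, month, day, hour = [2018], [11], [12], [20]
new = year + month + day + hour              # list concatenation
joined_from_lists = '/'.join([str(x) for x in new])

print(joined_from_strings)
print(joined_from_lists)
```

The first result reproduces the string from the error message; the second is the shape that strptime can parse with "%Y/%m/%d/%H".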
Also note that the incoming data may be providing you with day of year (DOY) rather than day of month, which is what %d expects. You may need to use %j instead, which parses the day of the year as a zero-padded number (see the Python documentation on strftime/strptime format codes).
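Here is a rough sketch of the whole conversion using %j, written as plain Python so it can be tried outside NiFi. The helper name add_unix_time is made up, and treating the time as UTC is an assumption on my part. It uses calendar.timegm rather than datetime.timestamp because, as far as I know, ExecuteScript runs Python through Jython at the 2.7 level, where datetime.timestamp is not available.

```python
import calendar
import json
from datetime import datetime


def add_unix_time(text):
    """Hypothetical helper: swap Year/DOY/Hour for a Unix_time field."""
    record = json.loads(text)

    # pop() reads and removes each key; a missing key raises KeyError.
    year = record.pop('Year')
    day_of_year = record.pop('DOY')
    hour = record.pop('Hour')

    # Zero-pad the fields so they line up with %Y, %j and %H.
    stamp = '%04d/%03d/%02d' % (year, day_of_year, hour)
    parsed = datetime.strptime(stamp, '%Y/%j/%H')

    # Assumption: the timestamp is in UTC. timegm converts a UTC time
    # tuple to seconds since the epoch.
    record['Unix_time'] = float(calendar.timegm(parsed.timetuple()))
    return json.dumps(record)


sample = ('{"Year":2018,"DOY":12,"Hour":20,"HGI_Lat_of_the_S/C":7.0,'
          '"IMF_B_scalar_nT":1.03,"SW_Plasma_Speed_KMs":441.0}')
result = json.loads(add_unix_time(sample))
print(result['Unix_time'])
```

Inside the callback, the returned string could be written with outputStream.write in the same way as in the original script. Note that this will not reproduce the 1542074400.0 shown in the question, which appears to correspond to 12 November 2018 at 20:00 in a UTC-6 zone rather than to day-of-year 12.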
If this is helpful, don't forget to give kudos or accept solution.