question Re: Pandas_udf with a tuple? (pyspark) in Support Questions

https://community.cloudera.com/t5/Support-Questions/Pandas-udf-with-a-tuple-pyspark/m-p/190143#M152232
<P>It looks like you are using a scalar pandas_udf, which does not currently support returning structs. I believe the return type you want is an array of strings, which is supported. Try this:</P>
<PRE>import pandas as pd
from pyspark.sql.functions import pandas_udf

@pandas_udf("array<string>")
def stringClassifier(x, y, z):
    # Return a pandas Series of lists of strings with the same length as the input, for example:
    s = pd.Series([[u"a", u"b"]] * len(x))
    return s</PRE>
<P>If you are using Python 2, make sure your strings are unicode; otherwise they might be interpreted as bytes. Hope that helps!</P>
Fri, 13 Jul 2018 01:13:47 GMT o912451
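To make the "same length as input" requirement concrete, here is a minimal sketch of the batch function on its own, written with plain pandas so it can be checked without a Spark session. The name `classify_batch` and the row logic (stringifying the three inputs) are hypothetical stand-ins for whatever your classifier actually does; the sketch also assumes Python 3, where `str` is already unicode.

```python
import pandas as pd


def classify_batch(x, y, z):
    """Build one list of strings per input row (hypothetical placeholder logic)."""
    # A scalar pandas UDF receives each column as a pandas Series and must
    # return a Series with exactly as many rows as the input batch.
    labels = [[str(a), str(b), str(c)] for a, b, c in zip(x, y, z)]
    # dtype=object keeps every element as a Python list, also for empty batches.
    return pd.Series(labels, index=x.index, dtype=object)


# To use it in Spark, the function could be wrapped like this (not run here):
#   from pyspark.sql.functions import pandas_udf
#   string_classifier = pandas_udf(classify_batch, "array<string>")
#   df.withColumn("labels", string_classifier("x", "y", "z"))
```

Testing the function against small pandas Series first is usually the quickest way to catch a length mismatch before Spark reports it at execution time.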