<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Pandas_udf with a tuple? (pyspark) in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Pandas-udf-with-a-tuple-pyspark/m-p/190144#M152233</link>
    <description>&lt;P&gt;Hey Bryan, thanks so much for taking the time! I think I'm almost there! The hint about the unicode issue helped me get past the first slew of errors. However, I now seem to be running into a length error:&lt;/P&gt;&lt;PRE&gt;@pandas_udf("array&amp;lt;string&amp;gt;")
def stringClassifier(lookupstring, first, last):

    lookupstring = lookupstring.to_string().encode("utf-8")
    first = first.to_string().encode("utf-8")
    last = last.to_string().encode("utf-8")
    
    # This part takes the three strings above and passes them to another library to do a string match
    result = process.extract(lookupstring, lookup_list, limit=4000)
    match_list = [item for item in result if item[0].startswith(first) and item[0].endswith(last)]
    result2 = process.extractOne(lookupstring, match_list)

    if result2 is not None and result2[0][1] &amp;gt; 75:
        answer = pd.Series(list(result2[0]))
        return answer
    else:
        fail = ["N/A","0"]
        return pd.Series(fail)
&lt;/PRE&gt;&lt;P&gt;&lt;STRONG&gt;RuntimeError: Result vector from pandas_udf was not the required length: expected 1, got 2&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;I'm initially passing three strings as variables to the function, which then get passed to another library. The result is a tuple, which I convert to a list and then to a pandas Series object. How can I make a two-item array object have a length of 1? I'm obviously missing some basics here.&lt;/P&gt;&lt;P&gt;@Bryan C&lt;/P&gt;</description>
    <pubDate>Fri, 13 Jul 2018 05:22:44 GMT</pubDate>
    <dc:creator>alexander_witte</dc:creator>
    <dc:date>2018-07-13T05:22:44Z</dc:date>
  </channel>
</rss>

