<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: trying to get the most basic python UDFs working in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396952#M249663</link>
    <description>&lt;P&gt;Hmmm... so I tried rolling back to Pig 0.13 and, somewhat troublingly, that totally worked.&amp;nbsp; On multiple different machines.&amp;nbsp; Perhaps something didn't get tested very well before release?&lt;/P&gt;</description>
    <pubDate>Mon, 04 Nov 2024 14:06:46 GMT</pubDate>
    <dc:creator>mew</dc:creator>
    <dc:date>2024-11-04T14:06:46Z</dc:date>
    <item>
      <title>trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396516#M249454</link>
      <description>&lt;P&gt;pig-0.17.0bin/pig&amp;nbsp; -x local&lt;/P&gt;&lt;P&gt;A very basic UDF file:&lt;/P&gt;&lt;P&gt;#!/usr/bin/python3&lt;/P&gt;&lt;P&gt;from pig_util import outputSchema&lt;/P&gt;&lt;P&gt;@outputSchema("as:int")&lt;BR /&gt;def square(num):&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if num is None:&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return None&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return num * num&lt;/P&gt;&lt;P&gt;@outputSchema("word:chararray")&lt;BR /&gt;def concat(word):&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return word + word&lt;/P&gt;&lt;P&gt;An exceedingly simple Pig script:&lt;/P&gt;&lt;P&gt;REGISTER '/home/scs/woodcock/SD411/lab_udf/test.py' USING org.apache.pig.scripting.streaming.Python.PythonScriptEngine AS myFuncs;&lt;/P&gt;&lt;P&gt;A = LOAD '/home/scs/woodcock/SD411/DATA/accident.csv' USING PigStorage(',') AS (state:int,name:chararray);&lt;/P&gt;&lt;P&gt;B = FOREACH A GENERATE myFuncs.square(state) AS state, name;&lt;/P&gt;&lt;P&gt;If I do a "DUMP A", I get exactly what I would expect.&lt;/P&gt;&lt;P&gt;But on a "DUMP B", I get a failed job:&lt;/P&gt;&lt;P&gt;java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE :&lt;BR /&gt;at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)&lt;BR /&gt;at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)&lt;BR /&gt;Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE :&lt;BR /&gt;at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:506)&lt;/P&gt;&lt;P&gt;grunt&amp;gt; Exception in thread "Thread-82" java.lang.NullPointerException: Cannot invoke "java.util.concurrent.BlockingQueue.put(Object)" because the return value of "org.apache.pig.impl.builtin.StreamingUDF.access$500(org.apache.pig.impl.builtin.StreamingUDF)" is null&lt;BR /&gt;at org.apache.pig.impl.builtin.StreamingUDF$ProcessOutputThread.run(StreamingUDF.java:471)&lt;BR /&gt;2024-10-29 13:02:15,296 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner - map &amp;gt; map&lt;/P&gt;&lt;P&gt;Any idea what is going wrong?&lt;/P&gt;</description>
      <pubDate>Tue, 29 Oct 2024 17:10:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396516#M249454</guid>
      <dc:creator>mew</dc:creator>
      <dc:date>2024-10-29T17:10:19Z</dc:date>
    </item>
    <item>
      <title>Re: trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396521#M249455</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/119913"&gt;@mew&lt;/a&gt;&amp;nbsp;Welcome to the Cloudera Community!&lt;BR /&gt;&lt;BR /&gt;To help you get the best possible solution, I have tagged our MapReduce experts&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/23549"&gt;@Stella Tang&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/84931"&gt;@vchalla&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/80493"&gt;@jeniferA&lt;/a&gt;&amp;nbsp;&amp;nbsp;who may be able to assist you further.&lt;BR /&gt;&lt;BR /&gt;Please keep us updated on your post, and we hope you find a satisfactory solution to your query.&lt;/P&gt;</description>
      <pubDate>Tue, 29 Oct 2024 18:40:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396521#M249455</guid>
      <dc:creator>DianaTorres</dc:creator>
      <dc:date>2024-10-29T18:40:31Z</dc:date>
    </item>
    <item>
      <title>Re: trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396665#M249483</link>
      <description>&lt;P&gt;Honestly, at this point I would probably accept any Python UDF, along with the script that calls it, that works in Pig 0.17, as long as it is less trivial than a "HelloWorld" (that is, something that actually computes a value rather than just returning a fixed string).&amp;nbsp; I feel like I'm just cutting and pasting the standard documented examples, and they aren't close to working, which isn't giving me a great feeling.&lt;/P&gt;</description>
      <pubDate>Thu, 31 Oct 2024 15:46:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396665#M249483</guid>
      <dc:creator>mew</dc:creator>
      <dc:date>2024-10-31T15:46:45Z</dc:date>
    </item>
    <item>
      <title>Re: trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396715#M249493</link>
      <description>&lt;P&gt;I also note that I can get Java UDFs to work, so it's not a general UDF problem; it's something specific to Python.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Nov 2024 17:05:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396715#M249493</guid>
      <dc:creator>mew</dc:creator>
      <dc:date>2024-11-01T17:05:18Z</dc:date>
    </item>
    <item>
      <title>Re: trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396952#M249663</link>
      <description>&lt;P&gt;Hmmm... so I tried rolling back to Pig 0.13 and, somewhat troublingly, that totally worked.&amp;nbsp; On multiple different machines.&amp;nbsp; Perhaps something didn't get tested very well before release?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2024 14:06:46 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/396952#M249663</guid>
      <dc:creator>mew</dc:creator>
      <dc:date>2024-11-04T14:06:46Z</dc:date>
    </item>
    <item>
      <title>Re: trying to get the most basic python UDFs working</title>
      <link>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/397021#M249676</link>
      <description>&lt;P&gt;Here is the actual fix (it's quite loony):&amp;nbsp; don't wrap the file name in the REGISTER statement in single quotes.&amp;nbsp; That's it.&lt;/P&gt;&lt;P&gt;Catastrophic problems here:&lt;/P&gt;&lt;P&gt;1) This is obviously not backwards compatible.&lt;/P&gt;&lt;P&gt;2) If the quoting is a problem, why not just report that a) the format is wrong, b) a path that started with a single quote did not yield a valid Python file, or c) anything understandable, instead of getting into the middle of the M/R computation and throwing wacky (mkey?&amp;nbsp; NullPointerException) errors?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Nov 2024 20:09:42 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/trying-to-get-the-most-basic-python-UDFs-working/m-p/397021#M249676</guid>
      <dc:creator>mew</dc:creator>
      <dc:date>2024-11-05T20:09:42Z</dc:date>
    </item>
  </channel>
</rss>

