Python HIVE UDF (Import os) module missing

New Contributor

Hi,

I completed my DE 575 exam today (8/16) at 6 pm CST. One question required me to create a Python UDF for Hive.

The program didn't work because reading from sys.stdin failed.

I used:

#! usr/bin/python ---> As mentioned in the exam

import os

 

for line in sys.stdin:

    <Code>

 

When I ran the Hive query, the streaming did not happen.

I got an 'error:2000' from the MapReduce job launched by the Hive query.

I tried the same program standalone with the 'cat' command in the exam environment, and it didn't work there either.

I lost a lot of time debugging and was unable to solve the problem.

I would like the Cloudera team to let me know whether the module in "import os" is actually present in the environment.

Without that module, a Hive UDF in Python doesn't work.

 

Ref:

1) I performed the operations as described in this blog:

http://spryinc.com/blog/guide-user-defined-functions-apache-hive

 

2) I tried something similar to the code in this question and got a similar error:

http://stackoverflow.com/questions/32032154/apache-hive-getting-error-while-using-python-udf

 

3) If I recall correctly, I got an I/O error like the one below:

http://grepcode.com/file/repo1.maven.org/maven2/co.cask.cdap/hive-exec/0.13.0/org/apache/hadoop/hive...

 

SCRIPT_IO_ERROR(20001, "An error occurred while reading or writing to your custom script. "  + "It may have crashed with an error."),

Regards,

Suman

1 ACCEPTED SOLUTION

Cloudera Employee

Yes, both os and sys are available. Because they are part of the standard library, they are installed with Python, and they have been verified to exist in the environment.


3 REPLIES

New Contributor

Update to the subject line:

The error is with import sys, not import os. In my code I had in fact imported sys.

 

Cloudera Employee

Yes, both os and sys are available. Because they are part of the standard library, they are installed with Python, and they have been verified to exist in the environment.
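
For anyone hitting the same symptom, here is a minimal sketch of a Hive streaming script that reads sys.stdin. It is not the exam solution: the transform function and its upper-casing logic are hypothetical placeholders, and it assumes Hive's default tab-separated row format. The point it illustrates is that sys must be imported before sys.stdin is used; importing only os leaves the name sys undefined, and the script fails on the first line of the loop.

```python
#!/usr/bin/env python
import sys  # required for sys.stdin / sys.stdout; "import os" alone is not enough


def transform(line):
    """Hypothetical per-row logic: upper-case the first tab-separated column."""
    fields = line.rstrip("\n").split("\t")
    fields[0] = fields[0].upper()
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams one row per line; write one transformed row per line back.
    for line in stdin:
        stdout.write(transform(line) + "\n")


if __name__ == "__main__":
    main()
```

A script like this can be checked outside Hive by piping a small tab-separated file into it with cat, and then called from a query with ADD FILE followed by SELECT TRANSFORM(...) USING 'python <script name>' AS (...). The file, table, and column names are whatever your own setup uses.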

avatar
Contributor

I think you could have tried to use Java if that was an option.

 

I would actually prefer that to attempting it in Python.

 

I have generally had better luck doing SerDes and UDFs in Java.