Hi all, I'm putting together a log parser in Pig and I'm trying to use "Pyasn", a Python extension that allows offline querying of an ASN database, to extract Autonomous System Number information from IP addresses.
3) But when I try to register my UDF, I get the following exception:
grunt> register 'hdfs:///user/xxxxxx/LIB/PYASN/python_pyasn.py' using jython as pythonPyasn;
2017-11-23 17:18:10,468 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2017-11-23 17:18:10,939 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_8271942503558994412
2017-11-23 17:18:12,468 [main] WARN org.apache.pig.scripting.jython.JythonScriptEngine - pig.cmd.args.remainders is empty. This is not expected unless on testing.
2017-11-23 17:18:13,236 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1121: Python Error. Traceback (most recent call last):
File "/tmp/pig6864734086775637011tmp/python_pyasn.py", line 8, in <module>
File "/usr/lib64/python2.6/site-packages/pyasn-1.6.0b1-py2.6-linux-x86_64.egg/pyasn/__init__.py", line 20
SyntaxError: future feature print_function is not defined
4) The puzzling thing is that I'm currently doing exactly the same thing with another Python extension for geolocation (PyGeoIP), and it works smoothly. The concept is the same: I wrote a UDF, imported it into Pig by wrapping it up in Jython, and I can call it successfully!
5) If, just to check that things are formally OK, I open a PySpark shell and use the extension, it works without any problems. But I don't want to (and can't) use Spark in this case, for a number of reasons.
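In case it helps narrow things down, here is a small sketch of a check that could be run both in the PySpark shell and from a file registered with "using jython", to compare the two interpreters. It rests on an assumption the log does not confirm: that the SyntaxError comes from the Jython runtime predating Python 2.6, the release that added the print_function future feature. The function name interpreter_report is made up for illustration, and the script avoids print entirely so that it parses on old and new interpreters alike.

```python
import sys
import __future__


def interpreter_report():
    """Describe the interpreter that is actually executing this file.

    Hypothetical helper for illustration; not part of Pig or pyasn.
    """
    return {
        # Major, minor, micro of the running interpreter (not the site-packages path).
        "version": tuple(sys.version_info[:3]),
        # Jython reports a platform string beginning with "java".
        "is_jython": sys.platform.startswith("java"),
        # print_function was added to __future__ in Python 2.6.
        "has_print_function": hasattr(__future__, "print_function"),
    }


report = interpreter_report()
# sys.stdout.write works on both Python 2.x and 3.x, unlike print.
sys.stdout.write("%r\n" % (report,))
```

If has_print_function comes back false under Pig but true in the PySpark shell, that would at least suggest the two environments are not running the same Python level, even though the traceback points at the python2.6 site-packages directory.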