Support Questions

Find answers, ask questions, and share your expertise
Announcements
Welcome to the upgraded Community! Read this blog to see What’s New!

Using kudu with Python

avatar
Explorer

Hi,

 

I am unable to use kudu with Python.

 

[cloudera@quickstart ~]$ sudo pip install kudu-python
Collecting kudu-python
Using cached kudu-python-1.2.0.tar.gz
Complete output from command python setup.py egg_info:
Cannot find installed kudu client.

----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-GE6kA6/kudu-python/

 

[cloudera@quickstart ~]$ kudu -version
kudu 1.2.0-cdh5.10.2
revision f8f159f3c55f72ef026776f6176855e4f6e70271
build type RELEASE
built by jenkins at 07 Jun 2017 01:01:28 PST on kudu-centos66-19fa.vpc.cloudera.com
build id 2017-06-07_00-28-13

 

[cloudera@quickstart ~]$ python -V
Python 2.7.13 :: Anaconda 4.4.0 (64-bit)

 

 

This is what I am following:

https://www.cloudera.com/documentation/kudu/latest/topics/kudu_development.html

 

Any help?

 

 

16 REPLIES 16

avatar
Rising Star
You need to first install the Kudu client system packages. Check out the documentation here: https://www.cloudera.com/documentation/kudu/latest/topics/kudu_installation.html#install_cmdline

Specifically, you need to install the 'kudu-client0' (RPM distros) or 'libkuduclient0' (DEB distros) package.

avatar
Explorer

Hi,

 

How do I find "kudu.master"?  

 

 

client = kudu.connect(host='kudu.master', port=7051)

 

I get an error when I use it as a localhost:

 

>>> client = kudu.connect(host='localhost', port=7051)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/cloudera/anaconda2/lib/python2.7/site-packages/kudu/__init__.py", line 92, in connect
rpc_timeout_ms=rpc_timeout_ms)
File "kudu/client.pyx", line 271, in kudu.client.Client.__cinit__ (kudu/client.cpp:5415)
File "kudu/errors.pyx", line 62, in kudu.errors.check_status (kudu/errors.cpp:1073)
kudu.errors.KuduBadStatus: Timed out: Could not connect to the cluster: ConnectToClusterRpc(addrs: 127.0.0.1:7051, num_attempts: 241) passed its deadline: Network error: Client connection negotiation failed: client connection to 127.0.0.1:7051: connect: Connection refused (error 111)
 

avatar
Rising Star
Bear in mind that this Python code merely connects and operates on an existing Kudu cluster; it doesn't start one for you. If you haven't yet created and started a Kudu service, you'll need to do that first.

If you have, the value for 'host' in kudu.connect() needs to be the valid hostname where you're running the Kudu master.

avatar
Explorer

Hi Adar,

 

My CDH cluster verison is 5.11.0.

I follow the steps to install kudu-client0,kudu-client-devel first.

 

rpm -qa|grep kudu
kudu-client-devel-1.3.0+cdh5.11.0+0-1.cdh5.11.0.p0.13.el6.x86_64
kudu-client0-1.3.0+cdh5.11.0+0-1.cdh5.11.0.p0.13.el6.x86_64

 

But when i install the kudu-python using the source packge (kudu-python-1.2.0). someother errors show up,

Could you help take a look ? 

Thanks

 

 

/opt/cloudera/parcels/Anaconda/bin/python setup.py install
Building from system prefix /usr
running install
running bdist_egg
running egg_info
writing requirements to kudu_python.egg-info/requires.txt
writing kudu_python.egg-info/PKG-INFO
writing top-level names to kudu_python.egg-info/top_level.txt
writing dependency_links to kudu_python.egg-info/dependency_links.txt
reading manifest file 'kudu_python.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '../LICENSE.txt'
warning: no previously-included files matching '*.so' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '*~' found anywhere in distribution
warning: no previously-included files matching '#*' found anywhere in distribution
warning: no previously-included files matching '.git*' found anywhere in distribution
warning: no previously-included files matching '.DS_Store' found anywhere in distribution
writing manifest file 'kudu_python.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
copying kudu/version.py -> build/lib.linux-x86_64-2.7/kudu
running build_ext
building 'kudu.client' extension
gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/usr/include -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c kudu/client.cpp -o build/temp.linux-x86_64-2.7/kudu/client.o
cc1plus: warning: command line option "-Wstrict-prototypes" is valid for Ada/C/ObjC but not for C++
g++ -pthread -shared -L/tmp/tmpQcQ0Zo/Anaconda-4.0.0/lib -Wl,-rpath=/tmp/tmpQcQ0Zo/Anaconda-4.0.0/lib,--no-as-needed build/temp.linux-x86_64-2.7/kudu/client.o -L/usr/lib -L/tmp/tmpQcQ0Zo/Anaconda-4.0.0/lib -Wl,-R/usr/lib -lkudu_client -lpython2.7 -o build/lib.linux-x86_64-2.7/kudu/client.so
/usr/bin/ld: cannot find -lpython2.7
collect2: ld returned 1 exit status
error: command 'g++' failed with exit status 1

avatar
Rising Star

Looks like you don't have the Python development packages installed. On my Ubuntu system that's the libpythonx.y-dev package, where 'x' is 2 and 'y' is 7. On Red Hat or Cent OS systems I think you want to install the python-devel package. On my test Cent OS 6.6 system python-devel provides /usr/lib64/libpython2.6.so.

avatar
Explorer

Hello Adar,

 

That link leads to 404.

We have Kudu parcel. What should we do now in order to install the python-kudu via pip?

 

Thanks

avatar
Expert Contributor

Hi elkarel,

This is a very old thread and has some outdated information and expired links.

 

Please include any problems you are facing including any error messages you are seeing, your version, platform, etc.

 

Regarding your question about the documentation, there is no longer any special procedure to install Kudu. With the latest versions of CDH, Kudu is included in the parcel: https://www.cloudera.com/documentation/enterprise/5-14-x/topics/kudu_install_cm.html

 

Thanks,

Mike

avatar
Expert Contributor

I looked at this a little more and I think kudu-python install on RHEL 6 / Centos 6 is broken because of the lack of Python 2.6 support in newer versions of pip. There are a couple of related JIRAs on this issue: https://issues.apache.org/jira/browse/KUDU-1705 and https://issues.apache.org/jira/browse/KUDU-2442

 

Another issue is that the parcel does not link the include files and libraries into a place that pip can find them. You may be able to work around that with putting symlinks from /opt/cloudera/parcels/CDH/includes/kudu into /usr/include/kudu and /opt/cloudera/parcels/CDH/lib64/*kudu* into /usr/lib64/ to see if that helps solve the problem on a newer OS version.

avatar
Expert Contributor

I looked into this a little more today, out of interest, and added some additional info about how to make it work on EL6 here: https://issues.apache.org/jira/browse/KUDU-2442

 

Regardless, if you are using a parcel install, you must first symlink the include files and the libraries into /usr/ or /usr/local, something like this:

 

sudo ln -s /opt/cloudera/parcels/CDH/include/kudu /usr/local/include/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so /usr/local/lib64/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so.0 /usr/local/lib64/
sudo ln -s /opt/cloudera/parcels/CDH/lib64/libkudu_client.so.0.1.0 /usr/local/lib64/

avatar
Explorer

Thank you very much Mike for the follow-up!

 

Our client wanted us to stop using python, so I cannot verify your solution.

 

The system is CentOS 7.4.1708. 

 

I will return to this Jira issue in future if there is a need to continue with python-kudu.

 

Best Regards

avatar
Expert Contributor

With Centos 7 I think the only thing required to make "pip install kudu-python" work out-of-the-box is to add the symlinks I mentioned. I'm going to work on getting those added by default as part of the parcel install in a future release.

avatar
Explorer
Nope, that "pip install kudu-python" certainly didn't work for me, claiming
"Cannot find installed kudu client.".
Oviously, kudu was accessible from CLI, so I guess it searches for some
directories that don't get created in parcel install.

Thanks

avatar
Expert Contributor

Yes -- I saw the same behavior, and the workaround is adding the symlinks I detailed in a previous comment in this thread.

avatar
Explorer

Is there a chance a connect to multiple masters to allow Python client choose the needed leader itself?

 

Current connection string

 

# Connect to Kudu master server
client = kudu.connect(host='kudu.master', port=7051)

 

may prevent some automation tasks, because I have to choose the right leader first.

avatar
Cloudera Employee
You can connect to multiple masters by providing a list of master addresses.

avatar
Contributor

I'm terribly sorry for the necro-bump. The issue I'm experiencing seems to be the same one the OP was facing.

We're running CDH 6.3, and I encountered a problem when trying to install the Kudu Python client. I created the symlinks as recommended by mpercy, but I'm still unable to install the kudu-python client.

Our cluster does not have direct access to the internet, so when possible we use an offline install. The official documentation says that the Kudu C++ client libraries and headers are needed for the Kudu Python client. On Oracle Linux 7 trying to install devtoolset-3-toolchain ends in a failure, as a number of dependencies are missing:

 

Spoiler
Error: Package: devtoolset-3-gdb-7.8.2-38.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libpython2.6.so.1.0()(64bit)
Error: Package: devtoolset-3-gcc-gfortran-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libmpfr.so.1()(64bit)
Error: Package: devtoolset-3-gcc-c++-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libgmp.so.3()(64bit)
Error: Package: devtoolset-3-gcc-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libgmp.so.3()(64bit)
Error: Package: devtoolset-3-gcc-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libmpfr.so.1()(64bit)
Error: Package: devtoolset-3-gcc-gfortran-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libgmp.so.3()(64bit)
Error: Package: devtoolset-3-gcc-c++-4.9.2-6.el6.x86_64 (rhscl-devtoolset-3-epel-6-x86_64)
Requires: libmpfr.so.1()(64bit)

Disregarding that, running pip install --no-index --find-links file:///data0/home/jkovacs/kudu-python-1.10.0.tar.gz  results in the following errors:

Spoiler
   ERROR: Command errored out with exit status 1:
     command: /usr/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-rK2Qbk/kudu-python/setup.py'"'"'; __file__='"'"'/tmp/pip-install-rK2Qbk/kudu-python/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-rK2Qbk/kudu-python/pip-egg-info
         cwd: /tmp/pip-install-rK2Qbk/kudu-python/
    Complete output (43 lines):
    Building from system prefix /usr/local
    /usr/lib64/python2.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-rK2Qbk/kudu-python/kudu/client.pxd
      tree = Parsing.p_module(s, pxd, full_module_name)
    /usr/lib64/python2.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-rK2Qbk/kudu-python/kudu/schema.pxd
      tree = Parsing.p_module(s, pxd, full_module_name)
    /usr/lib64/python2.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-rK2Qbk/kudu-python/kudu/errors.pxd
      tree = Parsing.p_module(s, pxd, full_module_name)
    Compiling kudu/client.pyx because it depends on kudu/config.pxi.
    Compiling kudu/errors.pyx because it depends on /usr/lib64/python2.7/site-packages/Cython/Includes/libcpp/string.pxd.
    Compiling kudu/schema.pyx because it depends on kudu/config.pxi.
    [1/3] Cythonizing kudu/client.pyx
    [2/3] Cythonizing kudu/schema.pyx
    [3/3] Cythonizing kudu/errors.pyx
    WARNING: The wheel package is not available.
    DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at <a href="<a href="https://pip.pypa.io/en/latest/development/release-process/#python-2-support" target="_blank">https://pip.pypa.io/en/latest/development/release-process/#python-2-support</a>" target="_blank"><a href="https://pip.pypa.io/en/latest/development/release-process/#python-2-support</a" target="_blank">https://pip.pypa.io/en/latest/development/release-process/#python-2-support</a</a>>
    WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7fe8bbca3910>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/pytest-runner/
    WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7fe8bbca3110>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/pytest-runner/
    WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7fe8bbca3090>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/pytest-runner/
    WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7fe8bbca9e50>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/pytest-runner/
    WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<pip._vendor.urllib3.connection.VerifiedHTTPSConnection object at 0x7fe8bbca9e90>, 'Connection to pypi.org timed out. (connect timeout=15)')': /simple/pytest-runner/
    ERROR: Could not find a version that satisfies the requirement pytest-runner (from versions: none)
    ERROR: No matching distribution found for pytest-runner
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-rK2Qbk/kudu-python/setup.py", line 216, in <module>
        test_suite="kudu.tests"
      File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 144, in setup
        _install_setup_requires(attrs)
      File "/usr/lib/python2.7/site-packages/setuptools/__init__.py", line 139, in _install_setup_requires
        dist.fetch_build_eggs(dist.setup_requires)
      File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 721, in fetch_build_eggs
        replace_conflicting=True,
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 782, in resolve
        replace_conflicting=replace_conflicting
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1065, in best_match
        return self.obtain(req, installer)
      File "/usr/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1077, in obtain
        return installer(requirement)
      File "/usr/lib/python2.7/site-packages/setuptools/dist.py", line 777, in fetch_build_egg
        return fetch_build_egg(self, req)
      File "/usr/lib/python2.7/site-packages/setuptools/installer.py", line 121, in fetch_build_egg
        raise DistutilsError(str(e))
    distutils.errors.DistutilsError: Command '['/usr/bin/python', '-m', 'pip', '--disable-pip-version-check', 'wheel', '--no-deps', '-w', '/tmp/tmp8eqkO2', '--quiet', 'pytest-runner']' returned non-zero exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

I'd like to ask for some guidance, what to try next?

 

 

Thank you!

 

Labels