libhdfs missing

I'm currently using Hortonworks 3.0.0.0-1634 (installed about two weeks ago). The system itself is great, but I can't seem to get libhdfs loaded into pyarrow, which makes ingestion difficult.

The libhdfs0 package is installed on the systems, but when I look for the .so file, it turns out to be a broken symlink whose target, libhdfs.so.0.0.0, does not exist:

root@use1-hadoop-5:~/compact# ls -larth /usr/hdp/3.0.0.0-1634/usr/lib/
total 8.0K
lrwxrwxrwx 1 root root   16 Jul 12 21:06 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Sep 21 19:00 ..
drwxr-xr-x 2 root root 4.0K Sep 21 19:01 .

Am I missing something here?
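
A symlink in that state can be confirmed from Python as well as from the shell. The following is a minimal sketch, not part of the original session: the helper name `is_dangling_symlink` is hypothetical, and the path is copied from the listing above.

```python
import os


def is_dangling_symlink(path):
    """Return True if path is a symlink whose target does not exist."""
    # os.path.islink inspects the link itself, while os.path.exists
    # follows the link and reports False when the target is missing.
    return os.path.islink(path) and not os.path.exists(path)


# Path taken from the directory listing above.
print(is_dangling_symlink("/usr/hdp/3.0.0.0-1634/usr/lib/libhdfs.so"))
```

If this reports a dangling link, the shared library itself is absent from that directory, so nothing that loads it through this path can succeed.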

Example failure:

root@use1-hadoop-5:~/compact# python3 
Python 3.5.2 (default, Nov 23 2017, 16:37:01) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ["HADOOP_HOME"] = "/usr/hdp/current/hadoop-client"
>>> os.environ["JAVA_HOME"] = "/usr/jdk64/jdk1.8.0_112/"
>>> import subprocess
>>> classpath = subprocess.Popen(["/usr/hdp/current/hadoop-client/bin/hdfs", "classpath", "--glob"], stdout=subprocess.PIPE).communicate()[0]
>>> os.environ["CLASSPATH"] = classpath.decode("utf-8")
>>> import pyarrow as pa
>>> fs = pa.hdfs.connect("use1-hadoop-namenode-1.datto.lan", 50070, user="hdfs")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pyarrow/hdfs.py", line 183, in connect
    extra_conf=extra_conf)
  File "/usr/local/lib/python3.5/dist-packages/pyarrow/hdfs.py", line 37, in __init__
    self._connect(host, port, user, kerb_ticket, driver, extra_conf)
  File "pyarrow/io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
  File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libhdfs

10 REPLIES

On my version (6.3.3), libhdfs.so is not in CDH/lib/hadoop/lib, where it gets looked for, but out in CDH/lib64 for some reason:

cloudera/parcels/CDH/lib64/libhdfs.so

A symlink from hadoop/native out to lib64 would solve it.
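
An alternative to the symlink may be to point pyarrow at whichever directory actually holds the library, using the `ARROW_LIBHDFS_DIR` environment variable that pyarrow documents for a libhdfs.so installed outside `$HADOOP_HOME/lib/native`. The sketch below is only an illustration: the helper `find_libhdfs_dir` is hypothetical, and the candidate directories are assumptions based on the paths mentioned in this thread, so replace them with the full paths on your own system (for example, the parcel's lib64 directory).

```python
import os


def find_libhdfs_dir(candidate_dirs, name="libhdfs.so"):
    """Return the first directory holding a resolvable libhdfs.so, else None."""
    for directory in candidate_dirs:
        # os.path.isfile follows symlinks, so a dangling link is skipped.
        if os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


# Assumed candidate locations; adjust these for your installation.
candidates = [
    "/usr/hdp/current/hadoop-client/lib/native",
    "/usr/hdp/3.0.0.0-1634/usr/lib",
]

libhdfs_dir = find_libhdfs_dir(candidates)
if libhdfs_dir is not None:
    # Set this before calling pa.hdfs.connect so pyarrow can see it.
    os.environ["ARROW_LIBHDFS_DIR"] = libhdfs_dir
```

Setting the variable avoids modifying the vendor-managed directory tree, although it only helps if a real libhdfs.so exists somewhere on the machine.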