Member since 09-21-2018 | Posts: 6 | Kudos Received: 3 | Solutions: 0
01-25-2019
01:14 AM
So, having now installed an HDP 3.1.0 cluster on Ubuntu, I see that the libhdfs0 package still doesn't install an actual .so file for libhdfs:
root@use1-monstage-1:~# find /usr -name libhdfs*
/usr/share/doc/libhdfs0-3-1-0-0-78
/usr/lib/ams-hbase/lib/hadoop-native/libhdfs.so
/usr/lib/ams-hbase/lib/hadoop-native/libhdfs.a
/usr/hdp/3.1.0.0-78/hadoop/lib/native/libhdfs.a
/usr/hdp/3.1.0.0-78/usr/lib/libhdfs.so
root@use1-monstage-1:~# ls -larth /usr/hdp/3.1.0.0-78/usr/lib/libhdfs.so
lrwxrwxrwx 1 root root 16 Dec 6 13:58 /usr/hdp/3.1.0.0-78/usr/lib/libhdfs.so -> libhdfs.so.0.0.0
root@use1-monstage-1:~# ls -larth /usr/lib/ams-hbase/lib/hadoop-native/libhdfs.so
-rwxr-xr-x 1 root root 292K Dec 7 18:13 /usr/lib/ams-hbase/lib/hadoop-native/libhdfs.so
The ams-hbase package seems to have an actual libhdfs.so file, but trying to link against that doesn't work correctly.
Is there any chance we'll get libhdfs properly packaged for HDP?
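For what it's worth, here is roughly how I have been trying to point pyarrow at that ams-hbase copy (just a sketch; ARROW_LIBHDFS_DIR is pyarrow's environment override for where to look for libhdfs.so, and the paths are the ones shown above plus the JDK path from my earlier posts):
import os

# Point pyarrow at the ams-hbase copy of libhdfs.so instead of the dangling
# /usr/hdp/3.1.0.0-78/usr/lib/libhdfs.so symlink (sketch only; this load still
# doesn't behave correctly for me).
os.environ["ARROW_LIBHDFS_DIR"] = "/usr/lib/ams-hbase/lib/hadoop-native"
os.environ["HADOOP_HOME"] = "/usr/hdp/current/hadoop-client"
os.environ["JAVA_HOME"] = "/usr/jdk64/jdk1.8.0_112/"

import pyarrow as pa  # pyarrow reads ARROW_LIBHDFS_DIR when a connection is made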
01-07-2019
06:16 PM
2 Kudos
So a few months ago I posted this question: https://community.hortonworks.com/questions/222259/libhdfs-missing.html But I realized I did a terrible job posting the question, and consequently, I didn't really get the help I was hoping for. So attempt #2:
I am running HDP 3.0.1 on Ubuntu 16.04 nodes. We initially installed 3.0.0, and successfully upgraded later. The cluster itself is running fine and Ambari is a great administration tool. However... libhdfs.so.0.0.0 does not exist in our cluster.
root@use1-mon-flink-1:~# dpkg -l | grep libhdfs
ii libhdfs0 3.1.1.3.0.1.0-187 all libhdfs0 is a virtual package that brings libhdfs0-3-0-1-0-187 as a dependency.
ii libhdfs0-3-0-1-0-187 3.1.1.3.0.1.0-187 amd64 Hadoop Filesystem Library
root@use1-mon-flink-1:~# ls -larth /usr/hdp/3.0.1.0-187/usr/lib/
total 8.0K
lrwxrwxrwx 1 root root 16 Sep 19 11:31 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Oct 11 14:32 ..
drwxr-xr-x 2 root root 4.0K Oct 11 14:32 .
root@use1-mon-flink-1:~# find /usr -name "libhdfs.so.0.0.0"
root@use1-mon-flink-1:~#
root@use1-mon-flink-1:~# apt search libhdfs
Sorting... Done
Full Text Search... Done
libhdfs0/unknown,now 3.1.1.3.0.1.0-187 all [installed]
libhdfs0 is a virtual package that brings libhdfs0-3-0-1-0-187 as a dependency.
libhdfs0-3-0-1-0-187/unknown,now 3.1.1.3.0.1.0-187 amd64 [installed]
Hadoop Filesystem Library
root@use1-mon-flink-1:~# apt install --reinstall libhdfs*
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'libhdfs0' for glob 'libhdfs*'
Note, selecting 'libhdfs0-3-0-1-0-187' for glob 'libhdfs*'
0 upgraded, 0 newly installed, 2 reinstalled, 0 to remove and 2 not upgraded.
Need to get 1002 B/2412 B of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://public-repo-1.hortonworks.com/HDP/ubuntu16/3.x/updates/3.0.1.0 HDP/main amd64 libhdfs0 all 3.1.1.3.0.1.0-187 [1002 B]
Fetched 1002 B in 0s (6586 B/s)
[master 90d1838] saving uncommitted changes in /etc prior to apt run
3 files changed, 57 insertions(+), 2 deletions(-)
rewrite oozie/3.0.1.0-187/0/oozie-site.jceks (64%)
(Reading database ... 132397 files and directories currently installed.)
Preparing to unpack .../libhdfs0_3.1.1.3.0.1.0-187_all.deb ...
Unpacking libhdfs0 (3.1.1.3.0.1.0-187) over (3.1.1.3.0.1.0-187) ...
Preparing to unpack .../libhdfs0-3-0-1-0-187_3.1.1.3.0.1.0-187_amd64.deb ...
Unpacking libhdfs0-3-0-1-0-187 (3.1.1.3.0.1.0-187) over (3.1.1.3.0.1.0-187) ...
Setting up libhdfs0-3-0-1-0-187 (3.1.1.3.0.1.0-187) ...
Setting up libhdfs0 (3.1.1.3.0.1.0-187) ...
root@use1-mon-flink-1:~# ls -larth /usr/hdp/3.0.1.0-187/usr/lib/
total 8.0K
lrwxrwxrwx 1 root root 16 Sep 19 11:31 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Oct 11 14:32 ..
drwxr-xr-x 2 root root 4.0K Jan 7 15:00 .
root@use1-mon-flink-1:~#
As you can see, even installing the Hortonworks package that should install libhdfs doesn't install the actual .so file, just a dead symlink. Has anyone else experienced this problem?
Note: I can also reproduce this problem on a fresh Ubuntu 16.04 VM. The initial install of libhdfs0 and libhdfs0-3-0-1-0-187 doesn't install an actual .so file either.
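A quick way to confirm that this is a packaging problem rather than a broken install is to list what the packages actually ship (a sketch in the same Python style as my other examples; dpkg -L simply prints the package manifest). If libhdfs.so.0.0.0 never appears in the manifest, no amount of reinstalling will create it:
import subprocess

# Print every path the HDP libhdfs packages claim to install.
for pkg in ["libhdfs0", "libhdfs0-3-0-1-0-187"]:
    print("==", pkg, "==")
    print(subprocess.check_output(["dpkg", "-L", pkg]).decode("utf-8"))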
Labels: Hortonworks Data Platform (HDP)
09-24-2018
03:48 PM
Appreciate the suggestion, but I did actually try that:
root@use1-hadoop-5:~/ingest_hive# ls -larth /usr/hdp/3.0.0.0-1634/usr/lib/
total 8.0K
lrwxrwxrwx 1 datto datto 16 Aug 2 04:31 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Sep 21 19:00 ..
drwxr-xr-x 2 root root 4.0K Sep 24 14:06 .
root@use1-hadoop-5:~/ingest_hive# dpkg -l | grep libhdfs
ii libhdfs0 3.1.0.3.0.0.0-1634 all libhdfs0 is a virtual package that brings libhdfs0-3-0-0-0-1634 as a dependency.
ii libhdfs0-3-0-0-0-1634 3.1.0.3.0.0.0-1634 amd64 Hadoop Filesystem Library
root@use1-hadoop-5:~/ingest_hive# apt-get install --reinstall libhdfs0 libhdfs0-3-0-0-0-1634
Reading package lists... Done
Building dependency tree
Reading state information... Done
0 upgraded, 0 newly installed, 2 reinstalled, 0 to remove and 21 not upgraded.
Need to get 2,416 B of archives.
After this operation, 0 B of additional disk space will be used.
Get:1 http://public-repo-1.hortonworks.com/HDP/ubuntu16/3.x/updates/3.0.0.0 HDP/main amd64 libhdfs0 all 3.1.0.3.0.0.0-1634 [1,006 B]
Get:2 http://public-repo-1.hortonworks.com/HDP/ubuntu16/3.x/updates/3.0.0.0 HDP/main amd64 libhdfs0-3-0-0-0-1634 amd64 3.1.0.3.0.0.0-1634 [1,410 B]
Fetched 2,416 B in 0s (15.1 kB/s)
[master 70a37c9] saving uncommitted changes in /etc prior to apt run
3 files changed, 1 insertion(+), 1 deletion(-)
rewrite hive/3.0.0.0-1634/0/hive-site.jceks (63%)
rewrite oozie/3.0.0.0-1634/0/oozie-site.jceks (64%)
(Reading database ... 152223 files and directories currently installed.)
Preparing to unpack .../libhdfs0_3.1.0.3.0.0.0-1634_all.deb ...
Unpacking libhdfs0 (3.1.0.3.0.0.0-1634) over (3.1.0.3.0.0.0-1634) ...
Preparing to unpack .../libhdfs0-3-0-0-0-1634_3.1.0.3.0.0.0-1634_amd64.deb ...
Unpacking libhdfs0-3-0-0-0-1634 (3.1.0.3.0.0.0-1634) over (3.1.0.3.0.0.0-1634) ...
Setting up libhdfs0-3-0-0-0-1634 (3.1.0.3.0.0.0-1634) ...
Setting up libhdfs0 (3.1.0.3.0.0.0-1634) ...
root@use1-hadoop-5:~/ingest_hive# ls -larth /usr/hdp/3.0.0.0-1634/usr/lib/
total 8.0K
lrwxrwxrwx 1 root root 16 Jul 12 21:06 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Sep 21 19:00 ..
drwxr-xr-x 2 root root 4.0K Sep 24 14:06 .
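The same thing can be checked against the .deb itself rather than the installed system (again just a sketch: apt-get download fetches the package into the current directory and dpkg-deb -c lists its contents). The apt output above also shows the libhdfs0-3-0-0-0-1634 archive is only 1,410 bytes, far too small to contain the ~292 KB libhdfs.so.0.0.0 that a real Hadoop build ships:
import subprocess

# Download the HDP package and list what is actually inside the archive.
subprocess.check_call(["apt-get", "download", "libhdfs0-3-0-0-0-1634"])
print(subprocess.check_output(
    ["sh", "-c", "dpkg-deb -c libhdfs0-3-0-0-0-1634_*.deb"]).decode("utf-8"))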
09-23-2018
01:47 PM
Turns out you can literally copy the file into place from a binary Hadoop build and fix that error. Unfortunately, after copying the file into place I seem to get a new error:
>>> import os
>>> os.environ["HADOOP_HOME"] = "/usr/hdp/current/hadoop-client"
>>> os.environ["JAVA_HOME"] = "/usr/jdk64/jdk1.8.0_112/"
>>> import subprocess
>>> classpath = subprocess.Popen(["/usr/hdp/current/hadoop-client/bin/hdfs", "classpath", "--glob"], stdout=subprocess.PIPE).communicate()[0]
>>> os.environ["CLASSPATH"] = classpath.decode("utf-8")
>>> import pyarrow as pa
>>> fs = pa.hdfs.connect("use1-hadoop-namenode-1.datto.lan", 50070, user="hdfs")
18/09/21 20:03:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/09/21 20:03:26 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
>>> fs.df()
18/09/21 20:03:34 WARN net.NetUtils: Unable to wrap exception of type class org.apache.hadoop.ipc.RpcException: it has no (String) constructor
java.lang.NoSuchMethodException: org.apache.hadoop.ipc.RpcException.<init>(java.lang.String)
at java.lang.Class.getConstructor0(Class.java:3082)
at java.lang.Class.getConstructor(Class.java:1825)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:830)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1503)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.getFsStats(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:705)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy11.getStats(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getStateByIndex(DFSClient.java:1921)
at org.apache.hadoop.hdfs.DFSClient.getDiskStatus(DFSClient.java:1930)
at org.apache.hadoop.hdfs.DistributedFileSystem.getStatus(DistributedFileSystem.java:1373)
at org.apache.hadoop.fs.FileSystem.getStatus(FileSystem.java:2803)
hdfsGetCapacity: FileSystem#getStatus error:
RpcException: RPC response exceeds maximum data length
java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "use1-hadoop-5/10.40.80.91"; destination host is: "use1-hadoop-namenode-1.datto.lan":50070;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:816)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1503)
at org.apache.hadoop.ipc.Client.call(Client.java:1445)
at org.apache.hadoop.ipc.Client.call(Client.java:1355)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy10.getFsStats(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getStats(ClientNamenodeProtocolTranslatorPB.java:705)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pyarrow/io-hdfs.pxi", line 194, in pyarrow.lib.HadoopFileSystem.df
File "pyarrow/io-hdfs.pxi", line 170, in pyarrow.lib.HadoopFileSystem.get_capacity
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: HDFS GetCapacity failed, errno: 255 (Unknown error 255)
I currently have the IPC size set to around 1GB and still get this error.
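One thing I still need to rule out (this is my assumption, not something confirmed above): 50070 is the NameNode web UI port rather than the client RPC port, and pointing an RPC client at an HTTP port is a known way to get "RPC response exceeds maximum data length". A minimal sketch of the same connect against the RPC port, assuming the usual fs.defaultFS default of 8020 (the real value is whatever core-site.xml says):
import pyarrow as pa

# Sketch: connect to the NameNode RPC port (commonly 8020) instead of 50070.
fs = pa.hdfs.connect("use1-hadoop-namenode-1.datto.lan", 8020, user="hdfs")
print(fs.df())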
09-23-2018
01:47 PM
As a note, literally copying libhdfs.so from a Hadoop distribution into the mentioned folder fixes this problem... That is:
1. Download the binary tarball from http://apache.claz.org/hadoop/common/hadoop-3.1.1/
2. Untar it
3. rsync -aP <folder>/lib/native/libhdfs.so* use1-hadoop-5:/usr/hdp/3.0.0.0-1634/usr/lib/
4. Profit!
root@use1-hadoop-5:~/compact# ls -larth /usr/hdp/3.0.0.0-1634/usr/lib/
total 300K
-rwxr-xr-x 1 datto datto 291K Aug 2 04:31 libhdfs.so.0.0.0
lrwxrwxrwx 1 datto datto 16 Aug 2 04:31 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Sep 21 19:00 ..
drwxr-xr-x 2 root root 4.0K Sep 21 19:30 .
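A quick sanity check that the copied library is loadable at all, independent of pyarrow (a sketch: libhdfs normally links against libjvm.so, so the JVM library is loaded first with RTLD_GLOBAL; the libjvm path assumes the jdk1.8.0_112 layout from my other posts):
import ctypes

# Load the JVM first, then the freshly copied libhdfs.
ctypes.CDLL("/usr/jdk64/jdk1.8.0_112/jre/lib/amd64/server/libjvm.so",
            mode=ctypes.RTLD_GLOBAL)
ctypes.CDLL("/usr/hdp/3.0.0.0-1634/usr/lib/libhdfs.so")
print("libhdfs.so loaded OK")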
09-23-2018
01:47 PM
1 Kudo
I'm currently using Hortonworks 3.0.0.0-1634 (installed about two weeks ago). The system itself is great, but I can't seem to get libhdfs loaded into pyarrow, which makes ingestion difficult. The libhdfs0 package is installed on the systems, but when I try to actually find the .so file, it is a broken symlink:
root@use1-hadoop-5:~/compact# ls -larth /usr/hdp/3.0.0.0-1634/usr/lib/
total 8.0K
lrwxrwxrwx 1 root root 16 Jul 12 21:06 libhdfs.so -> libhdfs.so.0.0.0
drwxr-xr-x 4 root root 4.0K Sep 21 19:00 ..
drwxr-xr-x 2 root root 4.0K Sep 21 19:01 .
Am I missing something here? Example failure:
root@use1-hadoop-5:~/compact# python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ["HADOOP_HOME"] = "/usr/hdp/current/hadoop-client"
>>> os.environ["JAVA_HOME"] = "/usr/jdk64/jdk1.8.0_112/"
>>> import subprocess
>>> classpath = subprocess.Popen(["/usr/hdp/current/hadoop-client/bin/hdfs", "classpath", "--glob"], stdout=subprocess.PIPE).communicate()[0]
>>> os.environ["CLASSPATH"] = classpath.decode("utf-8")
>>> import pyarrow as pa
>>> fs = pa.hdfs.connect("use1-hadoop-namenode-1.datto.lan", 50070, user="hdfs")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.5/dist-packages/pyarrow/hdfs.py", line 183, in connect
extra_conf=extra_conf)
File "/usr/local/lib/python3.5/dist-packages/pyarrow/hdfs.py", line 37, in __init__
self._connect(host, port, user, kerb_ticket, driver, extra_conf)
File "pyarrow/io-hdfs.pxi", line 89, in pyarrow.lib.HadoopFileSystem._connect
File "pyarrow/error.pxi", line 83, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Unable to load libhdfs
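Stripped of pyarrow, the failure really does come down to the dangling symlink (a sketch of the check):
import os

# The package installs the symlink but not its target, so it dangles.
link = "/usr/hdp/3.0.0.0-1634/usr/lib/libhdfs.so"
print(os.path.islink(link))   # True: the symlink itself is installed
print(os.path.exists(link))   # False: libhdfs.so.0.0.0 is missing
print(os.readlink(link))      # libhdfs.so.0.0.0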