Created 06-20-2017 05:00 AM
I have deployed a cluster using Ambari 2.5 and HDP 2.6. This is my first time trying this and I must have made a fundamental mistake somewhere along the way.
Anyway, I can't write anything to HDFS. I was able to create the directory /user/hdfs, but this is what happens when I try to copy a file to it (I am logged in as user hdfs on a machine running the name node and a data node):
```
[hdfs@mds-hdp-01 0]$ hadoop fs -ls /user
Found 9 items
drwx------   - accumulo  hdfs  0 2017-06-14 04:44 /user/accumulo
drwxrwx---   - ambari-qa hdfs  0 2017-06-14 04:48 /user/ambari-qa
drwxr-xr-x   - hbase     hdfs  0 2017-06-14 04:42 /user/hbase
drwxr-xr-x   - hcat      hdfs  0 2017-06-14 04:45 /user/hcat
drwxr-xr-x   - hdfs      hdfs  0 2017-06-20 00:48 /user/hdfs
drwxr-xr-x   - hive      hdfs  0 2017-06-14 04:45 /user/hive
drwxrwxr-x   - oozie     hdfs  0 2017-06-14 04:46 /user/oozie
drwxrwxr-x   - spark     hdfs  0 2017-06-14 04:43 /user/spark
drwxr-xr-x   - zeppelin  hdfs  0 2017-06-14 04:43 /user/zeppelin
[hdfs@mds-hdp-01 0]$ hadoop fs -ls /user/hdfs
[hdfs@mds-hdp-01 0]$ cd
[hdfs@mds-hdp-01 ~]$ hadoop fs -put foo.txt /user/hdfs/
17/06/20 03:41:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got access token error, status message , ack with firstBadLink as 118.138.237.114:50010
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1478)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1380)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:558)
17/06/20 03:41:36 INFO hdfs.DFSClient: Abandoning BP-23056860-118.138.237.114-1497415333069:blk_1073756578_15770
17/06/20 03:41:36 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[118.138.237.114:50010,DS-d409f5ad-e087-4597-b9b4-d356d005a9de,DISK]
17/06/20 03:41:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got access token error, status message , ack with firstBadLink as 118.138.237.168:50010
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1478)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1380)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:558)
17/06/20 03:41:36 INFO hdfs.DFSClient: Abandoning BP-23056860-118.138.237.114-1497415333069:blk_1073756579_15771
17/06/20 03:41:36 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[118.138.237.168:50010,DS-6cd39438-2e05-4669-ac31-b77c408b1efa,DISK]
17/06/20 03:41:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got access token error, status message , ack with firstBadLink as 118.138.237.115:50010
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1478)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1380)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:558)
17/06/20 03:41:36 INFO hdfs.DFSClient: Abandoning BP-23056860-118.138.237.114-1497415333069:blk_1073756580_15772
17/06/20 03:41:36 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[118.138.237.115:50010,DS-663bfe98-7250-423e-925e-c5144eedd9ba,DISK]
17/06/20 03:41:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
org.apache.hadoop.hdfs.security.token.block.InvalidBlockTokenException: Got access token error, status message , ack with firstBadLink as 118.138.237.116:50010
        at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:134)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1478)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1380)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:558)
17/06/20 03:41:36 INFO hdfs.DFSClient: Abandoning BP-23056860-118.138.237.114-1497415333069:blk_1073756581_15773
17/06/20 03:41:36 INFO hdfs.DFSClient: Excluding datanode DatanodeInfoWithStorage[118.138.237.116:50010,DS-85446345-b1e6-455f-bbee-9406d4e5ef69,DISK]
17/06/20 03:41:36 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1393)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:558)
17/06/20 03:41:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hdfs/foo.txt._COPYING_" - Aborting...
put: Got access token error, status message , ack with firstBadLink as 118.138.237.116:50010
```
Trying to upload a file via the Ambari Web UI fails with server status 500 and the message
Unauthorized connection for super-user: root from IP 118.138.237.168
This also puzzles me because the IP address in the error message belongs to one of the data nodes, not the Ambari server.
Any hints where I may have misconfigured something?
Created 06-20-2017 05:05 AM
Which version of Ambari Server are you using?
If you want to use the Ambari Files View, then you will need to perform a few additional setup steps first, like the following (Proxy User Setup):
1. If you are running your Ambari server as the root user, then you will need to add the following properties under Ambari UI --> Services --> HDFS --> Configs --> Advanced tab --> Custom core-site:
```
hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*
```
Instead of setting "hadoop.proxyuser.root.hosts" to "*", you can also give a comma-separated list of the hosts/addresses where the Files View will be running (a host-restricted sketch is shown below the error message). This is needed because, for the Files View to access HDFS, the Ambari Server daemon hosting the view has to act as a proxy user for HDFS; it allows Ambari to submit requests to HDFS on behalf of the users using the Files View.
Based on your error, we can see that you are running your Ambari server as the "root" user, but the IP address from which the Files View request is made is not allowed to be proxied. That is why you are getting this error:
Unauthorized connection for super-user: root from IP 118.138.237.168
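As a hedged example (the hostname below is only a placeholder, not taken from your cluster), restricting the proxy to the Ambari host would look roughly like this in Custom core-site:

```
# ambari-server.example.com is a placeholder: replace it with the host that runs
# the Ambari server / Files View; multiple hosts can be given comma-separated.
hadoop.proxyuser.root.hosts=ambari-server.example.com
hadoop.proxyuser.root.groups=*
```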
2. If you have logged in to the Ambari Files View as the "admin" user, then you will need to make sure that you have created a home directory for that user in HDFS, as shown below. (If you log in to the Files View as some other user, you will need to create that user's directory in HDFS in advance.) Example, with a quick verification sketched after step 3:
```
su - hdfs
hdfs dfs -mkdir /user/admin
hdfs dfs -chown admin:hadoop /user/admin
```
3. Now you should be able to log in and perform various operations in the Ambari Files View without any issue.
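To double-check step 2, a listing like the following should show the new directory owned by admin:hadoop (this is just the check I would expect to work, not output captured from your cluster):

```
# Confirm the new home directory exists and is owned by admin:hadoop
hdfs dfs -ls /user | grep '/user/admin'
```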
Created 06-20-2017 06:01 AM
Thanks for the quick answer.
The proxy user is already set up as you indicate and I can use the file view to browse HDFS.
My main problem seems to be somewhere else: As I indicated in the first part of my question I can't put files on HDFS from the command line using hadoop fs -put (or hdfs dfs -put, same thing).
Created 06-20-2017 06:05 AM
Is "118.138.237.168" the address of your Ambari server? And are you accessing the Ambari Files View directly, or via some web server in front of it?
Created 06-20-2017 06:18 AM
The IP address in the error message is not that of the Ambari server, and it also varies. I am accessing the Files View by pointing my browser to https://ambari-server and I don't think there is anything in between.
What I'm trying to say is that I don't care so much about the Ambari Files View; I seem to have a fundamental HDFS configuration problem, since I'm not able to upload any data to my HDFS at all.
Created 06-21-2017 02:57 PM
It turns out it was a network configuration error. In `/etc/hosts` I originally had the short name for each machine first, followed by the fully qualified domain name. Apparently the order matters: after reversing it, the access token issue appears to be gone. Example line:

```
118.138.237.114 mds-hdp-01.erc.monash.edu.au mds-hdp-01.erc.monash.edu mds-hdp-01
```

(Before that, I had the short name mds-hdp-01 first.)
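In case it helps anyone hitting the same symptom, a rough way to check the resolution order on each node (using my own node name here; substitute yours) is:

```
# Run on each cluster node after editing /etc/hosts
hostname -f              # should print the fully qualified name, e.g. mds-hdp-01.erc.monash.edu.au
getent hosts mds-hdp-01  # the first (canonical) name returned should be the FQDN
```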