Created on 12-19-2018 01:34 PM - edited 09-16-2022 01:45 AM
Hey folks,
I was recently running some basic benchmarks and tried to use the classic TestDFSIO suite, but I got stuck on the following error:
main : run as user is hdfs
main : requested yarn user is hdfs
Requested user hdfs is banned
The default output directory of TestDFSIO is /benchmarks/TestDFSIO, and it is created automatically with the following ownership and permissions:
inode="/benchmarks/TestDFSIO/io_control/in_file_test_io_0":hdfs:hdfs:drwxr-xr-x
Since only the owner (hdfs) can write there, using the hdfs user seemed mandatory. At least that's what I thought...
After looking around some configuration files, I found the banned.users property in:
/etc/hadoop/conf/container-executor.cfg

#/*
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */
yarn.nodemanager.local-dirs=/hadoop/yarn/local
yarn.nodemanager.log-dirs=/hadoop/yarn/log
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
As we can see, banned.users is set to hdfs,yarn,mapred,bin. This value comes from the Ambari Jinja2 (J2) template:
/var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2

yarn.nodemanager.local-dirs={{nm_local_dirs}}
yarn.nodemanager.log-dirs={{nm_log_dirs}}
yarn.nodemanager.linux-container-executor.group={{yarn_executor_container_group}}
banned.users=hdfs,yarn,mapred,bin
min.user.id={{min_user_id}}
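Before submitting a job, it can be handy to check whether a given user is on that list. Here is a minimal sketch; is_banned_user is a hypothetical helper name (not part of Hadoop or Ambari), and it assumes the file contains a banned.users= line with a comma-separated list and no spaces, as in the file shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Hadoop): is_banned_user <cfg_file> <user>
# Returns 0 if <user> is in the comma-separated banned.users list, 1 otherwise.
# Assumption: the list is written without spaces, e.g. banned.users=hdfs,yarn,mapred,bin
is_banned_user() {
  local cfg_file="$1" user="$2" banned
  # Extract the value of the banned.users property (last occurrence if repeated).
  banned="$(sed -n 's/^banned\.users=//p' "$cfg_file" | tail -n 1)"
  # Wrap both sides in commas so that only whole user names match.
  case ",${banned}," in
    *",${user},"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_banned_user /etc/hadoop/conf/container-executor.cfg hdfs && echo "hdfs is banned" would print the message on the cluster described here.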
So, at this point, there are two ways to solve our original issue.
The first: remove hdfs from banned.users (not recommended).
The second: find a way to change the base directory of TestDFSIO. And this is what we are going to do.
If we look closely at the usage message of TestDFSIO, there is no obvious option to change the base directory; the default /benchmarks/TestDFSIO appears to be hardcoded in the jar itself.
And it WAS!
The ability to change the TestDFSIO output directory was requested in MAPREDUCE-1614 and incorporated in MAPREDUCE-1832. So now it is possible to use:
-Dtest.build.data=/path/of_output_dir
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -Dtest.build.data=/user/ambari-qa/TestDFSIO -write -nrFiles 10 -fileSize 1000 -resFile /root/dfsio_result.log

18/12/18 13:49:59 INFO fs.TestDFSIO: TestDFSIO.1.8
18/12/18 13:49:59 INFO fs.TestDFSIO: nrFiles = 10
18/12/18 13:49:59 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
18/12/18 13:49:59 INFO fs.TestDFSIO: bufferSize = 1000000
18/12/18 13:49:59 INFO fs.TestDFSIO: baseDir = /user/ambari-qa/TestDFSIO
18/12/18 13:50:00 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
18/12/18 13:50:01 INFO fs.TestDFSIO: created control files for: 10 files
...
18/12/18 13:50:02 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: X.X.X.X:8020, Ident: (HDFS_DELEGATION_TOKEN token 33 for ambari-qa)
...
18/12/18 13:50:36 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
18/12/18 13:50:36 INFO fs.TestDFSIO: Date & time: Tue Dec 18 13:50:36 UTC 2018
18/12/18 13:50:36 INFO fs.TestDFSIO: Number of files: 10
18/12/18 13:50:36 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
18/12/18 13:50:36 INFO fs.TestDFSIO: Throughput mb/sec: 145.73223159766246
18/12/18 13:50:36 INFO fs.TestDFSIO: Average IO rate mb/sec: 153.26971435546875
18/12/18 13:50:36 INFO fs.TestDFSIO: IO rate std deviation: 39.996241684601024
18/12/18 13:50:36 INFO fs.TestDFSIO: Test exec time sec: 35.042
18/12/18 13:50:36 INFO fs.TestDFSIO:
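If you want to compare several runs, pulling the throughput figure out of the output saves some scrolling. The sketch below uses a hypothetical helper name, dfsio_throughput, and assumes the file you pass it (a captured log or the -resFile output) contains a "Throughput mb/sec:" line in the same form as the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: dfsio_throughput <file>
# Prints the value that follows "Throughput mb/sec:" on every matching line.
# Assumption: lines look like "... Throughput mb/sec: 145.73", as in the run above.
dfsio_throughput() {
  # Drop everything up to and including the label, keep the number.
  sed -n 's/.*Throughput mb\/sec:[[:space:]]*//p' "$1"
}
```

For example, dfsio_throughput /root/dfsio_result.log should print one value per recorded run, provided the result file follows that format.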
Basic actions, like benchmarks, should not require changing the default configuration of your cluster. Always try to tune your basic actions to fit your cluster rather than the opposite.
Created on 12-06-2019 12:09 PM
The following always worked for me:
kinit -kt hdfs.keytab hdfs
hadoop fs -mkdir /benchmarks
hadoop fs -chmod 0777 /benchmarks
You can always lock down the directory permissions to only allow a certain group to write to this directory.