Created on 12-19-2018 01:34 PM - edited 09-16-2022 01:45 AM
Hey folks,
I was recently running some basic benchmarks and tried to use our classic TestDFSIO suite, but I found myself stuck with the following error:
main : run as user is hdfs
main : requested yarn user is hdfs
Requested user hdfs is banned
The default TestDFSIO output dir is /benchmarks/TestDFSIO, and it is automatically created with the following permissions:
inode="/benchmarks/TestDFSIO/io_control/in_file_test_io_0":hdfs:hdfs:drwxr-xr-x
With those permissions, using the hdfs user is mandatory. At least that's what I thought...
Why is the hdfs user banned?
After looking around some configuration files, I found the banned.users property in:
/etc/hadoop/conf/container-executor.cfg
#/*
# * Licensed to the Apache Software Foundation (ASF) under one
# * or more contributor license agreements. See the NOTICE file
# * distributed with this work for additional information
# * regarding copyright ownership. The ASF licenses this file
# * to you under the Apache License, Version 2.0 (the
# * "License"); you may not use this file except in compliance
# * with the License. You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */
yarn.nodemanager.local-dirs=/hadoop/yarn/local
yarn.nodemanager.log-dirs=/hadoop/yarn/log
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
As we can see, banned.users is populated with hdfs,yarn,mapred,bin. This value is inherited from the J2 template file:
/var/lib/ambari-server/resources/common-services/YARN/2.1.0.2.0/package/templates/container-executor.cfg.j2
yarn.nodemanager.local-dirs={{nm_local_dirs}}
yarn.nodemanager.log-dirs={{nm_log_dirs}}
yarn.nodemanager.linux-container-executor.group={{yarn_executor_container_group}}
banned.users=hdfs,yarn,mapred,bin
min.user.id={{min_user_id}}
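Before submitting a job, it can be handy to check whether a given user is on that list. The following is only a minimal sketch, not an official tool: the is_banned helper is a hypothetical name, and the default path /etc/hadoop/conf/container-executor.cfg is an assumption about where the rendered file lives on your NodeManager hosts.

```shell
#!/bin/sh
# Hypothetical helper: is USER listed in banned.users of a container-executor.cfg?
# Returns 0 if banned, 1 if not banned, 2 if the file cannot be read.
is_banned() {
  user="$1"
  # Assumed default location of the rendered config file.
  cfg="${2:-/etc/hadoop/conf/container-executor.cfg}"
  [ -r "$cfg" ] || return 2
  # Keep the value after "banned.users=" (commented lines do not match).
  banned="$(grep -E '^[[:space:]]*banned\.users[[:space:]]*=' "$cfg" \
    | tail -n 1 | cut -d= -f2- | tr -d '[:space:]')"
  # Wrap the list in commas so only whole user names match.
  case ",${banned}," in
    *",${user},"*) return 0 ;;
    *) return 1 ;;
  esac
}

CFG="${CFG:-/etc/hadoop/conf/container-executor.cfg}"
for u in hdfs ambari-qa; do
  rc=0
  is_banned "$u" "$CFG" || rc=$?
  case "$rc" in
    0) echo "$u: banned" ;;
    1) echo "$u: not banned" ;;
    *) echo "$u: cannot read $CFG" ;;
  esac
done
```

Set CFG to another path if your distribution stores the file elsewhere.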
So, at this point, there are two ways to solve our original issue.
The first: remove hdfs from banned.users (not recommended).
The second: find a way to change the basedir of TestDFSIO. And this is what we are going to do.
TestDFSIO: is the output dir really hardcoded?
If we look closely at the usage function of TestDFSIO, there is no simple option to change the basedir; it seems that the default dir /benchmarks/TestDFSIO is hardcoded in the jar itself.
And it WAS!
The ability to change the output dir of TestDFSIO was requested in MAPREDUCE-1614 and incorporated in MAPREDUCE-1832. So now it is possible to use:
-Dtest.build.data=/path/of_output_dir
Example:
hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar TestDFSIO -Dtest.build.data=/user/ambari-qa/TestDFSIO -write -nrFiles 10 -fileSize 1000 -resFile /root/dfsio_result.log
18/12/18 13:49:59 INFO fs.TestDFSIO: TestDFSIO.1.8
18/12/18 13:49:59 INFO fs.TestDFSIO: nrFiles = 10
18/12/18 13:49:59 INFO fs.TestDFSIO: nrBytes (MB) = 1000.0
18/12/18 13:49:59 INFO fs.TestDFSIO: bufferSize = 1000000
18/12/18 13:49:59 INFO fs.TestDFSIO: baseDir = /user/ambari-qa/TestDFSIO
18/12/18 13:50:00 INFO fs.TestDFSIO: creating control file: 1048576000 bytes, 10 files
18/12/18 13:50:01 INFO fs.TestDFSIO: created control files for: 10 files
...
18/12/18 13:50:02 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: X.X.X.X:8020, Ident: (HDFS_DELEGATION_TOKEN token 33 for ambari-qa)
...
18/12/18 13:50:36 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
18/12/18 13:50:36 INFO fs.TestDFSIO: Date & time: Tue Dec 18 13:50:36 UTC 2018
18/12/18 13:50:36 INFO fs.TestDFSIO: Number of files: 10
18/12/18 13:50:36 INFO fs.TestDFSIO: Total MBytes processed: 10000.0
18/12/18 13:50:36 INFO fs.TestDFSIO: Throughput mb/sec: 145.73223159766246
18/12/18 13:50:36 INFO fs.TestDFSIO: Average IO rate mb/sec: 153.26971435546875
18/12/18 13:50:36 INFO fs.TestDFSIO: IO rate std deviation: 39.996241684601024
18/12/18 13:50:36 INFO fs.TestDFSIO: Test exec time sec: 35.042
18/12/18 13:50:36 INFO fs.TestDFSIO:
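The same -Dtest.build.data value has to be passed to the later phases too, otherwise the read or cleanup would presumably fall back to /benchmarks/TestDFSIO. Here is a rough sketch that only prints the three command lines (write, read, clean) so they can be reviewed first. The dfsio_cmd helper name and the /tmp result file are assumptions of mine; the jar path and base dir are the ones used above.

```shell
#!/bin/sh
# Sketch: print the TestDFSIO command line for each phase, all sharing one base dir.
JAR="${JAR:-/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-tests.jar}"
BASE_DIR="${BASE_DIR:-/user/ambari-qa/TestDFSIO}"
RES_FILE="${RES_FILE:-/tmp/dfsio_result.log}"   # assumed location for the result file

# Hypothetical helper: build the command for one phase (-write, -read or -clean).
dfsio_cmd() {
  phase="$1"
  case "$phase" in
    # Cleanup only needs the base dir; it removes the benchmark data under it.
    -clean)
      echo "hadoop jar $JAR TestDFSIO -Dtest.build.data=$BASE_DIR -clean" ;;
    # Write and read use the same file count and size so the results are comparable.
    *)
      echo "hadoop jar $JAR TestDFSIO -Dtest.build.data=$BASE_DIR $phase -nrFiles 10 -fileSize 1000 -resFile $RES_FILE" ;;
  esac
}

for phase in -write -read -clean; do
  dfsio_cmd "$phase"
done
```

The script runs nothing on the cluster by itself. Once the printed lines look right, they can be pasted into a shell as a user that is allowed to submit YARN jobs.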
Conclusion
Basic actions, like benchmarks, should not require changing the default configuration of your cluster. Always try to tune or customize your basic actions to fit your cluster rather than the opposite.
Created on 12-06-2019 12:09 PM
The following always worked for me:
kinit -kt hdfs.keytab hdfs
hadoop fs -mkdir /benchmarks
hadoop fs -chmod 0777 /benchmarks
You can always lock down the directory permissions to only allow a certain group to write to this directory.
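As a rough illustration of that lock-down, the sketch below prints the two commands that would hand the directory to a group and restrict write access to it. The group name benchmark and the lockdown_cmds helper are made-up examples, and 0775 is just one possible mode (owner and group can write, everyone else can only read).

```shell
#!/bin/sh
# Sketch: print the commands that restrict writes on the benchmark dir to one group.
# Hypothetical helper; GROUP must already exist for the HDFS group mapping.
lockdown_cmds() {
  group="$1"
  dir="${2:-/benchmarks}"
  echo "hadoop fs -chgrp $group $dir"
  # 0775: owner and group may write, others may only read and list.
  echo "hadoop fs -chmod 0775 $dir"
}

# "benchmark" is an example group name, not something taken from the cluster.
lockdown_cmds "${BENCH_GROUP:-benchmark}"
```

The printed commands would be run as the hdfs superuser, for example right after the kinit above.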