Member since: 09-25-2015
Posts: 46
Kudos Received: 139
Solutions: 16
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 5448 | 11-27-2017 07:37 PM |
 | 5060 | 09-18-2017 06:28 PM |
 | 2790 | 09-08-2017 06:40 PM |
 | 1416 | 07-17-2017 07:13 PM |
 | 1286 | 06-29-2017 06:18 PM |
03-22-2017
11:26 PM
5 Kudos
Hi @heta desai The ApplicationMaster launches one map task for each map split. Typically, there is one map split per input file. If an input file is larger than the HDFS block size, it is divided into two or more map splits. Also, the memory used by map and reduce tasks comes from the RAM of the NodeManagers. Please refer to this page for more details - http://ercoppa.github.io/HadoopInternals/AnatomyMapReduceJob.html
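As a rough illustration (the input path below is just a placeholder), you can compare an input file's size with the HDFS block size to estimate how many map splits it will produce:
# Configured HDFS block size in bytes (134217728 = 128 MB is a common default)
hdfs getconf -confKey dfs.blocksize
# Size of the input file (replace the path with your own)
hdfs dfs -du -h /user/test/input.txt
# e.g. a 300 MB file with a 128 MB block size gives ceil(300/128) = 3 map splits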
03-13-2017
06:57 PM
3 Kudos
Hi @Pierre Gunet-Caplain, Does the prod queue have child queues? If so, since the prod queue is configured as FIFO, preemption can happen between the child queues of the prod queue based on each child queue's capacity. Preemption can also happen when the cluster is not 100% utilized, in the case where the newly requested resources exceed the resources currently available.
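A quick way to check this (assuming the default config location and that prod sits directly under root) is to look at the queue definitions in capacity-scheduler.xml:
# Lists the child queues of prod, if any
grep -A1 'yarn.scheduler.capacity.root.prod.queues' /etc/hadoop/conf/capacity-scheduler.xml
# Shows the capacity configured for each child queue
grep -A1 'yarn.scheduler.capacity.root.prod\..*\.capacity' /etc/hadoop/conf/capacity-scheduler.xml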
02-18-2017
12:33 AM
4 Kudos
Hi @Aruna Sameera If it is not a single-node cluster, first identify the node where the NameNode has to be started by checking the config file hdfs-site.xml (/etc/hadoop/conf/hdfs-site.xml). The NameNode host is given by the following properties in hdfs-site.xml:
<property>
  <name>dfs.namenode.http-address</name>
  <value>node:port</value>
</property>
<property>
  <name>dfs.namenode.https-address</name>
  <value>node:port</value>
</property>
Once on that node, locate the NameNode daemon scripts under HADOOP_HOME, which is set in /etc/hadoop/conf/hadoop-env.sh:
# Hadoop home directory
export HADOOP_HOME=${HADOOP_HOME:-/usr/hdp/current/hadoop-client}
Start the NameNode daemon as the hdfs user:
sudo su -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start namenode" hdfs
If the NameNode still does not start, please paste the logs. The log directory is configured in /etc/hadoop/conf/hadoop-env.sh:
# Where log files are stored. $HADOOP_HOME/logs by default.
export HADOOP_LOG_DIR=/grid/0/log/hdfs/$USER
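To confirm whether the NameNode actually came up, a quick check (assuming jps is on the PATH; the log path below assumes the HADOOP_LOG_DIR shown above with $USER being hdfs):
# The NameNode process should appear in the jps output
sudo su -c "jps" hdfs
# Otherwise, check the most recent NameNode log file
ls -lt /grid/0/log/hdfs/hdfs/*namenode*.log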
02-15-2017
12:38 AM
4 Kudos
Hi, My folder /user/testUser is encrypted. When I try to run the copyFromLocal command as the hdfs user on /user/testUser, I get the following exception. Can someone please help me resolve this?
sudo su -c "hdfs dfs -copyFromLocal test.txt /user/testUser" hdfs
copyFromLocal: User:hdfs not allowed to do 'DECRYPT_EEK' on 'test_key'
17/02/15 00:26:24 ERROR hdfs.DFSClient: Failed to close inode 17777
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/testUser/test.txt._COPYING_ (inode 17777): File does not exist. Holder DFSClient_NONMAPREDUCE_1724817926_1 does not have any open files.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3659)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3749)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3716)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:911)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:547)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2345)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
at org.apache.hadoop.ipc.Client.call(Client.java:1498)
at org.apache.hadoop.ipc.Client.call(Client.java:1398)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
at com.sun.proxy.$Proxy10.complete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:503)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:282)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
at com.sun.proxy.$Proxy11.complete(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2442)
at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2419)
at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2384)
at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:949)
at org.apache.hadoop.hdfs.DFSClient.closeOutputStreams(DFSClient.java:981)
at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1211)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:2886)
at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:2903)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
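For context, the encryption zone and key for the target directory can be listed as follows (this needs HDFS superuser / key admin privileges):
hdfs crypto -listZones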
Labels:
- Apache Hadoop
01-06-2017
07:58 PM
3 Kudos
Hi @Sami Ahmad
For Question 3, the log file locations can be found by checking the hadoop-env.sh or yarn-env.sh files, which are present in HADOOP_CONF_DIR (usually /etc/hadoop/conf/).
Sample yarn-env.sh:
export HADOOP_YARN_HOME=/usr/hdp/current/hadoop-yarn-nodemanager
export YARN_LOG_DIR=/grid/0/log/yarn/$USER
export YARN_PID_DIR=/var/run/hadoop-yarn/$USER
In addition, yarn.nodemanager.log-dirs in yarn-site.xml (inside HADOOP_CONF_DIR) determines where container logs are stored on the node while the containers are running.
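On a given node you can resolve the actual locations quickly (paths assume the usual /etc/hadoop/conf layout):
# Daemon log directory configured for YARN
grep 'YARN_LOG_DIR' /etc/hadoop/conf/yarn-env.sh
# Local directories that hold container logs while containers run
grep -A1 'yarn.nodemanager.log-dirs' /etc/hadoop/conf/yarn-site.xml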
11-30-2016
06:48 PM
3 Kudos
@Hoang Le The maximum Application Master resource is calculated from the value of "yarn.scheduler.capacity.maximum-am-resource-percent", which is set in capacity-scheduler.xml. The default value is 0.1, meaning at most 10% of the cluster's resources can be used to run Application Masters. More info can be found at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_yarn_resource_mgt/content/setting_application_limits.html
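To see what a particular cluster uses (the property may be absent, in which case the 0.1 default applies), and a quick worked example:
# Show the configured AM resource limit, if set
grep -A1 'yarn.scheduler.capacity.maximum-am-resource-percent' /etc/hadoop/conf/capacity-scheduler.xml
# Example: with 100 GB of total cluster memory and the default 0.1,
# at most 10 GB can be held by ApplicationMaster containers at any time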
09-16-2016
06:28 PM
2 Kudos
Hi @Anil Bagga, Please refer to the following doc for the upgrade: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_upgrading_hdp_manually/content/ch_upgrade_2_3.html
09-15-2016
07:07 PM
4 Kudos
@Juan Manuel Nieto From this Apache JIRA - https://issues.apache.org/jira/browse/YARN-3978 - we can see that there is a configuration option to turn off saving of non-AM container metadata. In order to keep non-AM container details, we need to set "yarn.timeline-service.generic-application-history.save-non-am-container-meta-info" and "yarn.timeline-service.generic-application-history.enabled" to true and restart the ResourceManager and ATS.
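Before restarting, you can double-check that both properties are present in yarn-site.xml (assuming the default config path):
grep -A1 'yarn.timeline-service.generic-application-history.enabled' /etc/hadoop/conf/yarn-site.xml
grep -A1 'yarn.timeline-service.generic-application-history.save-non-am-container-meta-info' /etc/hadoop/conf/yarn-site.xml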
06-29-2016
06:42 PM
2 Kudos
The command to start the DataNode is:
su - hdfs -c "/usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh start datanode"
More info can be found at https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_upgrading_hdp_manually/content/start-hadoop-core-13.html
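As a quick sanity check afterwards (run as the hdfs user), the newly started DataNode should show up as live in the dfsadmin report:
su - hdfs -c "hdfs dfsadmin -report"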
06-14-2016
06:32 PM
2 Kudos
@Anshul Sisodia The port is 8050. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_HDP_Reference_Guide/content/yarn-ports.html You can also verify this by checking the property 'yarn.resourcemanager.address' in /etc/hadoop/conf/yarn-site.xml.
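To confirm the value configured on your cluster (assuming the default config path):
grep -A1 'yarn.resourcemanager.address' /etc/hadoop/conf/yarn-site.xml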