Member since: 04-13-2016
Posts: 422
Kudos Received: 150
Solutions: 55
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1861 | 05-23-2018 05:29 AM |
| | 4870 | 05-08-2018 03:06 AM |
| | 1627 | 02-09-2018 02:22 AM |
| | 2636 | 01-24-2018 08:37 PM |
| | 6055 | 01-24-2018 05:43 PM |
09-23-2017
01:14 AM
@Sree Kupp If the NameNode host has hardware problems and you need to move the NameNode to another host, you can do so as follows:
1. If the host to which you want to move the NameNode is not already in the cluster, follow the instructions in Adding a Host to the Cluster to add it.
2. Stop all cluster services.
3. Make a backup of the dfs.name.dir directories on the existing NameNode host. Be sure to back up the fsimage and edits files; they should be identical across all of the directories specified by the dfs.name.dir property.
4. Copy the files you backed up from the dfs.name.dir directories on the old NameNode host to the host where you want to run the NameNode.
5. Go to the HDFS service.
6. Click the Instances tab.
7. Select the checkbox next to the NameNode role instance and click the Delete button. Click Delete again to confirm.
8. In the Review configuration changes page that appears, click Skip.
9. Click Add to add a NameNode role instance.
10. Select the host where you want to run the NameNode and click Continue.
11. Specify the location of the dfs.name.dir directories where you copied the data on the new host, and click Accept Changes.
12. Start cluster services. After the HDFS service has started, Cloudera Manager distributes the new configuration files to the DataNodes, which will be configured with the address of the new NameNode host.
13. Go to the HDFS service. The NameNode, Secondary NameNode, and DataNode roles should each show a process state of Started, and the HDFS service should show a status of Good.

Note that you cannot distribute the NameNode metadata across multiple nodes; you can only configure multiple dfs.name.dir locations, each of which holds an identical copy of the metadata.
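Step 3 above can be sketched in a few lines of Python; this is just an illustrative backup helper, not a Cloudera tool, and the directory paths you pass in would come from your own dfs.name.dir setting:

```python
import os
import shutil

def backup_name_dirs(name_dirs, backup_root):
    """Copy each dfs.name.dir directory (holding fsimage and edits files)
    to a numbered subdirectory under backup_root. Illustrative sketch only."""
    backups = []
    for i, d in enumerate(name_dirs):
        dest = os.path.join(backup_root, "name_dir_%d" % i)
        shutil.copytree(d, dest)  # fails if dest already exists, which is what we want for a backup
        backups.append(dest)
    return backups
```

Run it with the NameNode stopped, so the fsimage and edits files are not being written while you copy them.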
09-22-2017
06:13 PM
1 Kudo
@Hoang Le The Capacity Scheduler's leaf queues can also use the user-limit-factor property to control user resource allocations. This property denotes the multiple of the queue's configured capacity that any single user can consume, regardless of whether there are idle resources in the cluster.

Property: yarn.scheduler.capacity.root.support.user-limit-factor
Value: 1

The default value of "1" means that any single user in the queue can occupy at most the queue's configured capacity. This prevents users in a single queue from monopolizing resources across all queues in a cluster. Setting the value to "2" would cap the queue's users at twice the queue's configured capacity, while a value of 0.5 would restrict any user to half of the queue capacity. These settings can also be changed dynamically at run-time using yarn rmadmin -refreshQueues. Please change the user limit factor and try again.
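The arithmetic behind user-limit-factor can be sketched as below; the cluster size and queue percentage are made-up numbers for illustration, and this is a simplification of the scheduler's actual user-limit computation:

```python
def max_user_allocation(cluster_resource, queue_capacity_pct, user_limit_factor):
    """Upper bound on what one user may consume in a Capacity Scheduler leaf queue:
    queue capacity times user-limit-factor, never more than the whole cluster."""
    queue_capacity = cluster_resource * queue_capacity_pct / 100.0
    return min(cluster_resource, queue_capacity * user_limit_factor)

# With a 1000-unit cluster and a queue configured at 40% capacity:
#   user-limit-factor 1.0 -> a user is capped at the queue capacity (400)
#   user-limit-factor 2.0 -> capped at twice the queue capacity (800)
#   user-limit-factor 0.5 -> capped at half the queue capacity (200)
```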
09-20-2017
08:01 PM
@Piyali Gupta
Here are the steps to increase the HDFS Balancer network bandwidth for faster balancing of data between nodes. First, set the bandwidth on all the DataNodes:

hdfs dfsadmin -setBalancerBandwidth 100000000

Then, on the client, run the command below:

hdfs balancer -Dfs.defaultFS=hdfs://<NN_HOSTNAME>:8020 -Ddfs.balancer.movedWinWidth=5400000 -Ddfs.balancer.moverThreads=1000 -Ddfs.balancer.dispatcherThreads=200 -Ddfs.datanode.balance.max.concurrent.moves=5 -Ddfs.balance.bandwidthPerSec=100000000 -Ddfs.balancer.max-size-to-move=10737418240 -threshold 5

This will balance your HDFS data between DataNodes faster; do this when the cluster is not heavily used. A couple of links to articles:
https://community.hortonworks.com/articles/51935/how-to-increase-hdfs-balancer-network-bandwidth-fo.html
https://community.hortonworks.com/articles/43849/hdfs-balancer-2-configurations-cli-options.html
Hope this helps you.
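The bandwidth argument is plain bytes per second, which is easy to get wrong; a small sketch (an illustrative helper, not part of Hadoop) for deriving it from a target MB/s rate:

```python
def balancer_bandwidth_arg(megabytes_per_sec):
    """Convert a desired per-DataNode balancer rate in MB/s (decimal megabytes)
    to the bytes-per-second integer expected by
    `hdfs dfsadmin -setBalancerBandwidth`."""
    return int(megabytes_per_sec * 1000 * 1000)

# The 100000000 used above corresponds to a 100 MB/s cap per DataNode.
```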
09-20-2017
02:09 PM
@raouia That parameter name is printed incorrectly in the documentation. It should be 'dfs.datanode.data.dir' instead of 'dfs.data.dir'; this has been corrected in later versions of the documentation. dfs.datanode.data.dir determines where on the local filesystem a DataNode should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all of the named directories, typically on different devices. Directories that do not exist are ignored. The number of entries in the comma-delimited list typically equals the number of disks. https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/bk_command-line-installation/content/determine-hdp-memory-config.html Hope this helps you.
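As a quick sketch of how the comma-delimited value maps to per-disk directories (the /grid paths are illustrative, not a required layout):

```python
def data_dirs(dfs_datanode_data_dir):
    """Split a dfs.datanode.data.dir value into its directory entries,
    dropping empty items and surrounding whitespace."""
    return [d.strip() for d in dfs_datanode_data_dir.split(",") if d.strip()]

# One entry per physical disk, e.g. two disks mounted under /grid:
dirs = data_dirs("/grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data")
```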
09-19-2017
04:31 AM
The reason Ambari is unable to start the NameNode smoothly is a bug; below is the workaround. The issue was fixed permanently in Ambari 2.5.x. A few lines of the error message from the Ambari Ops logs:

File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/decorator.py", line 55, in wrapper
    return function(*args, **kwargs)
File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py", line 562, in is_this_namenode_active
    raise Fail(format("The NameNode {namenode_id} is not listed as Active or Standby, waiting..."))
resource_management.core.exceptions.Fail: The NameNode nn2 is not listed as Active or Standby, waiting...

ROOT CAUSE: https://issues.apache.org/jira/browse/AMBARI-18786

RESOLUTION: Increase the timeout in /var/lib/ambari-server/resources/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_namenode.py from this:

@retry(times=5, sleep_time=5, backoff_factor=2, err_class=Fail)

to this:

@retry(times=25, sleep_time=25, backoff_factor=2, err_class=Fail)
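To see why raising those numbers extends the timeout, here is a minimal sketch of how a retry decorator with these parameters behaves; this is an illustration of the pattern, not Ambari's actual implementation:

```python
import time

def retry(times=5, sleep_time=5, backoff_factor=2, err_class=Exception):
    """Call the wrapped function up to `times` times, sleeping `sleep_time`
    seconds after a failure and multiplying the sleep by `backoff_factor`
    each retry. The last failure is re-raised."""
    def decorator(function):
        def wrapper(*args, **kwargs):
            delay = sleep_time
            for attempt in range(times):
                try:
                    return function(*args, **kwargs)
                except err_class:
                    if attempt == times - 1:
                        raise
                    time.sleep(delay)
                    delay *= backoff_factor
        return wrapper
    return decorator
```

With times=5 and sleep_time=5 the waits between attempts are roughly 5, 10, 20, 40 seconds; times=25 with sleep_time=25 gives the NameNode far longer to report itself Active or Standby.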
09-17-2017
05:10 AM
@Facundo Bianco This refers to backing up the Metrics Collector data (HBase).
08-22-2017
07:02 PM
@suresh krish
When you look at the environment variables in your Spark UI, you can see whether that particular job is using the serialization property below. If you can't see it in the cluster configuration, that means the user is setting it at the runtime of the job.

spark.serializer org.apache.spark.serializer.KryoSerializer

Secondly, spark.kryoserializer.buffer.max is built in with a default value of 64m. If required, you can increase that value at runtime. We could also set the Kryo serialization values at the cluster level, but that's not good practice without knowing the proper use case. Hope this helps you.
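Values like "64m" are size strings; as a sketch of what such a suffix means in bytes (an illustrative helper, not Spark's own parser):

```python
def size_to_bytes(size):
    """Convert a Spark-style size string like '64m' or '512k' to bytes,
    treating k/m/g as binary multiples (1k = 1024)."""
    units = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    size = size.strip().lower()
    if size[-1] in units:
        return int(size[:-1]) * units[size[-1]]
    return int(size)

# The default 64m buffer cap is 64 * 1024 * 1024 bytes.
```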
08-17-2017
03:10 PM
@Deepak Nayak We can submit Spark jobs to a remote cluster through the Livy server using REST calls. Below are a couple of links with examples: https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-livy-rest-interface https://github.com/cloudera/livy Hope this helps you.
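At its core, a Livy batch submission is an HTTP POST of a JSON body to the server's /batches endpoint. A small sketch of building that body (the host, port, and jar path in the comment are illustrative assumptions):

```python
import json

def livy_batch_payload(file, class_name=None, args=None, conf=None):
    """Build the JSON body for a POST to Livy's /batches endpoint."""
    payload = {"file": file}
    if class_name:
        payload["className"] = class_name
    if args:
        payload["args"] = args
    if conf:
        payload["conf"] = conf
    return json.dumps(payload)

# POST this body to e.g. http://<livy-host>:8998/batches
body = livy_batch_payload(
    "hdfs:///jars/spark-examples.jar",
    class_name="org.apache.spark.examples.SparkPi",
    args=["100"],
)
```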
08-15-2017
01:50 AM
1 Kudo
@arjun more Are you able to sync groups the same way as users? If not, please check the parameters below with your LDAP team and set them as they advise:

authentication.ldap.groupObjectClass [LDAP object class] The object class that is used for groups. Example: groupOfUniqueNames
authentication.ldap.groupMembershipAttr [LDAP attribute] The attribute for group membership. Example: uniqueMember
authentication.ldap.groupNamingAttr [LDAP attribute] The attribute for group name.

Checking the ambari-server logs would help you get the exact error message. Hope this helps you.
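For reference, these group-sync settings end up as plain key=value entries in the Ambari server configuration; a small sketch that renders them (the example attribute values are assumptions to confirm with your LDAP team):

```python
def ldap_group_properties(object_class, membership_attr, naming_attr):
    """Render the Ambari LDAP group-sync settings as properties-file lines."""
    props = {
        "authentication.ldap.groupObjectClass": object_class,
        "authentication.ldap.groupMembershipAttr": membership_attr,
        "authentication.ldap.groupNamingAttr": naming_attr,
    }
    return "\n".join("%s=%s" % kv for kv in sorted(props.items()))

print(ldap_group_properties("groupOfUniqueNames", "uniqueMember", "cn"))
```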
08-10-2017
09:02 PM
@Dhiraj Yes, a delegated admin in Ranger can do all of the "selecting and changing user/group and permissions" for those policies. A delegated admin has the same full permissions as an admin, but only with respect to that policy.