Member since: 01-19-2017
Posts: 3679
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 946 | 06-04-2025 11:36 PM |
| | 1552 | 03-23-2025 05:23 AM |
| | 772 | 03-17-2025 10:18 AM |
| | 2785 | 03-05-2025 01:34 PM |
| | 1833 | 03-03-2025 01:09 PM |
10-10-2020
02:08 PM
1 Kudo
@mike_bronson7 Once you connect the 10 new data nodes to the cluster, Ambari automatically distributes the common Hadoop config files (hdfs-site.xml, mapred-site.xml, yarn-site.xml, etc.) to those new nodes so they can start receiving data blocks.

My suggestion as a workaround would be to add the FQDNs or IPs of these 10 new DataNodes (one per line) to the dfs.exclude file on the NameNode machine, i.e. edit <HADOOP_CONF_DIR>/dfs.exclude, where <HADOOP_CONF_DIR> is the directory holding the Hadoop configuration files, for example /etc/hadoop/conf. First, ensure DNS resolution is working (or your /etc/hosts files are updated) and that the passwordless connection to those hosts works.

Once the 10 new nodes are in the dfs.exclude file, the NameNode will treat them as excluded nodes, so no data will be replicated to them for as long as these hosts remain in dfs.exclude, once you have updated the NameNode with the new set of excluded DataNodes. On the NameNode host machine, execute the following commands, where <HDFS_USER> is the user owning the HDFS services:

su <HDFS_USER>
hdfs dfsadmin -refreshNodes

That should do the trick. Once these hosts are visible in Ambari, turn maintenance mode on so you don't receive any alerts.

The day you decide to add/enable these 10 new DataNodes, you simply cp or mv their entries from dfs.exclude to the dfs.include file located at <HADOOP_CONF_DIR>/dfs.include. These nodes will then start heartbeating, notifying the NameNode that the DataNodes are ready to receive files and participate in the data distribution in the cluster. On the NameNode host machine, remember to execute:

su <HDFS_USER>
hdfs dfsadmin -refreshNodes

Don't forget to disable maintenance mode on the new DataNodes and remove them from the dfs.exclude file if you didn't rename or delete it. Then run the HDFS Balancer, a tool for balancing the data across the storage devices of an HDFS cluster:

sudo -u hdfs hdfs balancer

The balancer command has a couple of options, either a threshold or, again, the dfs.include and dfs.exclude lists; see the explanation below.

Include and Exclude Lists

When the include list is non-empty, only the DataNodes specified in the list are balanced by the HDFS Balancer. An empty include list means all DataNodes in the cluster are included. The default value is an empty list.

[-include [-f <hosts-file> | <comma-separated list of hosts>]]

The DataNodes specified in the exclude list are excluded so that the HDFS Balancer does not balance them. An empty exclude list means no DataNodes are excluded. When a DataNode is specified in both the include list and the exclude list, it is excluded. The default value is an empty list.

[-exclude [-f <hosts-file> | <comma-separated list of hosts>]]

If no dfs.include file is specified, all DataNodes are considered included in the cluster (unless excluded explicitly in the dfs.exclude file). The dfs.hosts and dfs.hosts.exclude properties in hdfs-site.xml are used to point at the dfs.include and dfs.exclude files.

Hope that helps
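To recap, here is a minimal sketch of the exclude/refresh flow described above. The host names are hypothetical, and it assumes the default /etc/hadoop/conf layout with hdfs as the HDFS user; adjust to your environment.

```
# 1. List the new DataNode FQDNs, one per line, in the exclude file
cat >> /etc/hadoop/conf/dfs.exclude <<'EOF'
datanode11.example.com
datanode12.example.com
EOF

# 2. Make sure hdfs-site.xml points at the host files, e.g.
#    dfs.hosts.exclude = /etc/hadoop/conf/dfs.exclude
#    dfs.hosts         = /etc/hadoop/conf/dfs.include

# 3. Tell the NameNode to re-read the host lists
sudo -u hdfs hdfs dfsadmin -refreshNodes

# 4. Later, when the nodes should go live: move their entries to dfs.include,
#    refresh again, then rebalance
sudo -u hdfs hdfs dfsadmin -refreshNodes
sudo -u hdfs hdfs balancer -threshold 10 -include -f /etc/hadoop/conf/dfs.include
```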
10-10-2020
11:35 AM
1 Kudo
@mike_bronson7 Always stick to the Cloudera documentation. Yes, there is no risk in running that command; I can understand your reservation.
10-10-2020
11:31 AM
@lxs I have helped resolve this kind of issue a couple of times. Can you share screenshots of your sandbox configuration?
- Memory
- Network
- Splash screen after restarting the sandbox
- Any other screenshot you deem important
10-10-2020
11:26 AM
@vinod_artga The first step is to check the Cloudera upgrade path using the My Environment matrix calculator (see screenshot below). After you fill in all the requested information, it generates a report with warnings such as:

Warning: Upgrades from CDH 5.12 and lower to CDP Private Cloud Base are not supported. You must upgrade the cluster to CDH versions 5.13 - 5.16 before upgrading to CDP Private Cloud Base.

Warning: For upgrades from CDH 5 clusters with Sentry to Cloudera Runtime 7.1.1 (or higher) clusters where Sentry privileges are to be transitioned to Apache Ranger, the cluster must have Kerberos enabled before upgrading.

It also gives you comprehensive details about the best approach and component incompatibilities; this is your source of truth. I would suggest you try it and revert. HTH
10-10-2020
10:50 AM
1 Kudo
@bvishal SmartSense Tool (HST) gives all support subscription customers access to a unique service that analyzes cluster diagnostic data, identifies potential issues, and recommends specific solutions and actions. These analytics proactively identify unseen issues and notify customers of potential problems before they occur.

That is okay, as you are just testing and you don't need to buy support, which is advised when running a production environment. To configure SmartSense you will need to edit /etc/hst/conf/hst-server.ini; the inputs/values come from Hortonworks support if you have paid for a subscription:

customer.smartsense.id
customer.account.name
customer.notification.email
customer.enable.flex.subscription

The error you are encountering is normal and won't impact your cluster. Hope that helps
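Purely for illustration, those entries in /etc/hst/conf/hst-server.ini might end up looking like the sketch below. All values are placeholders; the real ID, account name, and flex-subscription flag come from your support subscription, and the exact file layout can differ between HST versions.

```
customer.smartsense.id=A-00000000-C-00000000
customer.account.name=ExampleCorp
customer.notification.email=hadoop-admin@example.com
customer.enable.flex.subscription=false
```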
10-08-2020
10:40 AM
@pazufst How Ranger policies work for HDFS: Apache Ranger offers a federated authorization model for HDFS. The Ranger plugin for HDFS checks for Ranger policies, and if a policy exists, access is granted to the user. If a policy doesn't exist in Ranger, then Ranger falls back to the native permissions model in HDFS (POSIX or HDFS ACL). This federated model applies to the HDFS and YARN services in Ranger. For other services such as Hive or HBase, Ranger operates as the sole authorizer, which means only Ranger policies are in effect.

The fallback model is controlled by a property in Ambari → Ranger → HDFS config → Advanced ranger-hdfs-security:

xasecure.add-hadoop-authorization=true

The federated authorization model lets you safely implement Ranger in an existing cluster without affecting jobs that rely on POSIX permissions, which is why this option is enabled as the default model for all deployments.

The error

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=XXXXX, access=READ, inode="/user/.snapshot/user_201806150000":w93651:hdfs:drwx------

is self-explanatory. Does the user w93651 exist on both clusters with valid Kerberos tickets, if the cluster is kerberized? Ensure the CROSS-REALM trust is configured and working. Is your Ranger managing the 2 clusters? HTH
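As a quick way to narrow this down, a minimal check could look like the sketch below. It uses the user and path from your error message as examples; adjust to your realm and paths.

```
# 1. Confirm the user can obtain a Kerberos ticket in this realm
kinit w93651
klist

# 2. Inspect the HDFS permissions on the inode from the error message
hdfs dfs -ls -d /user/.snapshot/user_201806150000

# 3. Try to read it as that user; it fails the same way if neither a
#    Ranger policy nor the POSIX/ACL permissions grant READ
hdfs dfs -ls /user/.snapshot/user_201806150000
```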
10-06-2020
10:14 AM
@pazufst You should toggle RECURSIVE on for the RWX permission; the HDFS resource path should look like this: /databank/* The selection should look like the attached screenshot. Hope that helps
10-06-2020
07:25 AM
@pazufst Doesn't it look strange that the permissions are /databank/.snapshot/databank_201904250000:hdfs:hdfs:d--------- ? The normal permissions should include rwx. Try that and revert.
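If it helps, a minimal check could look like this sketch (paths taken from your message; the chmod value is only an example). Note that anything under .snapshot is read-only, so permission fixes have to happen on the live directory or through a Ranger policy.

```
# Compare the live directory with what the snapshot captured
hdfs dfs -ls -d /databank
hdfs dfs -ls -d /databank/.snapshot/databank_201904250000

# If the live directory also lacks rwx, grant it there (example value only)
sudo -u hdfs hdfs dfs -chmod 755 /databank
```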
10-02-2020
12:12 PM
@kvinod If you have installed YARN and MRv2, can you check the value of the parameter yarn.nodemanager.aux-services in yarn-site.xml? Stop the services and change it to look like the below:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

Restart and let me know.
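If it is not already present in your yarn-site.xml, the shuffle handler class that normally accompanies this setting is shown below; this is the standard Hadoop property rather than something specific to your cluster, so verify it against your distribution's defaults.

```
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
```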
09-25-2020
01:42 PM
@GangWar Can you regenerate the keytabs through Cloudera Manager? That could resolve the problem; if it doesn't, please revert with the error encountered.