Member since 12-06-2022

31 Posts
2 Kudos Received
1 Solution

My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
|  | 2139 | 06-08-2023 11:41 PM |
			
    
	
		
		
09-23-2025 07:10 PM
					
Hi, I'm in the same situation. Are there any problems when you add the DataNode back after decommissioning it and deleting its data folder?
						
					
08-19-2025 01:50 AM
					
Hi @quangbilly79, yes, you can continue to use HDFS normally while the Balancer is running. The Balancer only moves block replicas between DataNodes to even out disk usage; it does not modify the contents of your files. Reads and writes are fully supported in parallel with balancing, and HDFS protects data integrity through replication and checksums. Balancing adds some network and disk load, so you may see reduced performance while heavy balancing is in progress. The Balancer does not cause data corruption, so you don't need to wait: it's safe to continue your normal operations.
						
					
04-22-2024 12:32 AM, 1 Kudo
					
Hi @quangbilly79, any advanced config is treated as a CM advanced config, and CM overrides it. I tried the same config in my test cluster, and it is stored in the yarn-site.xml file in the process directory. You can SSH to your YARN NodeManager node, check the latest YARN process directory, and find the shuffle property in it. Example path: /var/run/cloudera-scm-agent/process/415-yarn-NODEMANAGER/yarn-site.xml
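
The check described above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, it assumes the directory with the highest numeric prefix is the latest one, and because the post does not name the exact property it simply lists every property whose name contains "shuffle".

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

# Default location taken from the example path in the post.
PROCESS_ROOT = Path("/var/run/cloudera-scm-agent/process")


def latest_nodemanager_dir(root=PROCESS_ROOT):
    """Return the *-yarn-NODEMANAGER directory with the highest numeric prefix, or None.

    Assumption: a higher prefix means a more recent process directory.
    """
    candidates = []
    for path in Path(root).glob("*-yarn-NODEMANAGER"):
        match = re.match(r"(\d+)-", path.name)
        if match and path.is_dir():
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def shuffle_properties(yarn_site):
    """Return {name: value} for every property whose name contains 'shuffle'."""
    root = ET.parse(str(yarn_site)).getroot()
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name", default="")
        if "shuffle" in name.lower():
            found[name] = prop.findtext("value", default="")
    return found


if __name__ == "__main__":
    nm_dir = latest_nodemanager_dir()
    if nm_dir is None:
        print("No NODEMANAGER process directory found")
    else:
        for name, value in shuffle_properties(nm_dir / "yarn-site.xml").items():
            print(f"{name} = {value}")
```

Reading the process directory usually requires root, so the script would need to be run with sufficient privileges on the NodeManager host.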
						
					
02-26-2024 01:05 AM, 1 Kudo
					
@quangbilly79, thanks for reaching out to the Community. Your best course of action is to email certification@cloudera.com. The certification team will look into the incident and reply to you directly.

CC: @Dgati
						
					
10-19-2023 04:36 PM, 1 Kudo
					
To take part in the processing, the node must have the NodeManager role, plus the Spark Gateway and YARN Gateway roles.
						
					
06-16-2023 10:18 AM
					
Go to Hosts --> All Hosts --> open the host that shows "Memory Overcommit Validation Threshold" --> Resources
						
					
06-08-2023 11:41 PM
					
I've successfully set up Spark 3.3.0 on CDH 6.2 (we use YARN). Here are the important steps:

1. Back up the current Spark that comes with the Cloudera package (v2.4.0, I think) at /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark
2. Download the Spark version you want from the Spark homepage, for example "spark-3.3.0-bin-hadoop3.tgz". Extract it, delete the old spark folder, and put the new folder in its place (renamed to "spark") at /opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib/spark
3. Copy all the config files from the old Spark conf folder to the new Spark conf folder.
4. Copy the YARN-related config files into the Spark conf folder too.
   4.1. Copy the file spark-3.3.0-yarn-shuffle.jar from spark/yarn to the spark/jars folder.
5. Make some modifications to the spark-defaults.conf file, mostly to disable logging and point to the right jar folder.
6. Modify some YARN config in yarn-site.xml.
7. Restart the cluster and run the spark-shell command. Run some queries for testing. You could modify the yarn-site.xml file in the Spark conf folder directly to make sure.
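
As a rough illustration, the file operations in steps 1 to 4.1 could be scripted along these lines. It is a sketch under assumptions, not the procedure actually used: the function name `swap_spark`, the `spark.bak` backup name, and the `yarn_conf` parameter are hypothetical, and steps 5 to 7 (editing spark-defaults.conf and yarn-site.xml, restarting) are left out because the post does not give their details.

```python
import shutil
from pathlib import Path


def swap_spark(parcel_lib, new_spark, yarn_conf=None, backup_name="spark.bak"):
    """Back up <parcel_lib>/spark, replace it with new_spark, and carry configs over.

    parcel_lib: the parcel's lib folder that contains "spark".
    new_spark:  the extracted Spark distribution folder.
    yarn_conf:  optional folder holding the YARN-related config files.
    """
    parcel_lib, new_spark = Path(parcel_lib), Path(new_spark)
    current = parcel_lib / "spark"
    backup = parcel_lib / backup_name

    # Step 1: keep the old Spark as a backup (the move also removes the old folder).
    shutil.move(str(current), str(backup))

    # Step 2: put the extracted distribution in place under the name "spark".
    shutil.copytree(new_spark, current)

    # Step 3: copy the old Spark config files into the new conf folder.
    shutil.copytree(backup / "conf", current / "conf", dirs_exist_ok=True)

    # Step 4: copy the YARN-related config files as well, if a folder was given.
    if yarn_conf is not None:
        for item in Path(yarn_conf).iterdir():
            if item.is_file():
                shutil.copy2(item, current / "conf" / item.name)

    # Step 4.1: copy the YARN shuffle jar from spark/yarn to spark/jars.
    jars = current / "jars"
    jars.mkdir(exist_ok=True)
    for jar in (current / "yarn").glob("spark-*-yarn-shuffle.jar"):
        shutil.copy2(jar, jars / jar.name)

    return current
```

It would be called with the parcel's lib folder and the extracted distribution, for example `swap_spark("/opt/cloudera/parcels/CDH-6.2.0-1.cdh6.2.0.p0.967373/lib", "spark-3.3.0-bin-hadoop3")`, on every node that runs Spark. `dirs_exist_ok` needs Python 3.8 or later.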
						
					
06-05-2023 07:48 PM
					
Oh, I don't have the user vega or the group vega on my local OS at all.
						
					
06-05-2023 07:28 PM
					
How do you check the Hive Metastore version on Cloudera?
						
					
05-19-2023 02:21 AM
					
Yes, the user-to-group mapping should be present on all cluster nodes, not only on the NameNode.
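
One way to verify this on a given node is to ask the local OS whether it can resolve the user and the group. The sketch below uses Python's standard `pwd` and `grp` modules (Unix only); the helper names are hypothetical, and it assumes the cluster relies on OS-level lookups for group mapping rather than a custom mapping provider.

```python
import grp
import pwd


def user_exists(name):
    """True if the local OS can resolve the user name."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def group_exists(name):
    """True if the local OS can resolve the group name."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        return False


def mapping_report(user, group):
    """Summarise whether this node resolves the user, the group, and the membership."""
    report = {"user": user_exists(user), "group": group_exists(group), "member": False}
    if report["user"] and report["group"]:
        primary_gid = pwd.getpwnam(user).pw_gid
        group_entry = grp.getgrnam(group)
        # A user belongs to a group via its primary GID or the group's member list.
        report["member"] = (
            group_entry.gr_gid == primary_gid or user in group_entry.gr_mem
        )
    return report
```

Running `mapping_report(user, group)` on each node and comparing the results would show which hosts are missing the user or group.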
						
					