Member since: 01-23-2017

- 114 Posts
- 19 Kudos Received
- 4 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 2811 | 03-26-2018 04:53 AM |
|  | 31309 | 12-01-2017 07:15 AM |
|  | 1259 | 11-28-2016 11:30 AM |
|  | 2188 | 10-25-2016 11:26 AM |

07-31-2017 01:52 PM

@Rakesh Enjala We were hitting a similar issue, where all of our HDFS blocks were showing up as under-replicated (screenshot: hdfs-under-replicated-blocks.png).

The default value of ipc.maximum.data.length is 67108864 bytes (64 MB), per https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/core-default.xml. In our case the requested data length was about 100 MB, so we increased the value to 128 MB and got the cluster back to normal. Before that, though, we ran some experiments 🙂 that caused unexpected behavior in our cluster, including data loss. This happened because:

1) We assumed that running hdfs fsck / -delete would delete only the under-replicated blocks. It did, but we lost some of the data as well: because of the ipc.maximum.data.length issue the NameNode did not have the actual metadata, so the blocks (data) were lost while the files still existed with 0 bytes.

2) One design issue in our cluster was a single 72 TB mount point for the DataNodes, which was a big mistake; it should have been split into at least 6 mounts of 12 TB each.

3) Never run hdfs fsck / -delete when you see "Requested data length 97568122 is longer than maximum configured RPC length 67108864" in the NameNode logs.

Hope this helps someone.
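For reference, a rough sketch of the check and the workaround described above; the log path is an assumption for a typical HDP-style layout, and 134217728 (128 MB) is simply the value that worked for us:

```bash
# 1) Confirm the RPC length error in the NameNode log (log location is an
#    assumption; adjust for your install):
grep "longer than maximum configured RPC length" /var/log/hadoop/hdfs/hadoop-hdfs-namenode-*.log

# 2) Raise ipc.maximum.data.length in core-site.xml to 134217728 (128 MB),
#    e.g. via Ambari, and restart the NameNode.

# 3) Only then re-check block health, read-only -- do NOT use -delete:
hdfs fsck / | tail -n 30
```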
						
					
07-28-2017 08:19 AM

We ran into the same scenario: Zeppelin was always launching 3 containers in YARN even with the dynamic allocation parameters enabled on the Spark side, because Zeppelin was not picking up those parameters.

To get Zeppelin to launch more than the default 3 containers, the following need to be set in the Zeppelin Spark interpreter (a spark-submit equivalent is sketched at the end of this post):

spark.dynamicAllocation.enabled=true
spark.shuffle.service.enabled=true
spark.dynamicAllocation.initialExecutors=0
spark.dynamicAllocation.minExecutors=2 --> start this with a low value; otherwise it launches the specified minimum number of containers but only uses the required memory and vcores, and the rest is marked as reserved, which causes memory issues
spark.dynamicAllocation.maxExecutors=10

It is also generally better to start with less executor memory (e.g. 10 or 15 GB) and more executors (20 or 30). In our scenario, with large executor memory (50 or 100 GB) and few executors (5 or 10) a query took 3 min 48 sec (228 sec), which is expected since the parallelism is very low; after reducing the executor memory (10 or 15 GB) and increasing the executors (25 or 30), the same query took only 54 sec.

Please note that the number of executors and the executor memory are use-case dependent; we did a few trials before reaching the optimal settings for our scenario.
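For comparison, a minimal sketch of the same dynamic-allocation settings expressed as spark-submit flags (in Zeppelin these go into the Spark interpreter properties instead; your_job.py is a hypothetical application, not from this thread):

```bash
spark-submit \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.initialExecutors=0 \
  --conf spark.dynamicAllocation.minExecutors=2 \
  --conf spark.dynamicAllocation.maxExecutors=10 \
  your_job.py
```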
						
					
07-20-2017 01:18 PM

@John Cod As mentioned above, the Hive metastore holds the metadata details (columns, data types, compression, input and output formats, and more, including the HDFS locations of the table and the database). Using this information, any tool or service that connects to Hive will invoke a NameNode call to get the HDFS metadata (the files, directories, and corresponding blocks, etc.), which is needed for the jobs that Hive launches.
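As a small illustration of where that metastore metadata shows up, the table location can be inspected with DESCRIBE FORMATTED; the connection URL, database, and table name below are placeholders, not from this thread:

```bash
# The "Location:" field in the output is the HDFS path the NameNode is then
# asked about (files, directories, blocks) when Hive plans jobs on this table.
beeline -u jdbc:hive2://localhost:10000 -e "DESCRIBE FORMATTED my_db.my_table;"
```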
						
					
01-12-2017 11:15 AM

@ilhami Kalkan How much data do you have on these policies? Most of the time this occurs due to the data load.
						
					
01-09-2017 11:05 AM
1 Kudo

@Sridevi Kaup It is 2.4.2.
						
					
11-28-2016 11:30 AM
1 Kudo

@Nikolay Kanov You can restart the Ambari Agents; the Sqoop jobs won't see any interference, as they have nothing to do with Ambari Agent restarts.
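For reference, restarting an agent is just the following, run on the host in question (this only touches Ambari's management channel, not running jobs):

```bash
ambari-agent restart
ambari-agent status
```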
						
					
11-11-2016 10:12 AM

@Imtiaz Yousaf Please set the properties given in http://stackoverflow.com/questions/24390227/hadoop-jobs-fail-when-submitted-by-users-other-than-yarn-mrv2-or-mapred-mrv1. Hope this helps. Please let me know if you still have the issue after making the changes. Thanks, Venkat
						
					
11-01-2016 03:17 AM

Does the issue still exist? If yes, can you please share the Ambari logs?
						
					
10-25-2016 11:26 AM

Can you please stop and start the ambari-server and the ambari-agent on the host where Kafka is running? Also, from the Ambari DB, check the hostcomponentstate table and look for the status of the KAFKA service. If it is still found in this table, start the ambari-agent and ambari-server and run the delete command once again; that should help with the issue. If not, please let me know the error you are getting.
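A rough sketch of that DB check, assuming the default embedded PostgreSQL setup (database and user both named "ambari"); the table and column names are from the stock Ambari schema and can differ between versions:

```bash
# List the Kafka components Ambari still tracks and their current state.
psql -U ambari -d ambari -c \
  "SELECT component_name, current_state FROM hostcomponentstate WHERE service_name = 'KAFKA';"
```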
						
					
10-25-2016 09:29 AM

@Laurent lau Did you also update your Ambari Agents to the same latest version as the Ambari Server? If not, please do that. If yes, then stop the Ambari Server and the Ambari Agents and start these services again, and also make sure that the hostnames and the corresponding IP addresses didn't change. Thanks, Venkat
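For reference, the stop/start sequence and a quick hostname check would look roughly like this; these are the standard Ambari commands, run on the appropriate hosts:

```bash
# On the Ambari Server host:
ambari-server stop
ambari-server start

# On each Ambari Agent host (after upgrading it to the server's version):
ambari-agent stop
ambari-agent start

# Sanity check that hostname/IP resolution has not changed:
hostname -f && getent hosts "$(hostname -f)"
```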
						
					