Member since 01-09-2019
      
- Posts: 401
- Kudos Received: 163
- Solutions: 80
        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
| | 2595 | 06-21-2017 03:53 PM |
| | 4290 | 03-14-2017 01:24 PM |
| | 2388 | 01-25-2017 03:36 PM |
| | 3838 | 12-20-2016 06:19 PM |
| | 2101 | 12-14-2016 05:24 PM |
			
    
	
		
		
12-03-2015 06:56 PM

The default for this is 10. I have seen it set to 128 in a large cluster (over 1,000 nodes), and I think this is causing load issues. What is the recommended value, and when should it be increased from the default of 10?

- Labels:
  - Apache Hadoop
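The post does not name the property, but a default of 10 matches `dfs.namenode.handler.count`; treat that identification, and the sizing heuristic below, as assumptions. A rule of thumb that appears in vendor tuning guides is roughly 20 × ln(cluster size), so very large values like 128 are not unreasonable for 1,000+ nodes, though oversized handler pools can add NameNode lock contention:

```xml
<!-- hdfs-site.xml: sketch, assuming the property in question is
     dfs.namenode.handler.count (default 10). A common heuristic is
     ~20 * ln(number of DataNodes), e.g. 20 * ln(1000) ~= 138. -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>100</value>
</property>
```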
			
    
	
		
		
12-02-2015 07:19 PM

When I look at the HDFS audit logs, I see the hbase user from the HBase Master node accessing HDFS files, and the entries appear with 'cmd=listStatus'. We regularly see about 3 million of them per hour, and we have seen spikes of 6 million per hour, which may have crashed the NameNode. Any idea what the HBase Master is doing here, or whether we can reduce any of this load on the NameNode?

- Labels:
  - Apache HBase
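To quantify this kind of load before tuning anything, the audit log can be bucketed by hour, user, and operation. A minimal sketch, assuming the standard HDFS audit-log line layout (`timestamp ... ugi=<user> ... cmd=<op> ...`); the sample lines and paths below are made up for illustration:

```python
import re
from collections import Counter

# Hypothetical sample lines in the usual HDFS audit-log layout;
# real logs will differ in paths and IPs.
sample = """\
2015-12-02 19:01:23,111 INFO FSNamesystem.audit: allowed=true ugi=hbase (auth:SIMPLE) ip=/10.0.0.5 cmd=listStatus src=/apps/hbase/data/oldWALs dst=null perm=null
2015-12-02 19:01:24,222 INFO FSNamesystem.audit: allowed=true ugi=hbase (auth:SIMPLE) ip=/10.0.0.5 cmd=listStatus src=/apps/hbase/data/archive dst=null perm=null
2015-12-02 19:02:01,333 INFO FSNamesystem.audit: allowed=true ugi=smith (auth:SIMPLE) ip=/10.0.0.9 cmd=open src=/user/smith/file dst=null perm=null
"""

# Capture the hour prefix of the timestamp, the user, and the command.
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}).*ugi=(\S+).*cmd=(\w+)")

def count_by_hour(lines):
    """Count audit entries per (hour, user, cmd) triple."""
    counts = Counter()
    for line in lines:
        m = LINE.match(line)
        if m:
            counts[m.groups()] += 1
    return counts

counts = count_by_hour(sample.splitlines())
print(counts[("2015-12-02 19", "hbase", "listStatus")])  # → 2
```

Sorting the resulting counter by count usually makes it obvious which caller and operation dominate; for the HBase Master, periodic chores that scan directories (e.g. old-WAL and archive cleaners) are a plausible source of bulk listStatus traffic.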
			
    
	
		
		
11-30-2015 07:56 PM

Thanks Steve. In our case, we are looking to set this at the RM level, not necessarily at the app/AM level: if an AM fails for any reason, just don't retry the AM on the same host; pick something else. Depending on the error, it might also be a good option to blacklist the node at the RM level so no further AMs are sent there.
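Later Hadoop releases added exactly this kind of RM-side AM blacklisting. A sketch of the relevant settings, assuming a release that ships them (they are not available in the 2.4-era clusters discussed in this thread, so version availability should be verified against the release's yarn-default.xml):

```xml
<!-- yarn-site.xml: sketch. When enabled, the RM avoids scheduling
     AM retry attempts on nodes where a previous attempt failed. -->
<property>
  <name>yarn.am.blacklisting.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Stop blacklisting once this fraction of the cluster
       has been blacklisted, to avoid starving the app. -->
  <name>yarn.am.blacklisting.disable-failure-threshold</name>
  <value>0.8</value>
</property>
```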
    
	
		
		
11-18-2015 12:44 AM

Thanks. Will this change help with all TCP/IP communication, or only with certain traffic such as the MapReduce shuffle?
    
	
		
		
11-17-2015 11:37 PM

1 Kudo

ipc.server.tcpnodelay was changed to true by default in Hadoop 2.6. We are on Hadoop 2.4 and would like to change it to true. Which services, if any, require a restart for this change? Can it be set at the job level for all jobs without restarting services? On a big cluster where a NameNode restart takes more than 60 minutes, we would like to avoid all possible restarts.

- Labels:
  - Apache Hadoop
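A sketch of the settings involved. Note that `ipc.server.tcpnodelay` is read by the IPC *server* when the daemon starts, so changing it only takes effect after restarting the daemon whose RPC latency you want to improve; it cannot be applied per job. The client-side counterpart, `ipc.client.tcpnodelay`, can be set per job (e.g. `-Dipc.client.tcpnodelay=true`) without any service restart:

```xml
<!-- core-site.xml: both default to true from Hadoop 2.6 onward;
     on 2.4 they must be set explicitly. -->
<property>
  <name>ipc.server.tcpnodelay</name>
  <value>true</value>
</property>
<property>
  <name>ipc.client.tcpnodelay</name>
  <value>true</value>
</property>
```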
    
	
		
		
11-10-2015 06:10 PM

The question is more about how the MapReduce AM can have a policy of not rerunning the AM on the node where it failed on the first try. This is not a custom YARN app where we can decide where the AM should go. If the MapReduce AM can't do this now, it might be worth driving a support ticket for an enhancement, since with the current approach a problem on a single NodeManager can cause MapReduce jobs to fail.
    
	
		
		
11-09-2015 09:39 PM

It has nothing to do with labels; it would be the same issue with Node Labels. Whenever an AM fails for any reason, I see that the retry happens on the same node. If the first AM failed for a node-related issue, the second one will fail for the same reason. What we are looking for is whether any configuration change can ensure the AM retry does not happen on the same node.
    
	
		
		
11-09-2015 07:35 PM

I have seen an AM retried on the same node where the first attempt failed, causing the job to fail. There are situations where something is wrong with the node (disk space or other issues), so any number of retries there will fail. Is there any way to ensure that AM retries always go to a different NodeManager? Is the current policy to always retry on the same NodeManager?

- Labels:
  - Apache YARN
    
	
		
		
11-04-2015 11:05 PM

2 Kudos

```
REGISTER /tmp/tez-tfile-parser-0.8.2-SNAPSHOT.jar;
yarnlogs = LOAD '/app-logs/hdfs/logs/**/*' USING org.apache.tez.tools.TFileLoader();
lines_with_fetchertime = FILTER yarnlogs BY $2 matches '.*freed by fetcher.*';
```

This is the Pig code I used to extract specific text from the logs. However, TFileLoader in tez-tools does not seem to scale well when we pass a folder with a ton of logs. I believe tez-tools is also not part of HDP; you need to build it separately. It worked well on smaller datasets but ran into issues on bigger ones. Thanks.
    
	
		
		
11-04-2015 09:30 PM

The Hortonworks documentation (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_yarn_resource_mgt/content/enabling_cgroups.html) says that using CGroups requires HDP to be running in secure mode with Kerberos. However, there is no such requirement in the Apache documentation: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/NodeManagerCgroups.html. The Apache documentation describes the settings required to run CGroups without Kerberos. Which documentation is incorrect? Do we support running the LinuxContainerExecutor and CGroups without Kerberos?

- Labels:
  - Apache YARN
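For reference, the Apache NodeManagerCgroups page describes CGroups via the LinuxContainerExecutor on a non-Kerberized cluster. A sketch of the settings it covers; the group name and value choices below are illustrative, not prescriptive:

```xml
<!-- yarn-site.xml: sketch of LCE + CGroups on a non-secure cluster,
     per the Apache NodeManagerCgroups documentation. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- Illustrative group; must match the container-executor.cfg setup. -->
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <!-- In non-secure mode, containers run as this single local user. -->
  <name>yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user</name>
  <value>nobody</value>
</property>
```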