Member since 11-19-2015
158 Posts
25 Kudos Received
21 Solutions

        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 16573 | 09-01-2018 01:27 AM |
| | 2478 | 09-01-2018 01:18 AM |
| | 6961 | 08-20-2018 09:39 PM |
| | 1284 | 07-20-2018 04:51 PM |
| | 3073 | 07-16-2018 09:41 PM |
			
    
	
		
		
07-30-2018 06:53 PM
@Michael Bronson - The terms "master/worker" don't really mean anything in Kafka.

17 Kafka brokers seems like a lot (we have about that many brokers in AWS handling about 2 million messages per day), but yes, a minimum of 5 ZooKeeper nodes is encouraged to account for maintenance and hardware failure, as mentioned.
						
					
07-25-2018 06:58 PM
1 Kudo
There is no such rule for Kafka brokers.

ZooKeeper needs to maintain a quorum: a majority, floor(n/2) + 1, of its n machines must agree on leader-election values and locks. That is why ensembles are sized to an odd number, to accommodate hardware and network failure scenarios.

According to "Kafka - The Definitive Guide", as well as the Apache ZooKeeper site, you will generally see negative side effects from having more than 5 or 7 ZooKeeper servers in total serving the applications that use them.

You should have more than 3 ZooKeepers because if one goes down (for maintenance, say), you are left with only 2 and cannot afford any further failure without losing quorum. With 5 servers, two can go down and you still have 2 servers + 1 available for the "tie breaker" vote. With 7, you can lose up to 3 ZooKeepers and still be good.
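The majority arithmetic can be checked with a short sketch. The function names are hypothetical, and this is plain arithmetic rather than any ZooKeeper API.

```python
def quorum_size(n: int) -> int:
    """Smallest majority of an n-server ensemble: floor(n/2) + 1."""
    if n < 1:
        raise ValueError("ensemble needs at least one server")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can be down while a majority is still reachable."""
    return n - quorum_size(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

An even-sized ensemble tolerates no more failures than the odd size just below it (4 servers tolerate 1 failure, the same as 3), which is why odd sizes are the usual choice.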
						
					
07-20-2018 04:52 PM
TaskTracker and JobTracker don't exist with YARN. The default HDFS replication factor is 3.
						
					
07-20-2018 04:51 PM
1 Kudo
What component are you asking about? What are you trying to achieve?

They typically call each other over combinations of separate protocols.
- HDFS and YARN interact via RPC/IPC.
- Ambari Server and Agents communicate over HTTP & REST. Ambari also needs JDBC connections to the backing database.
- Hive, HBase, and Spark can use Thrift Server. The Hive metastore uses JDBC.
- Kafka has its own TCP protocol.

I would suggest starting on a specific component for the use case(s) you want. Hadoop itself comprises only HDFS & YARN + MapReduce.
						
					
07-16-2018 09:41 PM
1 Kudo
@Sambasivam Subramanian
By definition, an edge node is just a host with only clients installed and configured. If you install no server services in Ambari for a host, then you will end up with an edge node for the clients that you selected.
						
					
05-19-2018 05:13 AM
The configs are on the top line. It will say "Configs: " if none are customized.

$ kafka-topics --describe --topic $TOPIC --zookeeper $ZOOKEEPER
Topic:******** PartitionCount:20       ReplicationFactor:3     Configs:retention.ms=10800000
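
If you want to read those overrides from a script, here is a minimal sketch that parses the summary line. It assumes the exact `Key:value` layout shown in the output above (other Kafka versions may format this line differently), and the function name and the topic name `my-topic` are hypothetical.

```python
def parse_describe_header(line: str) -> dict:
    """Parse a 'Topic:... PartitionCount:... Configs:...' summary line."""
    fields = {}
    for token in line.split():
        # Each whitespace-separated token is assumed to look like Key:value.
        key, _, value = token.partition(":")
        fields[key] = value
    # Configs is a comma-separated list of key=value overrides (may be empty).
    overrides = {}
    for pair in filter(None, fields.get("Configs", "").split(",")):
        name, _, setting = pair.partition("=")
        overrides[name] = setting
    fields["Configs"] = overrides
    return fields


if __name__ == "__main__":
    header = ("Topic:my-topic PartitionCount:20       "
              "ReplicationFactor:3     Configs:retention.ms=10800000")
    print(parse_describe_header(header)["Configs"])
```

An empty dictionary for `Configs` means no topic-level settings are customized. In the output above, `retention.ms=10800000` works out to 3 hours.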
 
						
					
05-19-2018 05:10 AM
There is no such support for renaming: https://issues.apache.org/jira/browse/KAFKA-2333

If you want to clone, then use MirrorMaker: https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
						
					
05-14-2018 03:31 AM
@Michael Bronson Kafka stores the latest offsets in memory before they are sent to disk, so the more memory the better, with a max of 8G.

I would assume that the heap properties can be set from Ambari rather than individually on each broker, but I don't use Kafka from HDP, so I can't say.
						
					
05-11-2018 01:16 AM
1 Kudo
The recommendation here would be to increase the heap space allocated to the Kafka process or reduce the number of other processes running on the same server. For example, in a production environment, the Kafka brokers should be standalone servers, not on the same hardware as ZooKeeper or other Hadoop processes.
						
					
04-10-2018 08:35 PM
Yes, the commands work the same, assuming you have winutils.exe on your PATH and HADOOP_HOME and HADOOP_CONF_DIR defined as environment variables.

Windows is not as stable or as well supported as Linux, however.
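
A quick way to confirm those prerequisites before running anything is a small check like the sketch below. The function name is hypothetical, and it only tests for presence: it does not verify that winutils.exe matches your Hadoop version or that the directories are valid.

```python
import os
import shutil


def missing_hadoop_prereqs(env=None, which=shutil.which):
    """Return the names of prerequisites that appear to be missing."""
    env = os.environ if env is None else env
    missing = [var for var in ("HADOOP_HOME", "HADOOP_CONF_DIR")
               if not env.get(var)]
    # shutil.which searches the given PATH string for the executable.
    if which("winutils.exe", path=env.get("PATH")) is None:
        missing.append("winutils.exe")
    return missing


if __name__ == "__main__":
    problems = missing_hadoop_prereqs()
    if problems:
        print("missing: " + ", ".join(problems))
    else:
        print("all prerequisites found")
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.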
						
					