Member since 11-24-2015

223 Posts | 10 Kudos Received | 0 Solutions
            
        
			
    
	
		
		
02-06-2017 02:47 PM

1 Kudo

Are snapshots the preferred way to take backups in Hadoop production environments? Are snapshots used to back up both the NameNode metadata (fsimage, edits) and the HDFS data on the DataNodes? I would be very interested to know how forum members use snapshots in their Hadoop environments. Appreciate the feedback.
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, HDFS
			
    
	
		
		
02-03-2017 07:59 PM

But are people going to access only one block? Big Data by definition implies processing thousands of blocks, so why would faster access to a single block matter?
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-03-2017 07:52 PM

In the manual installation cited in my first post, where is the HDP part? It all seems to be the installation and setup of Apache Hadoop and its suite of products (HDFS, MapReduce, YARN, ZooKeeper, HBase, Hive, etc.); that is, I don't see anything really different from a standard Apache Hadoop install. I don't see any Ambari installation in that manual sequence either. What is Hortonworks-specific about the manual installation? Appreciate the clarification.
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-03-2017 06:38 PM

By the way, what is the actual advantage of a co-located client? It only stores the first block of the file on the client's DataNode, right? The rest of the blocks are distributed across HDFS. So what is the big advantage of storing the first block locally? Does that really help performance? Appreciate the insights.
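For readers following this thread, here is a toy sketch of the replica placement rule as the HDFS architecture guide describes it for a replication factor of 3: the first replica of each block goes to the writer's own node when the writer is a DataNode, the second to a node on a different rack, and the third to another node on that same remote rack. The rule is applied per block, not only to the first block of a file. This is a simplified model, not the actual HDFS code; `place_replicas`, the node names, and the two-rack topology are all hypothetical, and the sketch assumes at least two racks with at least two DataNodes each.

```python
import random

def place_replicas(writer, topology, replication=3, rng=None):
    """Toy model of default replica placement for ONE block (not HDFS code).

    topology maps a rack name to its list of DataNode names (hypothetical).
    Assumes at least two racks with at least two DataNodes each.
    """
    rng = rng or random.Random()
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    nodes = list(rack_of)

    # Replica 1: the writer's own node if it is a DataNode, else a random node.
    first = writer if writer in rack_of else rng.choice(nodes)
    chosen = [first]

    if replication >= 2:
        # Replica 2: any node on a different rack from the first replica.
        remote = [n for n in nodes if rack_of[n] != rack_of[first]]
        chosen.append(rng.choice(remote))

    if replication >= 3:
        # Replica 3: a different node on the same rack as the second replica.
        near_second = [n for n in nodes
                       if rack_of[n] == rack_of[chosen[1]] and n not in chosen]
        chosen.append(rng.choice(near_second))

    return chosen

TOPOLOGY = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}

# Place three blocks written by a client that runs on DataNode dn1.
for block in range(3):
    print(block, place_replicas("dn1", TOPOLOGY, rng=random.Random(block)))
```

In this model every block written from `dn1` keeps one replica on `dn1`, while the other two replicas land on the remote rack, so the data is still spread across the cluster.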
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-03-2017 06:32 PM

So what does the edge node need in order to connect to the cluster: the Hadoop software, core-site.xml, hdfs-site.xml, and what else? Appreciate the clarification.
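As a small illustration of what the client-side configuration carries, the sketch below parses a minimal `core-site.xml` and reads `fs.defaultFS`, the property that tells a Hadoop client which NameNode to contact. The hostname and port are placeholders, `read_hadoop_conf` is a hypothetical helper, and a real edge node would normally receive the cluster's full set of client configuration files rather than a single property.

```python
import xml.etree.ElementTree as ET

# Minimal core-site.xml for illustration; the NameNode address is a placeholder.
CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""

def read_hadoop_conf(xml_text):
    """Return a dict of name -> value from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    return {prop.findtext("name"): prop.findtext("value")
            for prop in root.iter("property")}

conf = read_hadoop_conf(CORE_SITE)
print(conf["fs.defaultFS"])
```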
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-03-2017 04:34 PM

Is the Hortonworks Data Platform installation outlined at http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_installing_manually_book/content/ch_getting_ready_chapter.html the standard way to set up the Hortonworks Hadoop platform? I can see that it is listed as the 'manual setup'. So is the automated way to install Ambari and then set up the cluster through Ambari? Appreciate the clarification.
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
						
						
		
	
					
			
		
	
	
	
	
				
		
	
	
			
    
	
		
		
02-03-2017 02:52 PM

Hello, is there any response on whether the 'edge node' is a DataNode? Appreciate the feedback.
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-01-2017 06:14 PM

If moving files into HDFS from a DataNode does not prevent distribution, then when does the co-located client dynamic come into play? Also, is the edge node you mention a DataNode? If not, is it simply a machine with the Hadoop software installed to facilitate interaction with HDFS? Appreciate the feedback.
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-01-2017 01:30 PM

In production setups, are files loaded into HDFS from a particular machine? If so, and that machine is also a DataNode, would it not be identified as a co-located client and thus prevent data from being distributed across the cluster? Or is the standard practice to load files from the NameNode host? What other practices are commonly used for loading files into HDFS? Appreciate the insights.
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop
 
        













