Member since 06-17-2015
61 Posts
20 Kudos Received
4 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2604 | 01-21-2017 06:18 PM |
| | 3097 | 08-19-2016 06:24 AM |
| | 2032 | 06-09-2016 03:23 AM |
| | 3777 | 05-27-2016 08:27 AM |
			
    
	
		
		
		08-01-2016
	
		
		07:09 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Thanks a lot, Kuldeep. I agree, and that's why I wanted suggestions from experts like you 🙂
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		08-01-2016
	
		
		06:11 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi Team,

I have 3 virtual machines in an HDP cluster. If the DataNode disks have a huge capacity (in TBs), can I use the same disk with different mount points to store DataNode data, NameNode (NN) data, SN data, JT data (master node data), and /usr and /var? I know that if the disk has an issue, all of that data will be affected.

Basically, I want to know: if my DataNode disks have a lot of space (in TBs), do you recommend creating different mounts on the same DataNode disks for different purposes, such as /usr and /var, and for storing NN, SN and JT data? Also, each HDP version's data is in /usr/hdp.
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, Apache HBase
			
    
	
		
		
		07-19-2016
	
		
		06:02 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
We can store application-related data and logs on SAN/NAS. However, SAN/NAS is not recommended for I/O-sensitive and CPU-bound jobs; the aim is to avoid bottlenecks when reading data from disk or over the network, or when processing data.

So:
- Logs/application data --> SAN/NAS
- DataNode data --> DAS in a JBOD configuration, no RAID
- NN/SN/JT nodes --> should be highly available [RAID 5/10, depending on the use case]

Hadoop is a scale-out, shared-nothing architecture.

http://www.bluedata.com/blog/2015/12/separating-hadoop-compute-and-storage/
https://community.emc.com/servlet/JiveServlet/previewBody/41473-102-1-132603/Virtualizing%20Hadoop%20in%20Large%20Scale%20Infrastructures.pdf

I also understand that the true cost of DAS is sometimes higher once Hadoop replication is taken into account, but this is how Hadoop thrives (one of the key tenets of Hadoop is to bring the compute to the storage instead of the storage to the compute).
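
To put a rough number on the replication cost mentioned above, here is a minimal sketch in Python. It assumes the HDFS default replication factor of 3; the 25% free-space headroom, the 250 GB example figure and the function name `raw_hdfs_capacity_gb` are illustrative assumptions, not values from any Hortonworks sizing guide.

```python
def raw_hdfs_capacity_gb(logical_gb, replication=3, headroom=0.25):
    """Estimate raw DataNode disk (in GB) needed to hold `logical_gb` of data.

    replication: HDFS block replication factor (3 is the HDFS default).
    headroom:    fraction of raw disk kept free (assumed value, tune as needed).
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    # Every block is stored `replication` times; the usable share of the
    # raw disk is (1 - headroom), so divide by it to get the raw total.
    return logical_gb * replication / (1 - headroom)


if __name__ == "__main__":
    # Hypothetical example: a 250 GB dataset.
    print(raw_hdfs_capacity_gb(250))
```

This only covers the replicated data itself; space for logs, the OS and temporary files would come on top, which is one reason DAS can look more expensive than its per-disk price suggests.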
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-19-2016
	
		
		05:55 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@Sbandaru: I researched this in more depth, and the conclusion is that we don't need an edge node.

We don't need an edge node if the Hadoop cluster and the application are in the same network. It is only needed when the Hadoop cluster and the application are in different networks; in that case the edge node acts as a gateway to the Hadoop cluster (like a proxy).

Thanks for your inputs.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-16-2016
	
		
		03:34 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi Team,

We are going to deploy HDP 2.3.4 for a Big Env setup.
Can someone please explain the architecture of an edge node in Hadoop? I am only able to find the definition on the internet. I have some queries:

1) What is an edge node?
2) When and why do we need it?
3) Does every production cluster contain an edge node?
4) Is the edge node part of the cluster? (What advantages do we have if it is inside the cluster? Does it store any blocks of data in HDFS? Is there any performance improvement?)
5) Should the edge node be outside the cluster?
6) Please refer me to any docs where I can learn about it, preferably Hortonworks docs.
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop
			
    
	
		
		
		07-13-2016
	
		
		05:32 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 will be helpful 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-13-2016
	
		
		02:20 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
The answer looks good, thanks. Can you please advise how to determine disk I/O requirements in a cluster? Which factors should be considered for the disk I/O calculation?
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-13-2016
	
		
		09:31 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi Team,

Can someone kindly advise how to plan a Hortonworks Hadoop cluster if my application is not running any MapReduce jobs and I will be loading 250 GB of data into HBase? I understand we need to take care of the points below:

- How to plan my storage?
- What to use, disks or RAID, for NN and DataNodes?
- How to plan my CPU?
- How to plan my memory?
- How to plan the network bandwidth?
						
					
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
Labels: Apache Hadoop, Apache HBase
 
        













