Member since 01-20-2014
- Posts: 578
- Kudos Received: 102
- Solutions: 94
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 6688 | 10-28-2015 10:28 PM |
| | 3557 | 10-10-2015 08:30 PM |
| | 5643 | 10-10-2015 08:02 PM |
| | 4100 | 10-07-2015 02:38 PM |
| | 2883 | 10-06-2015 01:24 AM |
09-08-2014 09:32 PM
							Thank you for the feedback.    
09-08-2014 05:49 AM
As the Definitive Guide says, "Hadoop allows the user to specify a combiner function to be run on the map output, and the combiner function's output forms the input to the reduce function." Frequently the code in the reducer and the combiner is similar, but it doesn't have to be.

Your question is unclear. Can you elaborate a bit?
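The quoted flow can be sketched locally. This is not the Hadoop API, just a minimal single-process word-count simulation (with hypothetical helper names) showing that each map task's output passes through the combiner, and the combiners' outputs form the reducer's input:

```python
from collections import defaultdict

def map_phase(line):
    # Emit a (word, 1) pair for each word in one input split.
    return [(word, 1) for word in line.split()]

def combine(pairs):
    # Runs per map task, on that task's output only.
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())

def reduce_phase(all_pairs):
    # Same aggregation logic as the combiner here (it does not
    # have to be), but it sees combined output from every map task.
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)

splits = ["the cat sat", "the dog sat"]
combined = [pair for split in splits for pair in combine(map_phase(split))]
result = reduce_phase(combined)
# result == {"the": 2, "cat": 1, "sat": 2, "dog": 1}
```

Because the combiner has already summed each map task's pairs, the reducer receives at most one pair per word per split instead of one pair per occurrence, which is exactly the shuffle-traffic saving a combiner buys.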
09-08-2014 05:16 AM
If you are only looking to learn, you are fine using multiple VMs on the same host, but performance will be poor if the VMs are starved for CPU or if they share disks.

It looks like you are just beginning to use Hadoop, so I would suggest first getting up to speed with installation and configuration rather than performance. Get yourself a copy of these two books:

- Hadoop Operations / Eric Sammer: http://shop.oreilly.com/product/0636920025085.do
- Hadoop: The Definitive Guide: http://shop.oreilly.com/product/9780596521981.do
09-08-2014 05:11 AM
We see a lot of these in the JobTracker jstack, so the namenode is responding:

```
"DataStreamer for file /tmp/hadoop-hadoop-user/7418759843_pipe_1371547789813_7CC40A5EC84074F51068D326FE4B44CD/_logs/history/job_201409040312_85799_1409897033005_hadoop-user_%5B3529B6C5248F26FE0B927AADBA7BDA41%2F7E4BD3F9FCBCBE4B block BP-2096330913-10.250.195.101-1373872395153:blk_468657822786954548_993063000" daemon prio=10 tid=0x00007f1f2a96f000 nid=0x7b56 in Object.wait() [0x00007f1ebc9e7000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:464)
        - locked <0x0000000625121b00> (a java.util.LinkedList)
```

Have you noticed a large spike in the number of blocks, and have you tuned your NN heap to deal with this rise? Did the JT pauses only begin when you turned on compression of the fsimage?
09-08-2014 01:54 AM
> Installed: hadoop-2.3.0+cdh5.1.2+816-1.cdh5.1.2.p0.3.el6.x86_64 (@cloudera-cdh5)
> Installed: hadoop-hdfs-2.3.0+cdh5.1.2+816-1.cdh5.1.2.p0.3.el6.x86_64 (@cloudera-cdh5)

Remove those two packages first, then ensure your CDH5 repo points to 5.0.3, not 5 (which will be the latest in 5.x):

```
# yum remove hadoop hadoop-hdfs
# yum clean all
# yum makecache
# yum list | grep cdh5.1.2   (should not list anything)
```
09-08-2014 12:14 AM · 1 Kudo
The map task's local output is not stored in HDFS; rather, it is written with standard file I/O to temporary directories on that specific node (see the property mapreduce.cluster.local.dir):

https://hadoop.apache.org/docs/r2.2.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml
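If you want to spread that intermediate map output across several local disks, the property can be set in mapred-site.xml. A sketch with hypothetical example paths (the default is `${hadoop.tmp.dir}/mapred/local`):

```xml
<!-- mapred-site.xml: comma-separated local directories, ideally one
     per physical disk; the example paths below are placeholders. -->
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>
```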
09-07-2014 11:14 PM
The error is originating from ZooKeeper. Which directory did you move: /var/log or /var/log/hbase? Did you try restarting the entire cluster?
09-04-2014 04:43 AM
How long does this pause normally last? If you are able to, capture 3-5 jstacks of the JobTracker spaced a few seconds apart and upload them here (pastebin or gist).
09-04-2014 04:15 AM
Hello Charles, as a beginner it would be easier if you experimented with Hadoop on AWS instances before buying hardware. You can begin by building a simple 3-4 node cluster. The hardware requirements depend on your planned workload, but you can start with nodes with 8 GB RAM and storage sized to your data set. Get familiar with the software and then look to scale up.