Member since 01-20-2014

| Posts | Kudos Received | Solutions |
|---|---|---|
| 578 | 102 | 94 |
My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 6677 | 10-28-2015 10:28 PM |
| | 3543 | 10-10-2015 08:30 PM |
| | 5638 | 10-10-2015 08:02 PM |
| | 4095 | 10-07-2015 02:38 PM |
| | 2875 | 10-06-2015 01:24 AM |
09-20-2014 08:06 AM

The error message here might hold the key. Can you verify why it might not be executable? Did you change permissions at some point?

    /opt/cloudera-manager/cm-5.1.2/lib64/cmf/service/common/cloudera-config.sh: line 172: /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh: Permission denied
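A rough sketch of how to check that theory (the parcel path is taken from the error above; the exact ownership and mode your install expects may differ, so treat the chmod as an assumption to verify first):

```bash
# Inspect the current mode and ownership of the script the agent fails to source.
ls -l /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh

# Check that every directory on the path is traversable by the user running the agent.
namei -l /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh

# If the mode was changed at some point, restoring read/execute access should let
# cloudera-config.sh source it again (adjust 755 to your site standard).
sudo chmod 755 /pkg/moip/mo10755/work/mzpl/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/meta/cdh_env.sh
```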
09-20-2014 04:34 AM

Are you able to provide us the logs from the ZooKeeper instance (/var/log/zookeeper)? It should tell us why it's not starting. Please paste the logs into pastebin and provide the URL here. You just need to provide the section covering the last startup attempt and the failure.
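Something like the following is usually enough to capture that section (the exact log file name varies by install, so check the directory listing first):

```bash
# Find the most recently written ZooKeeper log file.
ls -lt /var/log/zookeeper/

# Capture the tail covering the last startup attempt; adjust the file name as needed.
tail -n 300 /var/log/zookeeper/*.log
```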
09-18-2014 07:35 PM

Thank you for the update, glad you were able to resolve the problem.
09-18-2014 02:31 AM

Since you're using VMware, you could very well do what QuickStart basically is:

- Create one VM and configure all services as you wish.
- Take a snapshot or export the appliance.
- Clone it ten times for ten virtual machines.

When you want to update CDH, just update the master image and repeat the process.
09-18-2014 02:11 AM

I am not aware of how you can get the QuickStart VM to work with ESXi.

Is there anything specific you need from the QuickStart VM? Why not create a blank VM with CentOS 6.4 and install CDH+CM from scratch? The whole install process is pretty easy.

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installation-Guide/Cloudera-Manager-Installation-Guide.html

If you do happen to try this and run into any issues, please start a new thread and we'll be happy to assist.
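For what it's worth, the quickest route at the time was the Cloudera Manager installer binary; a minimal sketch, with the download URL quoted from memory (verify it against the installation guide linked above):

```bash
# On the fresh CentOS 6.4 VM, download and run the Cloudera Manager 5 installer.
# URL and file name are assumptions; confirm them in the installation guide.
wget http://archive.cloudera.com/cm5/installer/latest/cloudera-manager-installer.bin
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin

# Then open http://<vm-address>:7180 in a browser and let the wizard deploy CDH.
```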
09-17-2014 07:35 PM

Are you able to try the VMware image with the Player product and let us know if it works for you?

http://www.vmware.com/products/player
http://www.cloudera.com/content/support/en/downloads/quickstart_vms/cdh-5-1-x1.html
09-15-2014 02:41 AM

The advantage of using HAR files is not in saving disk space but in needing less metadata. Please read the blog link I pasted earlier.

Quote:

===
A small file is one which is significantly smaller than the HDFS block size (default 64MB). If you're storing small files, then you probably have lots of them (otherwise you wouldn't turn to Hadoop), and the problem is that HDFS can't handle lots of files.

Every file, directory and block in HDFS is represented as an object in the namenode's memory, each of which occupies 150 bytes, as a rule of thumb. So 10 million files, each using a block, would use about 3 gigabytes of memory. Scaling up much beyond this level is a problem with current hardware. Certainly a billion files is not feasible.

Furthermore, HDFS is not geared up to efficiently accessing small files: it is primarily designed for streaming access of large files. Reading through small files normally causes lots of seeks and lots of hopping from datanode to datanode to retrieve each small file, all of which is an inefficient data access pattern.
===
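To make the quoted rule of thumb concrete, the "about 3 gigabytes" figure comes from counting one file object plus one block object per small file; a quick back-of-the-envelope check:

```bash
# Rough namenode heap estimate for 10 million single-block files.
files=10000000
objects=$((files * 2))      # one file object + one block object per file
bytes=$((objects * 150))    # ~150 bytes of namenode memory per object (rule of thumb)
echo "$bytes bytes"         # 3000000000 bytes, i.e. roughly 3 GB of heap
```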
09-15-2014 01:48 AM

If you use HAR to combine 8 smaller files (each less than 1M), it would occupy just one block. More than the disk space saved, you save on metadata storage (on the namenode and datanodes), and this is far more significant in the long term for performance.
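Creating the archive is a one-liner; a minimal sketch, with /user/foo used as a hypothetical placeholder for your own directories:

```bash
# Pack everything under /user/foo/small into a single archive named small.har,
# written to /user/foo/archived. Both paths are placeholders.
hadoop archive -archiveName small.har -p /user/foo/small /user/foo/archived

# The packed files stay readable through the har:// filesystem.
hadoop fs -ls har:///user/foo/archived/small.har
```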
09-15-2014 01:44 AM

The block on the file system isn't a fixed-size file with padding; rather, it is just a unit of storage. The block size can be a maximum of 128MB (or as configured), so if a file is smaller, it will just occupy the minimum needed space.

In my previous response, I said 8 small files would take up 3GB of space. This is incorrect. The space taken up on the cluster is still just the file size times 3 for each block. Regardless of file size, you can divide the size by the block size (default 128M) and round up to the next whole number; this gives you the number of blocks. So in this case, the 3922-byte file uses one block to store its contents.
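You can confirm this on your own cluster; a small sketch, with /user/foo/data.txt as a hypothetical path:

```bash
# Report the file length (bytes), its block size, and the replication factor.
hadoop fs -stat "size=%b blocksize=%o replication=%r" /user/foo/data.txt

# fsck shows exactly how many blocks the file occupies and where the replicas are.
sudo -u hdfs hdfs fsck /user/foo/data.txt -files -blocks -locations
```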
09-15-2014 12:12 AM

> The HDFS block size in my system is set to be 128m. Does it mean that if I put 8 files less than 128m to HDFS, they would occupy 3G disk space (replication factor = 3)?

Yes, this is right. HDFS blocks are not shared among files.

> How could I know the actual occupied space of an HDFS file?

The -ls command tells you this. In the example below, the jar file is 3922 bytes long.

    # sudo -u hdfs hadoop fs -ls /user/oozie/share/lib/sqoop/hive-builtins-0.10.0-cdh4.7.0.jar
    -rw-r--r--   3 oozie oozie       3922 2014-09-14 06:17 /user/oozie/share/lib/sqoop/hive-builtins-0.10.0-cdh4.7.0.jar

> And how about if I use HAR to archive these 8 files? Can it save some space?

Using HAR is a good idea. More ideas about dealing with the small files problem are in this link:
http://blog.cloudera.com/blog/2009/02/the-small-files-problem/
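A couple of other commands that help answer the "actual occupied space" question (reusing the jar path from the example above; these report the logical size before replication, so multiply by your replication factor for raw disk usage):

```bash
# Total logical size in bytes of a file or directory tree (before replication).
hadoop fs -du -s /user/oozie/share/lib/sqoop/hive-builtins-0.10.0-cdh4.7.0.jar

# Directory count, file count, and content size; handy for spotting lots of small files.
hadoop fs -count /user/oozie/share/lib/sqoop
```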