Member since: 05-02-2019

- 319 Posts
- 145 Kudos Received
- 59 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 8183 | 06-03-2019 09:31 PM |
| | 2189 | 05-22-2019 02:38 AM |
| | 2825 | 05-22-2019 02:21 AM |
| | 1686 | 05-04-2019 08:17 PM |
| | 2104 | 04-14-2019 12:06 AM |
			
    
	
		
		
09-04-2017 09:18 PM
							 The 1-day essentials course is available for free at http://public.hortonworksuniversity.com/hdp-overview-apache-hadoop-essentials-self-paced-training in a self-paced format.  Enjoy and good luck on the exam! 
						
					
09-04-2017 09:15 PM
							 As https://hortonworks.com/services/training/certification/hca-certification/ states, "the HCA certification is a multiple-choice exam that consists of 40 questions with a passing score of 75%".  Good luck!! 
						
					
07-31-2017 12:34 PM
Yep, this could work, but for a big cluster I could imagine it being time-consuming. The initial recursive listing (especially since it goes all the way down to the file level) could be quite large for a file system of any real size. The more time-consuming effort would be running the "hdfs dfs -count" command over and over and over. But, like you said, this should work. Preferably, I'd want the NN to just offer a "show me all quota details" option, or at least a "show me the directories with quotas". Since this function is not present, maybe there is a performance hit for the NN to determine this quickly that I'm not considering, as it seems lightweight to me. Thanks for your suggestion.
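If it helps, here is a rough, untested sketch of that brute-force approach (it assumes a standard hdfs client on the PATH and a user that can read the whole tree, and it does not handle paths containing spaces):

#!/bin/bash
# Walk the namespace once, then ask for quota info on each directory,
# keeping only directories where a name quota or space quota is set.
# This is the repeated "hdfs dfs -count -q" approach discussed above,
# so expect it to be slow on a large cluster.
hdfs dfs -ls -R / | awk '/^d/ {print $NF}' | while read -r dir; do
  # -count -q columns: QUOTA REM_QUOTA SPACE_QUOTA REM_SPACE_QUOTA ...
  # "none" in column 1 or 3 means that quota type is not set.
  # (If you also check / itself, note it reports a huge default name quota.)
  hdfs dfs -count -q "$dir" | awk -v d="$dir" \
    '$1 != "none" || $3 != "none" {print d, "NS_QUOTA=" $1, "SPACE_QUOTA=" $3}'
done

Filtering down to directories first at least avoids running -count against individual files, but it is still one NameNode call per directory.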
						
					
07-31-2017 09:20 AM (1 Kudo)
The HDFS Quota Guide, http://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html, shows how to list the details of a quota at a specific directory, but is there a way to see all quotas with one command (or at least a way to list all directories that have quotas, something like the way you can list all snapshottable dirs, which I could then programmatically iterate through to check individual quotas)? My "hunch" was that I could just check the / directory and see a roll-up of the two specific quotas shown first below, but as expected it only shows the details of that directory's own quota (if it exists).

[hdfs@node1 ~]$ hdfs dfs -count -v -q /user/testeng
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
         400             399            none             inf            1            0                  0 /user/testeng
[hdfs@node1 ~]$ hdfs dfs -count -v -q /user/testmar
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
        none             inf       134352500       134352500            1            0                  0 /user/testmar
[hdfs@node1 ~]$ 
[hdfs@node1 ~]$ 
[hdfs@node1 ~]$ hdfs dfs -count -v -q /            
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT       CONTENT_SIZE PATHNAME
9223372036854775807 9223372036854775735            none             inf           49           23          457221101 /
[hdfs@node1 ~]$  
						
					
Labels: Apache Hadoop
    
	
		
		
07-06-2017 01:18 PM
Great question, and unfortunately I don't think there is a well-agreed-upon formula/calculator out there, as "it depends" is so often the rule. Some considerations: the DataNode doesn't really know about the directory structure; it just stores (and copies, deletes, etc.) blocks as directed by the NameNode (often indirectly, since clients write the actual blocks). Additionally, block-level checksums are stored on disk alongside the block files themselves. It looks like there's some good info in the following HCC questions that might be of help to you:

https://community.hortonworks.com/questions/64677/datanode-heapsize-computation.html
https://community.hortonworks.com/questions/45381/do-i-need-to-tune-java-heap-size.html
https://community.hortonworks.com/questions/78981/data-node-heap-size-warning.html

Good luck and happy Hadooping!
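As a small illustration of that checksum point, each block file on a DataNode disk sits next to a .meta file that holds its checksums. The path and block IDs below are made up for illustration; the real location depends on dfs.datanode.data.dir and the block pool ID:

# Illustrative DataNode block directory listing (example paths/IDs only)
ls /hadoop/hdfs/data/current/BP-1234567890-10.0.0.1-1490000000000/current/finalized/subdir0/subdir0/
# blk_1073741825        blk_1073741825_1001.meta
# blk_1073741826        blk_1073741826_1002.meta

Roughly speaking, DataNode heap tends to track the number of block replicas a node holds rather than the raw number of bytes, which is part of why a one-size-fits-all formula is hard to pin down.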
						
					
06-22-2017 06:37 PM
Instead of "year as (year:int)", try "(int)year as castedYear:int".
						
					
06-07-2017 08:06 PM
							 Excellent.  Truthfully, the case sensitivity is a bit weird in Pig -- kind of like the rules of the English language.  Hehe! 
						
					
06-06-2017 03:25 PM
Regarding on-demand offerings, we do have an HDP Essentials course, but currently it is only available via the larger, bundled Self-Paced Learning Library described at https://hortonworks.com/self-paced-learning-library/. We are working towards offering individual on-demand courses, but we're not there yet. You could register for it individually via our live (remote, in most cases) delivery options shown at https://hortonworks.com/services/training/class/hadoop-essentials/.
						
					
06-04-2017 08:52 PM
I'd raise a separate HCC question for help with that. That way we'll reach the targeted audience, and your questions won't be buried within this one, which most will read as a cert question. That's a fancy way of saying I haven't set that particular version up myself and wouldn't be much help until after I got my hands dirty with it. 😉
						
					
06-04-2017 04:38 AM
							 It did the trick for me.  I sure hope it helps out @Joan Viladrosa, too!  Thanks, Sriharsha! 
						
					