Member since 07-16-2015

177 Posts
28 Kudos Received
19 Solutions
        My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 16703 | 11-14-2017 01:11 AM |
| | 62683 | 11-03-2017 06:53 AM |
| | 4815 | 11-03-2017 06:18 AM |
| | 14306 | 09-12-2017 05:51 AM |
| | 2412 | 09-08-2017 02:50 AM |
11-03-2017 06:43 AM
This issue just means that your shell action exited with an error code (different from 0). If you want to know the reason, you need to add logging inside the shell script to find out what happened.

Be aware that the script executes locally on a data-node, so any log the script writes will be on that particular data-node.
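A minimal sketch of such logging, assuming a POSIX shell; the function name and the log path are hypothetical choices, not anything Oozie mandates:

```shell
#!/bin/sh
# Hypothetical wrapper: record the command's output and its exit code in a
# local file, so the failure cause can be found on the data-node that
# actually ran the action. The LOGFILE path is an assumption.
LOGFILE="${LOGFILE:-/tmp/shell_action.log}"

run_logged() {
    echo "action started at $(date)" >>"$LOGFILE"
    "$@" >>"$LOGFILE" 2>&1
    rc=$?
    echo "action finished with exit code $rc" >>"$LOGFILE"
    return $rc   # propagate the code unchanged so Oozie still sees the failure
}
```

The exit code is returned unchanged, so the action still fails in Oozie; the log on the data-node just tells you why.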
11-03-2017 06:38 AM
Alternatively, you could look into "yarn queues" and resource allocation.

This will not "restrict" the number of mappers or reducers, but it will control how many can run concurrently, by giving them access to only a subset of the available resources.
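With the Fair Scheduler, for example, such a cap can be expressed in the allocation file; the queue name and the limits below are made-up values for illustration:

```xml
<!-- fair-scheduler.xml sketch: jobs submitted to this queue share at most
     8 GB of memory and 4 vcores, and at most 2 of them run at once.
     Queue name and limits are assumptions, not recommendations. -->
<allocations>
  <queue name="batch">
    <maxResources>8192 mb, 4 vcores</maxResources>
    <maxRunningApps>2</maxRunningApps>
  </queue>
</allocations>
```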
11-03-2017 06:31 AM
First: save the content of the namenode directory.

Second: can you launch the second namenode on its own? Does it start?

If yes, you should be able to start the data-nodes and regain access to the data.
11-03-2017 06:18 AM · 1 Kudo
Hi,

The concept of Hive partitions does not map to HBase tables. So if you want HBase as the storage, you will need to work around your use case.

You could use one HBase table with a row key constructed from the partition value. That way you should be able to query the HBase table by row key and avoid a full scan of the table.

Or you could have one HBase table per "partition" (which also means one Hive table per partition).

Or you may conclude that HBase does not answer your need, and stay with Hive.

Regards, Mathieu
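A sketch of the composite-row-key idea; the separator, partition value, table name, and record id are all made-up examples:

```shell
#!/bin/sh
# Build a row key prefixed with the "partition" value, so a prefix scan
# can replace a full table scan. All values here are illustrative.
partition="2017-11-03"
record_id="order-42"
rowkey="${partition}#${record_id}"
echo "$rowkey"

# In the hbase shell, reading one "partition" then becomes a prefix scan:
#   scan 'mytable', {ROWPREFIXFILTER => '2017-11-03#'}
```

One trade-off to keep in mind: rows sharing a prefix land in the same region, so a time-based prefix can concentrate writes on one region server.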
10-25-2017 02:57 AM
I think what you are looking for is a setting located in the "core-site.xml" file (in the HDFS configuration). Search for "proxyuser" in the Cloudera documentation.

Regards, Mathieu
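The entries in question look like this in core-site.xml; the "hue" user below is just an example of a service account that impersonates end users:

```xml
<!-- Allow the (example) "hue" user to impersonate users connecting from
     any host and belonging to any group. Tighten the wildcards in
     production to the actual hosts and groups you need. -->
<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
```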
09-12-2017 05:51 AM
I am not sure this information is available.

You could go with the "yarn logs" command, or do it the basic way on the command line:
- use pdsh to distribute the same command to every data-node
- run a find on the container id

Regards, Mathieu
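Both approaches, sketched; the application id, container id, host range, and log directory are all assumptions to be replaced with your own values:

```shell
#!/bin/sh
# Illustrative ids only - take the real ones from the ResourceManager UI.
APP_ID="application_1500000000000_0001"
CONTAINER_ID="container_1500000000000_0001_01_000002"

# 1) If log aggregation is enabled, YARN can fetch the logs directly:
#    yarn logs -applicationId "$APP_ID"

# 2) The brute-force way: run the same find on every data-node with pdsh
#    (host range and container-log directory are assumptions):
#    pdsh -w dn[01-20] "find /yarn/container-logs -name '*$CONTAINER_ID*'"
echo "looking for $CONTAINER_ID of $APP_ID"
```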
09-08-2017 02:50 AM
I believe this 30s wait time is hard-coded into the Cloudera agent. I don't think it can be altered other than by a really dirty modification, which I wouldn't recommend.

Regards, Mathieu
08-11-2017 06:15 AM
As far as I understand how Impala works, that is the expected behaviour. It is indeed intended to speed up later queries that use the same sets of data.
07-25-2017 12:52 AM
Hi,

I personally don't know of that possibility. But as a workaround, you can reference a morphline on a network share accessible from all nodes (I guess you already know that).

Regards, Mathieu
06-12-2017 04:45 AM
From my understanding, when you use the Sentry HDFS synchronization plugin you only need to set the following ownership and permissions: hive:hive / 771.

https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_sg_hiveserver2_security.html#concept_vxf_pgx_nm
https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concept_z5b_42s_p4__section_lvc_4g4_rp

The plugin then manages the other permissions according to what has been granted in Sentry.

If you set the permissions yourself, there is no point in using the Sentry HDFS synchronization plugin.
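On the cluster that single setting is applied with the hdfs CLI; the warehouse path below is the common default and may differ on your installation. The runnable part of this sketch just demonstrates what mode 771 means, on a local directory:

```shell
#!/bin/sh
# On the cluster (not run here), the setting referred to above would be:
#   sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
#   sudo -u hdfs hdfs dfs -chmod -R 771 /user/hive/warehouse
#
# 771 = rwx for the owner and group, execute-only for others; shown locally:
demo_dir=$(mktemp -d)
chmod 771 "$demo_dir"
mode=$(stat -c %a "$demo_dir" 2>/dev/null || stat -f %Lp "$demo_dir")
echo "mode is $mode"
```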