Member since: 05-16-2016

785 Posts · 114 Kudos Received · 39 Solutions

My Accepted Solutions

| Title | Views | Posted |
|---|---|---|
| | 2328 | 06-12-2019 09:27 AM |
| | 3578 | 05-27-2019 08:29 AM |
| | 5724 | 05-27-2018 08:49 AM |
| | 5243 | 05-05-2018 10:47 PM |
| | 3113 | 05-05-2018 07:32 AM |
02-20-2018 09:53 PM

Does your LOCATION (HDFS path) reside on S3? hdfs://ip-10-0-1-138.eu-central-1.compute.internal/files/test/avro';
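If the data actually lives on S3, the location needs an s3a:// URI rather than an hdfs:// one. A quick way to check where the files really are, assuming the hadoop-aws connector is configured; the bucket name below is a placeholder:

```sh
# Does the path exist on the cluster's HDFS?
hadoop fs -ls hdfs://ip-10-0-1-138.eu-central-1.compute.internal/files/test/avro

# Or does it live in S3? (my-bucket is a placeholder)
hadoop fs -ls s3a://my-bucket/files/test/avro
```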
02-20-2018 09:45 PM

1. Just run a plain sqoop list-tables.
2. Check whether the port (5432) is listening.
3. Check whether your JDBC jar is in place, e.g. postgresql-9.2-1002.jdbc4.jar.
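A sketch of those three checks from the shell; the host, database name, user, and driver directory are placeholders to adapt to your setup:

```sh
# 1. Can Sqoop reach the database at all?
sqoop list-tables \
  --connect jdbc:postgresql://db-host:5432/mydb \
  --username myuser -P

# 2. Is PostgreSQL listening on port 5432?
netstat -tln | grep 5432

# 3. Is the PostgreSQL JDBC driver on Sqoop's classpath?
ls /var/lib/sqoop/ | grep -i postgresql
```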
02-19-2018 11:08 PM

1 Kudo

For the DataNode block count threshold alert, try running the balancer and see if that fixes your problem.
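A minimal sketch of running the balancer, assuming you run it as the hdfs superuser; the 10% threshold is an example value:

```sh
# Rebalance blocks across DataNodes until each node's utilization
# is within 10% of the cluster average.
sudo -u hdfs hdfs balancer -threshold 10
```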
02-19-2018 11:07 PM

1 Kudo

Having too many small files in a Hadoop cluster goes against its mantra: a few large files work best in a Hadoop cluster. The link below explains why too many small files are bad for a Hadoop cluster:

https://blog.cloudera.com/blog/2009/02/the-small-files-problem/

Just curious what type of small files those are. If they are in Parquet format, there is code on GitHub that can merge those files and keep them in the cluster, sized according to your data block size.
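To gauge how bad the problem is, you can compare file counts against total size per directory; the warehouse path below is a placeholder:

```sh
# Output columns: DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME.
# A high FILE_COUNT with a small CONTENT_SIZE flags a small-files hotspot.
hdfs dfs -count -h /user/hive/warehouse/*
```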
02-19-2018 10:53 PM

You might want to use a keytab to avoid expiration of the Kerberos ticket. Check whether you have a valid Kerberos ticket:

https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_sg_kadmin_kerberos_keytab.html
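A quick sketch of checking the ticket and renewing it from a keytab; the keytab path and principal are placeholders for your environment:

```sh
# Show the current ticket (if any) and its expiry.
klist

# Obtain a fresh ticket non-interactively from a keytab.
kinit -kt /etc/security/keytabs/myuser.keytab myuser@EXAMPLE.COM
```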
02-19-2018 10:46 PM

A transport exception is generic; the client is not able to communicate with HiveServer2.

- Check whether you have enough resources (vcores/memory) available, for example in the YARN web UI.
- Check whether you have enough memory on the Linux host where you have deployed HiveServer2.
- Check whether you can SSH into the host where HiveServer2 is running.
- Finally, see whether HiveServer2 is up and running in green status.
- How many roles are on the host where you have deployed HiveServer2?
- Can you provide me the full stack trace from the HiveServer2 log? Check whether any OutOfMemoryError is being thrown again, since you had a heap issue before.
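As a first step, a basic connectivity check from the client side can separate reachability problems from query problems; the hostname and the default port 10000 are placeholders for your HiveServer2 instance:

```sh
# If this fails with a transport error, the problem is network
# reachability or the HiveServer2 process itself, not the query.
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -e "SHOW DATABASES;"
```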
02-13-2018 09:28 PM

Just a quick note: you can run Pig in local mode as well as in MapReduce mode. By default, LOAD looks for your data on HDFS, in a tab-delimited file, using the default load function PigStorage. If you start Pig with -x local, it will look for the data on the local filesystem instead. Nice that you found the fix. @SGeorge
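The two launch modes side by side; the script name is a placeholder:

```sh
# Local mode: reads and writes the local filesystem, runs in a single JVM.
pig -x local myscript.pig

# MapReduce mode (the default): reads and writes HDFS.
pig -x mapreduce myscript.pig
```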
02-13-2018 06:53 PM

@Cloudera learning - Did you have a chance to raise the DataNode bandwidth and DataNode heap size, and to increase the replication work multiplier, before kicking off the decommission? That will certainly improve performance. Also, if your decommission is running forever, I would suggest you recommission the node and then perform the decommission again.
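One of those knobs can be changed at runtime; the 100 MB/s value below is just an example, and the multiplier is a NameNode configuration property rather than a command:

```sh
# Raise the bandwidth each DataNode may use for block moves
# (value is in bytes per second; 104857600 = 100 MB/s).
sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth 104857600

# The replication work multiplier is the hdfs-site.xml property
# dfs.namenode.replication.work.multiplier.per.iteration; raise it in
# hdfs-site.xml (or the Cloudera Manager safety valve) and restart
# the NameNode for it to take effect.
```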
02-13-2018 06:36 PM

Could you share the Cloudera SCM server logs and agent logs, specifically the full stack trace that has the exceptions or errors? If the directory is empty, that means something is wrong in the Cloudera SCM agent. We can narrow it down if you provide the above logs.
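The usual default log locations on a Cloudera Manager deployment, assuming a standard installation (paths may differ if yours was customized):

```sh
# Cloudera Manager server log (on the CM server host).
tail -n 200 /var/log/cloudera-scm-server/cloudera-scm-server.log

# Cloudera Manager agent log (on the affected host).
tail -n 200 /var/log/cloudera-scm-agent/cloudera-scm-agent.log
```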
02-07-2018 08:02 PM

1 Kudo

Try this and please let me know if it fixes the issue:

hadoop jar BigData/mbds_anagrammes.jar org.mbds.hadoop.anagrammes.Anagrammes /mot.txt /rs