Member since 
    
	
		
		
		02-23-2017
	
	
	
	
	
	
	
	
	
	
	
	
	
	
			
      
                34
            
            
                Posts
            
        
                15
            
            
                Kudos Received
            
        
                9
            
            
                Solutions
            
        My Accepted Solutions
| Title | Views | Posted | 
|---|---|---|
| 6214 | 02-28-2019 10:37 AM | |
| 9318 | 02-27-2019 03:34 PM | |
| 5253 | 01-09-2019 07:39 PM | |
| 5340 | 01-08-2019 10:46 AM | |
| 11110 | 04-03-2018 03:17 PM | 
			
    
	
		
		
		08-25-2021
	
		
		01:07 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 hi,adar:      if both the WAL segments and the CFiles are copied duing a tablet copy,then the follower tablet will  alse flushing wal data to disk when growing up to 8M,in my opinion  there has no difference between master tablet and follower tablet during the reading and writing,is that right? 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		02-18-2021
	
		
		06:03 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Please add kudu-spark2-tools_2.11-1.6.0.jar depending on your spark and scala version.  Use spark context instead of hive.  Apache Spark 2.3   Below is the code for your reference:  -------------------------------------  Read kudu table from pyspark with below code:     kuduDF = spark.read.format('org.apache.kudu.spark.kudu').option('kudu.master',"IP of master").option('kudu.table',"impala::TABLE: name").load()  kuduDF.show(5)     Write to kudu table with below code:     DF.write.format('org.apache.kudu.spark.kudu').option('kudu.master',"IP of master").option('kudu.table',"impala::TABLE: name").mode("append").save()  ----------------------------------------  Reference link: https://medium.com/@sciencecommitter/how-to-read-from-and-write-to-kudu-tables-in-pyspark-via-impala...     If in case you want to use Scala below is the reference link:  https://kudu.apache.org/docs/developing.html 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		11-28-2019
	
		
		12:24 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Kudu's HMS integration is a CDH 6.3 feature. If you would like to use it end-to-end, you'll have to use it with CDH 6.3.     That said, even without the Kudu's native HMS integration, in all versions of CDH, when using Impala "internal" or "managed" Kudu tables (see here for more details), Impala will create HMS metadata on behalf of Kudu. 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		11-18-2019
	
		
		12:51 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 I see all my Tablet Servers Healthy and the Summary by Table also shows them Healthy. Nothing in 'Recovering / Under-Replicated / Unavailable' 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		03-01-2019
	
		
		07:21 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Thanks again this seemed to have fixed the issue I am going through most of the tables as it looks like a few of them are still trying to talk to the old UUID.  
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		03-01-2019
	
		
		12:45 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 You have mentioned that NTP is not related to the problem. Let's consider this scenario:  1. Impala Daemon is working with the READ_AT_SNAPSHOT setting enabled. Impala daemon makes a read operation in Kudu. It sets the read timestamp T1 immediately after the preceiding write operation.  2. Kudu despatches the read request to some replica R1. This replica R1 is running on a machine with poorly configured NTP, so the local time on this machine is 1 minute behind.  3. The replica R1 waits for the timeout specified by '--safe_time_max_lag_ms': 30 seconds. After the timeout, the local time is still 30 seconds behind T1 (ideally).  Does this lead to the problem under discussion: 'Tablet is lagging too much to be able to serve snapshot scan'? 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		01-09-2019
	
		
		07:39 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
							 In terms of resources, five masters wouldn't strain the cluster much. The big change is that the state that lives on the master (e.g. catalog metadata, etc.) would need to be replicated with a replication factor of 5 in mind (i.e. at least 3 copies to be considered "written"). While this is possible, the recommended configuration is 3. It is the most well-tested and commonly-used. 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		12-10-2018
	
		
		10:35 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							This has come up a few times on mailing lists and on the Apache Kudu slack, so I'll post here too; it's worth noting that if you want a single-partition table, you can omit the PARTITION BY clause entirely. See the "Note" here: https://www.cloudera.com/documentation/enterprise/latest/topics/kudu_impala.html#concept_g51_5vk_ft
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		06-22-2018
	
		
		11:39 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 That error generally means that there may be another Kudu node already running using that directory. Do you perhaps have both a Kudu master and tablet server running on that machine? If so, you'll need to change your configurations so your directories don't overlap. 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		04-05-2018
	
		
		10:23 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
							 That is up to your workload and how much storage you need per node. It's common to see anywhere from 6 to 12 disks per tablet server. Check out the limitations documentation for some guidance there. 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		 
        







