Member since 01-07-2020
64 Posts
1 Kudos Received
0 Solutions
            
        
			
    
	
		
		
11-03-2021 03:43 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi @balajip, I know how to create a UDF. My problem is that every time I restart Impala, the UDF is gone. Is there any way to keep the UDF after a restart, or do I have to create it every time?
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-19-2021 09:27 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@drgenious You didn't include a link to what you found that said "something about CDH", but based on your description I suspect that what you found was not about CDH (which stands for Cloudera's Distribution including Apache Hadoop) but about CDC, or change data capture. I will leave the question of how to copy data from an RDBMS such as MySQL and publish it to a Kafka topic for other members of the community to answer.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-12-2021 01:54 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hello @drgenious,

Please check the link below [0].

[0] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html#concept_gif_9en_yk
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-07-2021 10:45 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@dr If it's a managed table, you can get its size from the TABLE_PARAMS table in the Hive metastore database, e.g.:

SELECT a.TBL_NAME AS `TABLE`, b.PARAM_VALUE AS `SIZE` FROM TABLE_PARAMS b INNER JOIN TBLS a ON a.TBL_ID = b.TBL_ID WHERE b.PARAM_KEY = 'totalSize';

You can adapt the query as needed. However, for external tables, or if the table stats are not generated regularly, you might not get correct data.

You can also get the table size using HDFS file system commands:

hdfs dfs -du -s -h <path to the table location>

This will give you more accurate data.
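To compare several tables at once, the `hdfs dfs -du -s` approach above can be scripted. The following is a minimal Python sketch, not a definitive tool. It assumes the command is run without `-h` so sizes are raw bytes, that the first column is the size and the last column is the path (recent Hadoop releases print an extra middle column for space consumed including replicas), and that paths contain no spaces. The helper names and the sample paths are hypothetical.

```python
import subprocess


def parse_du_line(line):
    """Parse one line of `hdfs dfs -du -s <path>` output into (bytes, path).

    Assumes the first column is the size in bytes and the last column is
    the path; any middle column (space consumed with replication) is ignored.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"unexpected du output: {line!r}")
    return int(fields[0]), fields[-1]


def human_readable(num_bytes):
    """Format a byte count with binary units, e.g. 1536 -> '1.5 K'."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f} {unit}"
        size /= 1024


def table_size(path):
    """Run `hdfs dfs -du -s <path>` and return the size in bytes.

    Requires the `hdfs` client on PATH; it is not called at import time.
    """
    out = subprocess.run(
        ["hdfs", "dfs", "-du", "-s", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_du_line(out.strip().splitlines()[-1])[0]


# Parsing sample output lines (hypothetical paths); no cluster is needed here.
sample = [
    "1536  4608  /warehouse/tablespace/managed/hive/db1.db/t1",
    "1048576  /user/hive/warehouse/t2",
]
sizes = {path: size for size, path in map(parse_du_line, sample)}
```

On a host with the HDFS client installed, `human_readable(table_size("<path to the table location>"))` would report the size of a single table directory.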
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-28-2021 08:49 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
2019-09-18 08:44:10
2020-08-05 13:15:48
2020-08-05 13:24:00
2020-10-15 18:29:34
2020-09-09 09:35:04

This is already in ascending order. Check its format: year/mm/dd.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-24-2021 08:15 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
Hi @drgenious,

1) Where can I run these kinds of queries?
In CM -> Charts -> Chart Builder you can run a tsquery. Refer to this link:
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_dg_chart_time_series_data.html

2) Where can I find attributes like category and clusterName in Cloudera?
In the Chart Builder text bar, write an incomplete query such as:
SELECT get_file_info_rate
Below the text bar there is a Facets section. Click More and select any facets you want. For example, if you select clusterName, the clusterName is shown in the chart's title.
Then you can complete your tsquery:
SELECT get_file_info_rate where clusterName=xxxxx

If you want to build Impala-related charts, I suggest first reviewing CM > Impala service > Charts Library; many charts for common monitoring purposes are already there. You can open any of the existing charts to learn how to construct the tsquery and then build your own charts.

Another very good place to learn is CM > Charts > Chart Builder. On the right side you will see a "?" button; click it to see many examples, which you can run by clicking "try it".

Regards,
Will
If the answer helps, please accept as solution and click thumbs up.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
09-22-2021 10:54 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		2 Kudos
		
	
				
		
	
		
					
1. For the total memory configured, you can calculate (Impala daemon memory * total number of daemons). This value should also be displayed at the top of the Impala admission control page as the amount of memory allocated to Impala.

2. You can check other memory metrics in the cluster utilization report. Please note that memory consumed per pool is not currently captured in Impala metrics.

a) Max Allocated

Peak Allocation Time – The time when Impala reserved the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.

Max Allocated – The maximum memory that was reserved by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.

Utilized at the Time – The amount of memory used by Impala for running queries at the time when maximum memory was reserved. Click View Time Series Chart to view a chart of peak memory allocations.

Histogram of Allocated Memory at Peak Allocation Time – Distribution of memory reserved per Impala daemon for executing queries at the time Impala reserved the maximum memory. If some Impala daemons have reserved memory close to the configured limit, consider adding more physical memory to the hosts.

b) Max Utilized

Peak Usage Time – The time when Impala used the maximum amount of memory for queries. Click the drop-down list next to the date and time and select View Impala Queries Running at the Time to see details about the queries.

Max Utilized – The maximum memory that was used by Impala for executing queries. If the percentage is high, consider increasing the number of hosts in the cluster.

Reserved at the Time – The amount of memory reserved by Impala at the time when it was using the maximum memory for executing queries. Click View Time Series Chart to view a chart of peak memory utilization.

Histogram of Utilized Memory at Peak Usage Time – Distribution of memory used per Impala daemon for executing queries at the time Impala used the maximum memory. If some Impala daemons are using memory close to the configured limit, consider adding more physical memory to the hosts.

[1] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_cluster_util_custom.html#concept_jp4_4bh_hx
[2] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon.html
[3] https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/cm_metrics_impala_daemon_resource_pool.html
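As a worked example of the calculation in point 1, the arithmetic can be sketched in Python. The per-daemon memory limit, the daemon count, and the peak figure below are made-up numbers for illustration, not values from any real cluster, and the function names are hypothetical.

```python
def total_impala_memory_gib(per_daemon_gib, num_daemons):
    """Total memory configured for Impala: per-daemon limit times daemon count."""
    return per_daemon_gib * num_daemons


def percent_of_total(used_gib, total_gib):
    """Express a peak figure (e.g. Max Allocated) as a percentage of the total."""
    return 100.0 * used_gib / total_gib


# Hypothetical cluster: 10 Impala daemons, each with a 64 GiB memory limit.
total = total_impala_memory_gib(64, 10)   # 640 GiB
# Hypothetical peak reservation of 480 GiB read from the utilization report.
peak_pct = percent_of_total(480, total)   # 75.0
```

A high percentage here corresponds to the case where the report suggests adding hosts or memory.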
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-23-2021 04:07 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi,

Can you use beeline to run the command below and then recreate the table?

set parquet.column.index.access=false;

This should make Hive map the data in your files by column name instead of by the column index from your CREATE TABLE statement.
Hope this works for you.

Best Regards
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-09-2021 05:40 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi,

What query are you using to read the data from the table? Can you attach its "query profile" and the coordinator logs so we can have a look?

Regards,
Chethan YM
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
07-18-2021 08:37 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
All the Hive-related tables are stored under the "hive" database in MySQL.

You can take a MySQL dump of the hive database so that you can recover if this happens again in the future. You can use a command like:

mysqldump -u root -p hive

Reference: https://www.sqlshack.com/how-to-backup-and-restore-mysql-databases-using-the-mysqldump-command/
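If the backup should be taken regularly, the command above can be wrapped in a small script. The following Python sketch only assumes that `mysqldump` is on the PATH and that the metastore database is named `hive`, as in the post; the function names and the output file naming scheme are hypothetical. Because `-p` is given without a password, mysqldump prompts for it, so a fully unattended run would need credentials supplied another way, such as a MySQL option file.

```python
import subprocess
from datetime import date


def build_dump_command(database="hive", user="root"):
    """Return the mysqldump argument list used in the post."""
    return ["mysqldump", "-u", user, "-p", database]


def backup_name(database, day):
    """Hypothetical naming scheme: <database>_<YYYY-MM-DD>.sql."""
    return f"{database}_{day.isoformat()}.sql"


def dump_to_file(out_path, database="hive", user="root"):
    """Write the dump to out_path; raises CalledProcessError on failure."""
    with open(out_path, "w") as out:
        subprocess.run(build_dump_command(database, user), stdout=out, check=True)


# Example call (not executed here because it needs a running MySQL server):
# dump_to_file(backup_name("hive", date.today()))
```

A dump written this way can be loaded back with the mysql client, for example `mysql -u root -p hive < hive_2021-07-18.sql`.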
						
					