Member since 10-28-2020
622 Posts
47 Kudos Received
40 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2189 | 02-17-2025 06:54 AM |
| | 6922 | 07-23-2024 11:49 PM |
| | 1418 | 05-28-2024 11:06 AM |
| | 1976 | 05-05-2024 01:27 PM |
| | 1303 | 05-05-2024 01:09 PM |
			
    
	
		
		
04-11-2024 05:48 AM
JDBC/ODBC drivers for Hive can be downloaded from the Cloudera website.

First, collect the Hive endpoint from the Cloudera Management Console. It is shown at the bottom of the specific Data Hub window, in the following format:

jdbc:hive2://datahub1-master0.geo-1035.lskx-pvue.a4.cloudera.site/;ssl=true;transportMode=http;httpPath=datahub1/cdp-proxy-api/hive

Then fill in the following fields in the driver configuration:

1. Host(s): datahub1-master0.geo-1035.lskx-pvue.a4.cloudera.site
2. Port: 443
3. Authentication Mechanism: Username/Password
4. Thrift Transport: HTTP
5. Go to HTTP Options and set HTTP Path: datahub1/cdp-proxy-api/hive (this value comes from the Hive endpoint)
6. Go to SSL Options:
   6.1. Check Enable SSL.
   6.2. Check the Allow Self-signed Server Certificate check box.
   6.3. Trusted Certificates: select the path to the PEM file containing the root CA certificate of the Knox gateway.

Note: You can download the TLS public certificate from Data Hub > Token Integration > TLS Public Certificate > Download the PEM file.

Save and test the connection.
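The mapping from the Hive endpoint to the driver fields above can be sketched in Python. This is only an illustration: the function name `hive_endpoint_to_odbc_fields` is hypothetical, and the fallback to port 443 when the URL carries no port is an assumption taken from the value used in the steps above.

```python
def hive_endpoint_to_odbc_fields(endpoint: str) -> dict:
    """Split a jdbc:hive2:// endpoint into the values the driver dialog asks for."""
    prefix = "jdbc:hive2://"
    if not endpoint.startswith(prefix):
        raise ValueError("expected an endpoint starting with jdbc:hive2://")

    # Everything up to the first "/" is host[:port]; the rest holds ;key=value pairs.
    authority, _, tail = endpoint[len(prefix):].partition("/")
    host, _, port = authority.partition(":")

    params = {}
    for part in tail.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            params[key] = value

    return {
        "Host(s)": host,
        # Assumption: default to 443 (as in the steps above) when no port is given.
        "Port": int(port) if port else 443,
        "Thrift Transport": params.get("transportMode", "").upper(),
        "HTTP Path": params.get("httpPath", ""),
        "Enable SSL": params.get("ssl") == "true",
    }


fields = hive_endpoint_to_odbc_fields(
    "jdbc:hive2://datahub1-master0.geo-1035.lskx-pvue.a4.cloudera.site/"
    ";ssl=true;transportMode=http;httpPath=datahub1/cdp-proxy-api/hive"
)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The authentication mechanism and the trusted certificate path are not part of the endpoint, so they still have to be set by hand.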
						
					
03-28-2024 09:51 AM
1 Kudo
@hegdemahendra You may try the Cloudera Hive JDBC driver. The driver class name is "com.cloudera.hive.jdbc.HS2Driver".
						
					
03-20-2024 03:54 AM
2 Kudos
@Choolake See if this does the job for you:

...
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
WITH SERDEPROPERTIES (
    'separatorChar' = '|',
    'quoteChar' = '"'
)
STORED AS TEXTFILE LOCATION ....

This is a third-party SerDe. You can download it from https://code.google.com/archive/p/hive-json-serde/downloads
						
					
03-20-2024 02:59 AM
1 Kudo
@ZainK This can happen for various reasons, such as resource constraints or errors in the code. Check those aspects first.
						
					
03-17-2024 09:40 AM
@Hadoop16 Was it working before? Did anything change on the Kerberos side? Try regenerating the hive keytab file and see if it helps.
						
					
03-14-2024 11:58 PM
1 Kudo
Hive does use stats from an external table when preparing the query plan. When the stats are accurate, it can estimate the size of intermediate data sets and select efficient join strategies. The only thing I noticed is that the fetch task is not working.
						
					
03-14-2024 06:29 AM
Try this:

CREATE EXTERNAL TABLE mytable (
    col1 INT,
    col2 STRING,
    col3 STRING,
    col4 STRING,
    col5 INT,
    col6 STRING,
    col7 STRING
    ...
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
    "separatorChar" = ",",
    "quoteChar"     = "\""
)
STORED AS TEXTFILE;

It should work.
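To see what `separatorChar` and `quoteChar` are meant to achieve, here is a rough Python sketch. It uses Python's own `csv` module as a stand-in, not the SerDe itself, so escaping and other edge cases may differ from what Hive does; the helper name `parse_quoted_csv_line` and the sample line are made up for illustration.

```python
import csv
import io


def parse_quoted_csv_line(line, separator=",", quote='"'):
    """Split one line on the separator, keeping separators inside quotes."""
    reader = csv.reader(io.StringIO(line), delimiter=separator, quotechar=quote)
    return next(reader)


# The comma inside the quoted field does not start a new column.
print(parse_quoted_csv_line('1,"Smith, John",NY'))
```

The point of the quote character is that a field such as "Smith, John" stays in one column even though it contains the separator.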
						
					
03-12-2024 11:56 PM
1 Kudo
@yashwanth It seems you want to split columns based on the position of the commas. In that case, you may create the table as follows:

CREATE TABLE my_table (
  col1 STRING,
  col2 INT,
  ...
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',' ...
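As a rough mental model of purely positional splitting, the Python sketch below cuts a line at every comma and casts each piece to the declared column type. It is an approximation for illustration only: the function name `split_delimited` and the sample lines are hypothetical, and Hive's own type conversion may differ in detail.

```python
def split_delimited(line, column_types, delimiter=","):
    """Split on every delimiter and cast each field by its position."""
    raw = line.split(delimiter)
    row = []
    for i, col_type in enumerate(column_types):
        # Missing trailing fields become None (NULL); extra fields are ignored.
        value = raw[i] if i < len(raw) else None
        if value is not None and col_type == "INT":
            try:
                value = int(value)
            except ValueError:
                value = None  # a value that is not a number is read as NULL
        row.append(value)
    return row


print(split_delimited("alice,42", ["STRING", "INT"]))
# Quotes are not special here, so a comma inside quotes still splits the field.
print(split_delimited('"Smith, John",42', ["STRING", "INT"]))
```

Because quotes carry no meaning in this model, positional splitting suits data where a comma never appears inside a value.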
						
					
03-12-2024 07:46 AM
@Leopold It is disabled for external tables because the data in HDFS can change without Hive knowing about it. Unfortunately, I do not see a way to force a fetch task for a query with an aggregate function.
						
					
03-12-2024 04:00 AM
1 Kudo
@Leopold I just checked. Your observation is correct. For external tables, it does not use a fetch task. In the logs, I see the following message:

2024-03-12 10:48:37,247 INFO  org.apache.hadoop.hive.ql.optimizer.StatsOptimizer: [b226e7aa-9a42-4af3-b99b-be4a6592fb7f HiveServer2-Handler-Pool: Thread-31145]: Table t7 is external. Skip StatsOptimizer.

But enabling "hive.fetch.task.aggr=true" will help avoid the Reducer phase that is used for final aggregation. It will be a Map-only job.
						
					