Member since 08-15-2019

29 Posts
111 Kudos Received
5 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 1998 | 09-08-2017 11:30 PM |
|  | 3185 | 06-08-2017 07:24 PM |
|  | 8384 | 03-28-2017 05:20 PM |
|  | 4459 | 03-17-2017 04:27 AM |
|  | 4388 | 03-09-2017 11:48 PM |
			
    
	
		
		
09-08-2024 10:36 PM
With Hive 2.2 or later, you can use MERGE INTO:

```sql
MERGE INTO target_table AS target
USING source_table AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET
        target.name = source.name,
        target.age = source.age
WHEN NOT MATCHED THEN
    INSERT (id, name, age)
    VALUES (source.id, source.name, source.age);
```
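Note that Hive's MERGE only works against a transactional (ACID) table. A minimal sketch of a qualifying target table, assuming ORC storage and that ACID is enabled on the cluster (the bucketing clause is required on Hive 2.x but not on Hive 3):

```sql
-- Hypothetical target table for the MERGE above: it must be ACID,
-- which in Hive 2.x means ORC, bucketed, with transactional=true.
CREATE TABLE target_table (
  id   INT,
  name STRING,
  age  INT
)
CLUSTERED BY (id) INTO 4 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional' = 'true');
```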
						
					
			
    
	
		
		
07-23-2020 07:28 AM
		
					
Run the msck repair command for the table you want to truncate, in the Hive shell:

```sql
hive> USE <database-name>;
hive> MSCK REPAIR TABLE <table-name>;
```

If it shows an error, rectify it first. In our case one of the partitions was missing, so we created that partition directory at the HDFS location and re-ran the msck repair command; after that it reported no issues. The truncate command then ran successfully:

```sql
hive> TRUNCATE TABLE <table-name>;
```

(Note: update the database and table names as per your requirement.)
						
					
    
	
		
		
03-13-2018 09:49 PM

5 Kudos
For the full stack trace of the error message from Hive, you could look into the HiveServer2 log, as @Slim mentioned. Since Hive Interactive is enabled, I believe you should look into the hsihiveserver log on the node where HiveServer2 Interactive is running.
						
					
    
	
		
		
09-08-2017 11:41 PM

1 Kudo
							 That explains it, thanks! 
						
					
    
	
		
		
07-23-2017 06:50 AM

4 Kudos
Consolidating below some of the errors thrown by the Spark Thrift Server during SQL execution that can be worked around by configuring certain parameters in spark-thrift-sparkconf.conf and hive-site.xml.

Error 1:

```
Join condition is missing or trivial. Use the CROSS JOIN syntax to allow cartesian products between these relations.
```

Resolution: `spark.sql.crossJoin.enabled: true`

Error 2:

```
Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "eval(Lorg/apache/spark/sql/catalyst/InternalRow;)Z" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate" grows beyond 64 KB
```

Resolution: `spark.sql.codegen.wholeStage: false`

Error 3:

```
java.lang.OutOfMemoryError: Java heap space
```

Resolution: `spark.driver.memory: 10g` (raise to a higher value) and `spark.sql.ui.retainedExecutions: 5` (lower to a smaller value)

Error 4:

```
org.apache.spark.SparkException: Exception thrown in awaitResult: (state=,code=0)
```

Resolution: `hive.metastore.try.direct.sql: false` (in hive-site.xml)

To enable heap-dump collection for the Spark driver and executors, useful when debugging out-of-memory errors:

```
spark.driver.extraJavaOptions: '-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>'
spark.executor.extraJavaOptions: '-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path-to-dump-file-location>'
```
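For illustration, the workarounds above could all be applied together in the Spark conf file. This is a sketch only: the exact file location and management (e.g. via Ambari) depend on your deployment, and the values shown are examples rather than recommendations.

```
# spark-thrift-sparkconf.conf (example fragment; tune values per workload)
spark.sql.crossJoin.enabled        true
spark.sql.codegen.wholeStage       false
spark.driver.memory                10g
spark.sql.ui.retainedExecutions    5
```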
						
					
    
	
		
		
07-13-2017 10:39 PM

2 Kudos
If the issue is that the number of rows is too high, starting Beeline with `beeline --incremental=true` will help.
						
					
    
	
		
		
06-29-2017 09:05 PM
Thanks @Kshitij Badani, I will configure a local user as I am about to try some Spark queries.
						
					
    
	
		
		
04-28-2017 11:12 PM

12 Kudos
Hive has an improved explain plan when hive.explain.user is set to true. The indentation and arrows now clearly depict the execution flow. Retrieving the join order used to require a lot of back-and-forth referencing in the old style; now it is as easy as following the arrows, even with the details of the mappers and reducers included.

Let us take a look at one sample explain plan and retrieve the join order:
 Plan optimized by CBO.
Vertex dependency in root stage
Map 2 <- Map 1 (BROADCAST_EDGE), Map 4 (BROADCAST_EDGE), Map 5 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
Reducer 3 <- Map 2 (SIMPLE_EDGE)
Stage-0
  Fetch Operator
    limit:-1
    Stage-1
      Reducer 3 vectorized, llap
      File Output Operator [FS_101]
        Group By Operator [GBY_100] (rows=1 width=736)
          Output:["_col0","_col1","_col2","_col3"],aggregations:["min(VALUE._col0)","min(VALUE._col1)","min(VALUE._col2)","min(VALUE._col3)"]
        <-Map 2 [SIMPLE_EDGE] llap
          SHUFFLE [RS_45]
            Group By Operator [GBY_44] (rows=1 width=736)
              Output:["_col0","_col1","_col2","_col3"],aggregations:["min(_col1)","min(_col3)","min(_col14)","min(_col19)"]
              Map Join Operator [MAPJOIN_92] (rows=80573328660391072 width=114)
                Conds:MAPJOIN_91._col11, _col7=RS_41._col0, _col0(Inner),Output:["_col1","_col3","_col14","_col19"]
              <-Map 9 [BROADCAST_EDGE] llap
                BROADCAST [RS_41]
                  PartitionCols:_col0, _col0
                  Select Operator [SEL_23] (rows=1960358 width=104)
                    Output:["_col0","_col1"]
                    Filter Operator [FIL_86] (rows=1960358 width=104)
                      predicate:id is not null
                      TableScan [TS_21] (rows=1960358 width=104)
                        default@title,t,Tbl:COMPLETE,Col:NONE,Output:["id","title"]
              <-Map Join Operator [MAPJOIN_91] (rows=73248479012737488 width=125)
                  Conds:MAPJOIN_90._col10=RS_38._col0(Inner),Output:["_col1","_col3","_col7","_col11","_col14"]
                <-Map 8 [BROADCAST_EDGE] llap
                  BROADCAST [RS_38]
                    PartitionCols:_col0
                    Select Operator [SEL_20] (rows=2 width=40)
                      Output:["_col0"]
                      Filter Operator [FIL_85] (rows=2 width=40)
                        predicate:((UDFToString(role) = 'actress') and id is not null)
                        TableScan [TS_18] (rows=4 width=40)
                          default@role_type,rt,Tbl:COMPLETE,Col:NONE,Output:["id","role"]
                <-Map Join Operator [MAPJOIN_90] (rows=66589524931926816 width=138)
                    Conds:MAPJOIN_89._col0, _col6=RS_35._col0, _col0(Inner),Output:["_col1","_col3","_col7","_col10","_col11","_col14"]
                  <-Map 7 [BROADCAST_EDGE] llap
                    BROADCAST [RS_35]
                      PartitionCols:_col0, _col0
                      Select Operator [SEL_17] (rows=764741 width=105)
                        Output:["_col0","_col1"]
                        Filter Operator [FIL_84] (rows=764741 width=105)
                          predicate:((UDFToString(gender) = 'f') and (name like '%An%') and id is not null)
                          TableScan [TS_15] (rows=3058966 width=105)
                            default@name,n,Tbl:COMPLETE,Col:NONE,Output:["id","name","gender"]
                  <-Map Join Operator [MAPJOIN_89] (rows=60535930444216152 width=152)
                      Conds:MAPJOIN_88._col7, _col4=RS_32._col0, _col1(Inner),Output:["_col0","_col1","_col3","_col6","_col7","_col10","_col11"]
                    <-Map 6 [BROADCAST_EDGE] llap
                      BROADCAST [RS_32]
                        PartitionCols:_col0, _col1
                        Select Operator [SEL_14] (rows=11636372 width=8)
                          Output:["_col0","_col1"]
                          Filter Operator [FIL_83] (rows=11636372 width=8)
                            predicate:(movie_id is not null and company_id is not null)
                            TableScan [TS_12] (rows=11636372 width=8)
                              default@movie_companies,mc,Tbl:COMPLETE,Col:NONE,Output:["movie_id","company_id"]
                    <-Map Join Operator [MAPJOIN_88] (rows=55032662847395656 width=167)
                        Conds:MAPJOIN_87._col0, _col2=RS_29._col0, _col2(Inner),Output:["_col0","_col1","_col3","_col4","_col6","_col7","_col10"]
                      <-Map 4 [BROADCAST_EDGE] llap
                        BROADCAST [RS_29]
                          PartitionCols:_col0, _col2
                          Select Operator [SEL_8] (rows=6110483 width=116)
                            Output:["_col0","_col1","_col2","_col4"]
                            Filter Operator [FIL_81] (rows=6110483 width=116)
                              predicate:((note) IN ('(voice)', '(voice: Japanese version)', '(voice) (uncredited)', '(voice: English version)') and person_id is not null and person_role_id is not null and movie_id is not null and role_id is not null)
                              TableScan [TS_6] (rows=12220966 width=116)
                                default@cast_info,ci,Tbl:COMPLETE,Col:NONE,Output:["person_id","movie_id","person_role_id","note","role_id"]
                      <-Map Join Operator [MAPJOIN_87] (rows=50029692413268030 width=184)
                          Conds:(Inner),(Inner),Output:["_col0","_col1","_col2","_col3","_col4"]
                        <-Map 1 [BROADCAST_EDGE] llap
                          BROADCAST [RS_24]
                            Select Operator [SEL_2] (rows=701938 width=104)
                              Output:["_col0","_col1"]
                              Filter Operator [FIL_79] (rows=701938 width=104)
                                predicate:person_id is not null
                                TableScan [TS_0] (rows=701938 width=104)
                                  default@aka_name,an,Tbl:COMPLETE,Col:NONE,Output:["person_id","name"]
                        <-Map 5 [BROADCAST_EDGE] llap
                          BROADCAST [RS_26]
                            Select Operator [SEL_11] (rows=34365 width=259)
                              Output:["_col0"]
                              Filter Operator [FIL_82] (rows=34365 width=259)
                                predicate:((UDFToString(country_code) = '[us]') and id is not null)
                                TableScan [TS_9] (rows=68731 width=259)
                                  default@company_name,cn,Tbl:COMPLETE,Col:NONE,Output:["id","country_code"]
                        <-Select Operator [SEL_5] (rows=2074019 width=104)
                            Output:["_col0","_col1"]
                            Filter Operator [FIL_80] (rows=2074019 width=104)
                              predicate:id is not null
                              TableScan [TS_3] (rows=2074019 width=104)
                                default@char_name,chn,Tbl:COMPLETE,Col:NONE,Output:["id","name"]
  
1. Start reading from the bottom.
2. Look for the keyword 'TableScan'; the row below it has the details of the tables being joined.
3. Look for the keyword 'Join Operator'; the indentation of this line tells us which tables are part of this join.
4. A joined table group becomes a candidate for another join when the line starts with '<-'.
5. Keep going upward in the same way to get the complete join order.

The join order in the above example boils down to (title(t),(((movie_companies(mc),(cast_info(ci),(company_name(cn),aka_name(an),char_name(chn)))),name(n)),role_type(rt)))
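Step 2 above can be mechanized. Below is a small, hypothetical Python sketch (not from the original post) that pulls the scanned table names out of an explain-plan dump by looking for the `TableScan` keyword and parsing the `default@<table>,<alias>` line beneath it:

```python
import re

def scanned_tables(plan: str):
    """Return (table, alias) pairs for every TableScan in an
    explain-plan dump, in the order they appear in the text."""
    pairs = []
    lines = plan.splitlines()
    for i, line in enumerate(lines):
        if "TableScan" in line and i + 1 < len(lines):
            # The line below a TableScan looks like:
            #   default@title,t,Tbl:COMPLETE,Col:NONE,Output:[...]
            m = re.search(r"(\w+)@(\w+),(\w+),", lines[i + 1])
            if m:
                pairs.append((m.group(2), m.group(3)))
    return pairs

plan = """\
TableScan [TS_21] (rows=1960358 width=104)
  default@title,t,Tbl:COMPLETE,Col:NONE,Output:["id","title"]
TableScan [TS_18] (rows=4 width=40)
  default@role_type,rt,Tbl:COMPLETE,Col:NONE,Output:["id","role"]
"""
print(scanned_tables(plan))  # [('title', 't'), ('role_type', 'rt')]
```

This only recovers which tables are scanned; steps 3-5 (grouping scans into the nested join order) still require following the `<-` indentation by hand.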
						
					
    
	
		
		
04-28-2017 05:10 PM

7 Kudos
It looks like the data was removed from HDFS, but no drop-partition query was executed on the Hive side. The output of the `show partitions <table_name>` command comes from the Hive metastore DB, which holds the table's partition metadata but not the actual data; the data itself is stored in HDFS. As for the data missing from HDFS, it might be worth checking hdfs-audit.log to trace the lifecycle of the directory/file.
						
					
    
	
		
		
02-22-2019 10:32 PM
I am also facing the same issue in HDP 3... has it been resolved?
						
					