Member since 
    
	
		
		
		07-27-2016
	
	
	
	
	
	
	
	
	
	
	
	
	
	
			
      
                6
            
            
                Posts
            
        
                2
            
            
                Kudos Received
            
        
                0
            
            
                Solutions
            
        
			
    
	
		
		
		04-10-2018
	
		
		02:09 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 The complete steps to reset root password for sandbox is  1. start your VM  2. press f12 to go to grub menu  3.  press e  4. select the second line "kernel /vmlinuz-2.632 ...."  5. press again e  6.  add  s at end of line (<consoleblank=0 quiet s)and press enter.  7. press b which will log you in as root without password.  8. Change password using passwd. Be carrefull, you are in qwerty mode  9. press exit and press  entrer  
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-28-2016
	
		
		02:58 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 As workaround, the following insrt solves the inssue  hive> set hive.optimize.ppd = false;  thnks @Jitendra Yada 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-28-2016
	
		
		11:21 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
							 Hi Jitendra,  followed is  the complete processs and very explicite  pyspark program  >>>data = [('C1_1',None,'C3_1'), ('C1_2','C2_2',None),('C1_3',None,None),(None,None,'C3_4'),('C1_5','C2_5','C3_5')]   >>>df = sqlContext.createDataFrame(data, ['C1', 'C2','C3'])   >>>df.printSchema()   root    |-- C1: string (nullable = true)    |-- C2: string (nullable = true)    |-- C3: string (nullable = true)  >>>df.show()  
+---------+-------+-------+  
|  C1      |  C2     |  C3    |    +--------+-------+--------+   | C1_1  |  null  | C3_1 |   | C1_2  | C2_2|  null   |  
| C1_3|  null    |  null   |    | null   |  null   | C3_4  |   | C1_5| C2_5 | C3_5 |   +------+--------+-------+   >>>df.write.mode("overwrite").parquet('hdfs:/tmp/table1.parquet')   hdfs commands  $ hdfs dfs -ls /tmp/table1.parquet   Found 11 items   /tmp/table1.parquet/_SUCCESS
/tmp/table1.parquet/_common_metadata   /tmp/table1.parquet/_metadata   /tmp/table1.parquet/part-r-00000-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00001-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00002-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00003-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00004-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00005-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00006-3b1ecab2-1942-47eb-b393-c6988876d4a1.gz.parquet   /tmp/table1.parquet/part-r-00007-3b1ecab2-1942-47eb-b393-c6988876d4   Hive commands  hive>create external table table1 ( C1 string, C2 string, C3 string)
   stored as parquet 
   location 'hdfs:/tmp/table1.parquet';   OK
Time taken: 6.469 seconds   hive>describe table1;   OK   c1   string   c2   string   c3   string   Time taken: 1.455 seconds,   Fetched: 3 row(s)   hive> select * from table1;   OK   SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".   SLF4J: Defaulting to no-operation (NOP) logger implementation   SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.   C1_1    NULL    C3_1   C1_2    C2_2    NULL   C1_3    NULL    NULL   NULL    NULL C3_4   C1_5    C2_5    C3_5   Time taken: 0.355 seconds,   Fetched: 5 row(s)   hive> select * from table1 where C1='toto';   OK   Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Column [c1] was not found in schema!   Time taken: 0.277 seconds   hive> select * from table1 where c1='toto';   OK   Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Column [c1] was not found in schema!   Time taken: 0.096 seconds 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-28-2016
	
		
		10:11 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 yes, schema spark and schema hive are same.  also,  hive> describe table1  gives the three columns 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
		07-28-2016
	
		
		10:00 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
	
		1 Kudo
		
	
				
		
	
		
					
							 Followed is
the workflow  Hive>create external table if not exists table1 (
C1 string,
C2 string,
C3 int)    stored as parquet   location 'hdfs:/project/table1.parquet';  Parquet
data are created by spark as    df.write.mode("overwrite").parquet('hdfs:/project/table1.parquet')  Hive>select * from table1 where C1='toto';  OK  SLF4J:
Failed to load class "org.slf4j.impl.StaticLoggerBinder".  SLF4J:
Defaulting to no-operation (NOP) logger implementation  SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder
for further details.  Failed with exception java.io.IOException:java.lang.IllegalArgumentException:
Column [c1] was not found in schema!  Time taken: 0.254 seconds 
						
					
					... View more
				
			
			
			
			
			
			
			
			
			
		
		
			
				
						
							Labels:
						
						
		
			
	
					
			
		
	
	
	
	
				
		
	
	
- Labels:
- 
						
							
		
			Apache Hive
 
        






