Member since 08-05-2018
73 Posts
0 Kudos Received
0 Solutions
			
    
	
		
		
		01-02-2024
	
		
		03:09 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
To save a DataFrame as a text file in PySpark, either use the DataFrame writer or convert the DataFrame to an RDD first.

Using the DataFrame writer (the text format expects a DataFrame with a single string column):

    df.write.format("text").save("path_to_output_directory")

Converting to an RDD and then using saveAsTextFile:

    rdd = df.rdd.map(lambda row: str(row))
    rdd.saveAsTextFile("path_to_output_directory")
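A minimal sketch of both approaches, assuming `df` is an existing PySpark DataFrame. Calling `str(row)` writes lines such as `Row(id=1, name='a')`, so this version joins the row values with a separator instead. The helper names (`row_to_line`, `save_rows_as_text`, `save_with_writer`), the tab separator, and the choice to write nulls as empty fields are illustrative assumptions, not part of the answer above.

```python
def row_to_line(row, sep="\t"):
    """Join one row's values into a single line; None becomes an empty field."""
    return sep.join("" if value is None else str(value) for value in row)


def save_rows_as_text(df, path, sep="\t"):
    """RDD route: write each row as one delimited line under `path`.

    Assumes `df` is a pyspark.sql.DataFrame and that `path` does not exist
    yet, because saveAsTextFile fails on an existing output directory.
    """
    df.rdd.map(lambda row: row_to_line(row, sep)).saveAsTextFile(path)


def save_with_writer(df, path, sep="\t"):
    """Writer route: collapse all columns into one string column, then write.

    Assumes plain column names (no dots or other special characters).
    """
    # Imported here so row_to_line can be used without Spark installed.
    from pyspark.sql import functions as F

    fields = [F.coalesce(F.col(c).cast("string"), F.lit("")) for c in df.columns]
    single_column = df.select(F.concat_ws(sep, *fields).alias("value"))
    # mode("overwrite") replaces any existing output directory at `path`.
    single_column.write.mode("overwrite").text(path)
```

With either function, `path` ends up as a directory of part files rather than one file. The writer route is the one that supports overwriting through `mode("overwrite")`.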
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-15-2020 12:07 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@shubh As this is an older post that was marked solved in 2018, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to provide details specific to your environment that could help others give a more accurate answer to your question.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-21-2018 06:28 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi Shu, thanks for responding. The solution you provided seems a little complicated for something that I thought would be relatively simple. I will try your solution and let you know how I get on. In the meantime, have you seen the solution provided here: https://forums.databricks.com/questions/2848/how-do-i-create-a-single-csv-file-from-multiple-pa.html?childToView=12091
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
10-19-2018 10:13 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Hi guys, I'm sorry if the question seems a little confusing. Basically, I would just like to be able to save to a single file and have that file overwritten each time it is saved. Thanks
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
08-13-2018 07:26 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
@Carlton Patterson It looks like you have accepted another comment. I've made this reply a comment, and this should be the correct one to accept, as it helped resolve your issue. 🙂
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-02-2018 09:58 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 I assume this returns a limited result set, though, for large tables? 
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
06-14-2018 07:26 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
This could be because you are parsing actual data in place of the header, assuming your first row holds the header and the data starts on the second row. In that case, the data (int, string) can't be parsed as a header (string). So try changing it to ("skip.header.line.count"="1"); Hope this helps.
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-01-2018 04:26 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Hi rtrivedi,  I added the additional code as suggested, but I get the following error:   org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 2:0 cannot recognize input near 'set' 'hive' '.' in statement  
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
01-28-2018 03:41 AM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
Take a look at this guide:

https://cwiki.apache.org/confluence/display/hive/languagemanual+dml#LanguageManualDML-Loadingfilesintotables

You should try either

    INSERT INTO TABLE ${hiveconf:inputtable} SELECT * FROM datafactory7 LIMIT 14;

or

    LOAD DATA INPATH '<HDFS PATH WHERE FILES LOCATED>' INTO TABLE ${hiveconf:inputtable};
						
					
				
			
			
			
			
			
			
			
			
			
		
			
    
	
		
		
02-01-2018 10:11 PM
	
	
	
	
	
	
	
	
	
	
	
	
	
	
		
	
				
		
			
					
				
		
	
		
					
							 Hi Jay, can you please let me know why I'm suddenly not able to access the Sandbox on port 2222? I was able before, but now I can't. 
						
					