Member since 10-08-2017 | 3 Posts | 0 Kudos Received | 0 Solutions
08-14-2022 05:05 PM

My OS is Windows 11 and my Apache Spark version is spark-3.1.3-bin-hadoop3.2. I am trying to use Spark Structured Streaming with PySpark. Below is my simple Spark Structured Streaming code:

    spark = SparkSession.builder.master("local[*]").appName(appName).getOrCreate()
    spark.sparkContext.setCheckpointDir("/C:/tmp")

The same code without the spark.sparkContext.setCheckpointDir line throws no errors on Ubuntu 22.04. However, the code above fails on Windows 11. The exception is:

    pyspark.sql.utils.IllegalArgumentException: Pathname /C:/tmp/67b1f386-1e71-4407-9713-fa749059191f from C:/tmp/67b1f386-1e71-4407-9713-fa749059191f is not a valid DFS filename.

I think the error means the checkpoint directory is being created on a Hadoop file system, as it would be on Linux, rather than on the Windows 11 local file system. My operating system is Windows, so the checkpoint directory should be a local Windows 11 directory. How can I configure the Apache Spark checkpoint to use a local Windows 11 directory? I tested with the URLs file:///C:/temp and hdfs://C:/temp, but the errors are still thrown.

Any reply would be appreciated.

Best regards
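For reference, here is a minimal sketch of how a local Windows path maps to a file: URI, using only the Python standard library. The helper name to_file_uri is hypothetical and not part of Spark. This only illustrates the URI form. Whether passing it to setCheckpointDir avoids the exception depends on the Hadoop configuration in use (for example the default file system setting), which is not shown in this post.

```python
from pathlib import PureWindowsPath


def to_file_uri(windows_path: str) -> str:
    """Convert an absolute Windows path into a file: URI.

    Hypothetical helper for illustration only. PureWindowsPath works on
    any OS because it only manipulates the path string and never touches
    the file system. as_uri() raises ValueError for relative paths.
    """
    return PureWindowsPath(windows_path).as_uri()


checkpoint_uri = to_file_uri("C:/tmp")
print(checkpoint_uri)  # file:///C:/tmp

# Assumption: with a SparkSession named `spark` as in the code above,
# the URI would be passed like this (not executed here):
# spark.sparkContext.setCheckpointDir(checkpoint_uri)
```

Note that the explicit file: scheme differs from the "/C:/tmp" string in the original call, which carries no scheme and so is resolved against whatever default file system is configured.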
						
					
Labels: Apache Spark