Member since 04-08-2018

64 Posts
2 Kudos Received
2 Solutions

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 10113 | 05-04-2018 05:01 PM |
| | 17147 | 04-16-2018 10:13 AM |
			
    
	
		
		
05-07-2018 01:45 PM

I have just tested it. It worked fine! Thank you!
						
					
05-07-2018 01:30 PM

Yes, sure. Sorry, I was actually referring to "hdfs://eureambarimaster1.local.eurecat.org:8020/user/hdfs/test/df.parquet". Let me test it.
						
					
05-07-2018 01:05 PM

I think that this is the reason. If I log in as the hdfs user and run "hdfs dfs -chown -R centos /home/centos/test", it says that this directory does not exist. I created this directory as the hdfs user and then changed its owner to centos. Should I write the Parquet file to the full path?

df.coalesce(1).write.format("parquet").save("hdfs://eureambarimaster1.local.eurecat.org:8020/user/hdfs/test")
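
For what it is worth, here is a minimal sketch of how the fully qualified target could be assembled and checked from a shell. The host, port, and directory are simply the values quoted in this thread; the variable names are made up for illustration. The check at the end uses "hdfs dfs -test -d" and only runs where the hdfs client is installed.

```shell
# Values copied from the posts in this thread; adjust them for another cluster.
namenode="eureambarimaster1.local.eurecat.org"
port=8020
hdfs_dir="/user/hdfs/test"

# Fully qualified URI, so the path cannot be mistaken for a local one.
target="hdfs://${namenode}:${port}${hdfs_dir}/df.parquet"
echo "$target"

# Only talk to the cluster if the hdfs client is available on this machine.
if command -v hdfs >/dev/null 2>&1; then
  # Succeeds only if the directory exists in HDFS (not on the local disk).
  if hdfs dfs -test -d "$hdfs_dir"; then
    hdfs dfs -ls -d "$hdfs_dir"   # shows the owner, group, and mode in HDFS
  else
    echo "$hdfs_dir does not exist in HDFS"
  fi
fi
```

The same string can then be passed to ".save(...)" in the Spark job.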
						
					
05-07-2018 12:43 PM

Maybe the problem is that I run the Spark program in YARN cluster mode? That means the driver can run on any machine in the cluster. So should I run "chown -R centos:centos ..." on each machine, or use ".coalesce(1)"?
						
					
05-07-2018 12:37 PM

The output of "id":

uid=1000(centos) gid=1000(centos) groups=1000(centos),4(adm),10(wheel),190(systemd-journal)

I executed "chown -R centos:centos /home/centos/test", but I still get the same error:

18/05/07 12:06:28 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=centos, access=WRITE, inode="/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x

This is the output of "ls -la" executed in "/home/centos":

total 36236
drwx------.  4 centos centos     4096 May  7 12:34 .
drwxr-xr-x. 15 root   root       4096 Apr 16 18:41 ..
-rw-------.  1 centos centos    13781 May  7 11:26 .bash_history
-rw-r--r--.  1 centos centos       18 Mar  5  2015 .bash_logout
-rw-r--r--.  1 centos centos      193 Mar  5  2015 .bash_profile
-rw-r--r--.  1 centos centos      231 Mar  5  2015 .bashrc
-rw-rw-r--   1 centos centos       47 May  7 11:38 .scala_history
drwx------.  2 centos centos       46 May  2 07:57 .ssh
drwxrwxr-x   4 centos centos      144 May  7 11:42 test
						
					
05-07-2018 09:38 AM

I want to save a DataFrame to disk:

df.write.format("parquet").save("/home/centos/test/df.parquet")

I get the following error, which says that the user "centos" does not have write permission:

18/05/07 09:18:08 ERROR ApplicationMaster: User class threw exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=centos, access=WRITE, inode="/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x

This is how I run the spark-submit command:

spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 6g \
  --executor-cores 2 \
  --num-executors 2 \
  --executor-memory 4g \
  --class org.test.MyProcessor \
  mytest.jar
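
A note on reading that error: the inode field has the form "path":owner:group:mode. The sketch below only splits the string copied from the log line into those four parts and looks at the write bit for "other" users. The variable names are illustrative, and nothing here talks to the cluster.

```shell
# The inode field of the AccessControlException, copied from the log line.
inode_field='"/home/centos/test/df.parquet/_temporary/0":hdfs:hdfs:drwxr-xr-x'

# Split on ':' into path, owner, group, and mode.
IFS=':' read -r path owner group mode <<EOF
$inode_field
EOF

# Strip the double quotes around the path.
path=${path#\"}
path=${path%\"}

# In a mode string such as drwxr-xr-x, character 9 is the write bit for "other".
other_write=$(printf '%s' "$mode" | cut -c9)

echo "path=$path owner=$owner group=$group mode=$mode"
if [ "$other_write" = "w" ]; then
  echo "users other than $owner may write here"
else
  echo "no write bit for users outside owner $owner and group $group"
fi
```

Here the owner and group are both hdfs and the mode is drwxr-xr-x, so a user such as centos, who is neither the owner nor (per the "id" output in this thread) a member of group hdfs, has no write bit on that directory.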
						
					
- Labels:
  - Apache Hadoop
  - Apache Spark

05-07-2018 08:45 AM

I re-submitted the Spark job and now it works fine. The problem was that I had submitted the Spark job before changing the permissions.
						
					
05-07-2018 08:34 AM

Yes, sure. Please see the additional attached screenshots from the RM UI. Thanks.
						
					
05-07-2018 06:42 AM

However, an application with that ID does exist in the ResourceManager. Please see the attached screenshot.
						
					
05-07-2018 06:39 AM

Thank you. I did exactly what you suggested, but I still get the same error. The directory "/app-logs/centos" has owner centos and group hdfs:

18/05/07 06:36:36 INFO client.AHSProxy: Connecting to Application History server at eureambarislave1.local.eurecat.org/192.168.0.10:10200
File /app-logs/centos/logs-ifile/application_1525529485402_0020 does not exist.
File /app-logs/centos/logs/application_1525529485402_0020 does not exist.
Can not find any log file matching the pattern: [ALL] for the application: application_1525529485402_0020
Can not find the logs for the application: application_1525529485402_0020 with the appOwner: centos
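
For reference, a minimal sketch of how the two paths in that output can be rebuilt and inspected by hand. The application ID, owner, and the "/app-logs" root are taken from the output above; the variable names are made up for illustration. The cluster commands ("hdfs dfs -ls" and "yarn logs -applicationId ... -appOwner ...") only run where both clients are installed.

```shell
# Values copied from the output above.
app_id="application_1525529485402_0020"
app_owner="centos"
log_root="/app-logs"   # assumed aggregation root, as it appears in the output

# The two locations the client reported as missing.
ifile_dir="${log_root}/${app_owner}/logs-ifile/${app_id}"
plain_dir="${log_root}/${app_owner}/logs/${app_id}"
printf '%s\n' "$ifile_dir" "$plain_dir"

# Only query the cluster where the Hadoop clients are installed.
if command -v hdfs >/dev/null 2>&1 && command -v yarn >/dev/null 2>&1; then
  # What, if anything, has been aggregated for this user so far?
  hdfs dfs -ls "${log_root}/${app_owner}" || true
  # Ask for the logs again, naming the owner explicitly.
  yarn logs -applicationId "$app_id" -appOwner "$app_owner" || true
fi
```

If neither directory shows up in the listing, it may simply be that aggregation has not happened yet, since aggregated logs typically appear only after the application has finished.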
 
						
					