Member since 03-31-2017
26 Posts | 3 Kudos Received | 0 Solutions

02-02-2023 03:32 AM

@45, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also give you the opportunity to provide details specific to your environment, which will help others give you a more accurate answer. You can link this thread as a reference in your new post.
						
					
    
	
		
		
06-19-2019 04:04 PM

Hi vaccarinicarlo, in the Hadoop world, where different components may have different rules about case sensitivity, it may be best to do as Alex Behm said above: "It's just easier to accept one canonical casing." I agree with you that it might be better to issue more warnings when anything other than lower case is used.
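
To sketch the kind of warning I mean (a hypothetical helper in Scala, not anything Impala or Hive actually ships):

    // Hypothetical helper: canonicalize an identifier to lower case and
    // warn when the caller's casing differs from the canonical form.
    def canonicalize(identifier: String): String = {
      val lower = identifier.toLowerCase(java.util.Locale.ROOT)
      if (lower != identifier)
        Console.err.println(s"warning: identifier '$identifier' will be treated as '$lower'")
      lower
    }

    canonicalize("MyTable")  // warns, then returns "mytable"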
						
					
02-01-2019 12:28 PM

@David_Schwab it was my understanding that when submitting a job with a keytab, the Spark Application Master would periodically renew the ticket using the principal and keytab, as per:

https://www.cloudera.com/documentation/enterprise/5-15-x/topics/cm_sg_yarn_long_jobs.html

Could it be possible that the ticket refresh rate is longer than the maximum ticket lifetime?
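
For reference, a minimal sketch (in Scala, via SparkLauncher) of the kind of keytab-based submission that doc describes - the jar, main class, principal, and keytab paths below are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    // Passing a principal and keytab is what allows the YARN Application
    // Master to re-acquire Kerberos tickets for long-running jobs.
    val launcher = new SparkLauncher()
      .setMaster("yarn")
      .setDeployMode("cluster")
      .setAppResource("/path/to/app.jar")                   // placeholder jar
      .setMainClass("com.example.Main")                     // placeholder class
      .setConf("spark.yarn.principal", "user@EXAMPLE.COM")  // placeholder principal
      .setConf("spark.yarn.keytab", "/path/to/user.keytab") // placeholder keytab

    launcher.launch().waitFor()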
						
					
01-29-2019 09:13 AM

Note: the process below is easier if the node is a gateway node, since the correct Spark version and the configuration directories will already be available for mounting into the Docker container.

The quick and dirty way is to have an installation of Spark matching your cluster's major version installed or mounted in the Docker container. You will also need to mount the YARN and Hadoop configuration directories in the container; mounting these saves you from setting a ton of configuration at submission time, e.g.:

    "spark.hadoop.yarn.resourcemanager.hostname" -> "XXX"

Often both directories can be set to the same value: /opt/cloudera/parcels/SPARK2/lib/spark2/conf/yarn-conf.

The SPARK_CONF_DIR, HADOOP_CONF_DIR, and YARN_CONF_DIR environment variables need to be set if using spark-submit. If using SparkLauncher, they can be set like so:

    import scala.collection.JavaConverters._
    import org.apache.spark.launcher.SparkLauncher

    val env = Map(
      "HADOOP_CONF_DIR" -> "/example/hadoop/path",
      "YARN_CONF_DIR"   -> "/example/yarn/path"
    )
    val launcher = new SparkLauncher(env.asJava).setSparkHome("/path/to/mounted/spark")

If submitting to a kerberized cluster, the easiest way is to mount a keytab file and the /etc/krb5.conf file in the Docker container, then set the principal and keytab using spark.yarn.principal and spark.yarn.keytab, respectively.

For ports, 8032 on the Spark master (the YARN ResourceManager's external port) definitely needs to be open to traffic from the Docker node. I am not sure if this is the complete list of ports - could another user verify?
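Putting the above together, a rough end-to-end sketch of a launch from inside the container - every path, the main class, and the Kerberos settings are placeholders that assume the mounts described above:

    import scala.collection.JavaConverters._
    import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

    // Point both config variables at the mounted yarn-conf directory.
    val env = Map(
      "HADOOP_CONF_DIR" -> "/opt/cloudera/parcels/SPARK2/lib/spark2/conf/yarn-conf",
      "YARN_CONF_DIR"   -> "/opt/cloudera/parcels/SPARK2/lib/spark2/conf/yarn-conf"
    )

    val handle: SparkAppHandle = new SparkLauncher(env.asJava)
      .setSparkHome("/path/to/mounted/spark")               // placeholder mount
      .setMaster("yarn")
      .setDeployMode("cluster")
      .setAppResource("/path/to/app.jar")                   // placeholder jar
      .setMainClass("com.example.Main")                     // placeholder class
      .setConf("spark.yarn.principal", "user@EXAMPLE.COM")  // if kerberized
      .setConf("spark.yarn.keytab", "/path/to/user.keytab") // mounted keytab
      .startApplication()

    // Block until YARN reports a terminal state for the application.
    while (!handle.getState.isFinal) Thread.sleep(1000)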
						
					
06-19-2017 09:38 AM

Went ahead and downloaded a fresh .jar and followed the steps in the guide posted above - got it working! Appreciate the help.
						
					
03-31-2017 02:49 PM (1 Kudo)

Odd, hdfs dfs -ls etc. all seem to be working. As well, if I run through the "Getting Started" tutorial I don't seem to encounter any issues.

Regarding the IOException when attempting to write to disk (see below), is this just tied to the user behind Spark2 not having write privileges to that location?

Error summary: IOException: Mkdirs failed to create file:/home/cloudera/Documents/hail-workspace/source/out.vds/rdd.parquet/_temporary/0/_temporary/attempt_201703311444_0001_m_000000_3
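
For what it's worth, a quick way to sanity-check that theory - a sketch using the Hadoop FileSystem API against the parent directory from the error (the file: scheme means the write targets the local filesystem, not HDFS):

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    // The file: scheme in the error means Spark is writing to the local
    // filesystem, so the user running the job needs write access there.
    val path = new Path("file:/home/cloudera/Documents/hail-workspace/source")
    val fs = FileSystem.get(path.toUri, new Configuration())
    val status = fs.getFileStatus(path)
    println(s"owner=${status.getOwner} perms=${status.getPermission}")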
						
					