Member since 10-10-2017
      
13 Posts · 0 Kudos Received · 0 Solutions
			
    
	
		
		
04-05-2019 08:37 PM

Any insights on this?
    
	
		
		
03-29-2019 06:26 PM
I am using the S3A staging directory committer with my own object store (not AWS), trying to upload a large directory (~50 TB) from a Hortonworks Hadoop client. The upload runs as a MapReduce job: the staging committer starts tasks that perform multipart upload (MPU) operations against the object store, and they are all committed during the job-commit phase.

Problem and questions:

1. MapReduce logs as seen on the Hadoop client. All task commits complete successfully, but the S3A committer's job-commit phase fails with:

```
INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to the job history server
INFO mapreduce.Job: map 0% reduce 100%
INFO mapreduce.Job: Job job_1553199983818_0003 failed with state FAILED due to Job commit from a prior MRAppMaster attempt is potentially in progress. Preventing multiple commit executions
```

Is this a warning only? None of my tasks failed according to the ResourceManager.

2. S3A committer error logs during the job-commit phase, after which it started deleting the files. I can't tell from the logs below why the job commit failed:

```
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: OutputCommitter is org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Attempt num: 2 is last retry: true because a commit was started.
INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.job.event.JobEventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$NoopEventHandler
INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.jobhistory.EventType for class org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler
INFO [main] org.apache.hadoop.yarn.event.AsyncDispatcher: Registering class org.apache.hadoop.mapreduce.v2.app.rm.ContainerAllocator$EventType for class org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
INFO [main] org.apache.hadoop.mapreduce.v2.jobhistory.JobHistoryUtils: Default file system [hdfs://xxxxxx:8020]
INFO [main] org.apache.hadoop.mapreduce.jobhistory.JobHistoryEventHandler: Emitting job history data to the timeline server is not enabled
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Not attempting to recover. Recovery is not supported by class org.apache.hadoop.fs.s3a.commit.staging.DirectoryStagingCommitter. Use an OutputCommitter that supports recovery.
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Previous history file is at hdfs://xxxxxx:8020/user/xxxxxx/.staging/job_1553199983818_0003/job_1553199983818_0003_1.jhist
INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Starting to clean up previous job's temporary files
INFO [main] org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter: Task committer attempt_1553199983818_0003_m_000000_0: aborting job job_1553199983818_0003 in state FAILED
INFO [main] org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter: Starting: Task committer attempt_1553199983818_0003_m_000000_0: aborting job in state job_1553199983818_0003
INFO [main] org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter: Task committer attempt_1553199983818_0003_m_000000_0: no pending commits to abort
INFO [main] org.apache.hadoop.fs.s3a.commit.staging.StagingCommitter: Task committer attempt_1553199983818_0003_m_000000_0: aborting job in state job_1553199983818_0003 : duration 0:00.007s
INFO [main] org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter: Starting: Cleanup job job_1553199983818_0003
INFO [main] org.apache.hadoop.fs.s3a.commit.AbstractS3ACommitter: Starting: Aborting all pending commits under s3a://xxxxxx/user/xxxxxx/30000
```

The YARN logs don't say much about why the S3A job committer failed. Any insights into what I can look at to figure this out?

MRAppMaster code where the messages come from:

```java
try {
  String user = UserGroupInformation.getCurrentUser().getShortUserName();
  Path stagingDir = MRApps.getStagingAreaDir(conf, user);
  FileSystem fs = getFileSystem(conf);

  boolean stagingExists = fs.exists(stagingDir);
  Path startCommitFile = MRApps.getStartJobCommitFile(conf, user, jobId);
  boolean commitStarted = fs.exists(startCommitFile);
  Path endCommitSuccessFile = MRApps.getEndJobCommitSuccessFile(conf, user, jobId);
  boolean commitSuccess = fs.exists(endCommitSuccessFile);
  Path endCommitFailureFile = MRApps.getEndJobCommitFailureFile(conf, user, jobId);
  boolean commitFailure = fs.exists(endCommitFailureFile);
  if (!stagingExists) {
    isLastAMRetry = true;
    LOG.info("Attempt num: " + appAttemptID.getAttemptId() +
        " is last retry: " + isLastAMRetry +
        " because the staging dir doesn't exist.");
    errorHappenedShutDown = true;
    forcedState = JobStateInternal.ERROR;
    shutDownMessage = "Staging dir does not exist " + stagingDir;
    LOG.fatal(shutDownMessage);
  } else if (commitStarted) {
    // A commit was started, so this is the last attempt; we just need to know
    // what result to notify with, and how to unregister.
    errorHappenedShutDown = true;
    isLastAMRetry = true;
    LOG.info("Attempt num: " + appAttemptID.getAttemptId() +
        " is last retry: " + isLastAMRetry +
        " because a commit was started.");
    copyHistory = true;
    if (commitSuccess) {
      shutDownMessage =
          "Job commit succeeded in a prior MRAppMaster attempt " +
          "before it crashed. Recovering.";
      forcedState = JobStateInternal.SUCCEEDED;
    } else if (commitFailure) {
      shutDownMessage =
          "Job commit failed in a prior MRAppMaster attempt " +
          "before it crashed. Not retrying.";
      forcedState = JobStateInternal.FAILED;
    } else {
      if (isCommitJobRepeatable()) {
        // Clean up previous half-done commits if the committer supports
        // repeatable job commit.
        errorHappenedShutDown = false;
        cleanupInterruptedCommit(conf, fs, startCommitFile);
      } else {
        // The commit is still pending: commit error.
        shutDownMessage =
            "Job commit from a prior MRAppMaster attempt is " +
            "potentially in progress. Preventing multiple commit executions";
        forcedState = JobStateInternal.ERROR;
      }
    }
  }
} catch (IOException e) {
  throw new YarnRuntimeException("Error while initializing", e);
}
```
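For readers tracing the same failure: the branch above amounts to a small decision table keyed on the commit marker files the restarted AM finds in the staging directory. A minimal Python paraphrase of that logic (my own sketch for readability, not Hadoop code; the return strings are shorthand labels, not real Hadoop states):

```python
def am_restart_outcome(staging_exists: bool, commit_started: bool,
                       commit_success: bool, commit_failure: bool,
                       commit_repeatable: bool) -> str:
    """What a restarted MRAppMaster decides based on the marker files it
    finds (or does not find) under the job's staging directory."""
    if not staging_exists:
        return "ERROR: staging dir does not exist"
    if not commit_started:
        return "RECOVER_OR_RERUN"           # normal recovery/re-run path
    if commit_success:
        return "SUCCEEDED"                  # prior attempt committed, then crashed
    if commit_failure:
        return "FAILED"                     # prior attempt's commit failed, not retrying
    if commit_repeatable:
        return "CLEANUP_AND_RETRY_COMMIT"   # committer supports repeatable job commit
    # Commit started but never finished, and the committer cannot repeat it:
    return "ERROR: Preventing multiple commit executions"

# The situation in the logs above: attempt 2, a commit was started,
# no success/failure marker, and the committer is not repeatable.
print(am_restart_outcome(True, True, False, False, False))
```

Reading the logs through this table, attempt 2 aborts immediately because the first AM attempt died mid-commit, leaving only the start marker behind, and (per the "Recovery is not supported" line) the staging committer cannot repeat a half-done job commit, so the second attempt refuses to re-run it.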
						
					
Labels: Apache Hadoop
    
	
		
		
10-23-2018 07:50 PM

@Soumitra Sulav Today I redeployed my HDP cluster, and it is now working with both of the methods you shared above. I am not sure why it wasn't working with the previous setup; it looks like an intermittent issue. I will keep you posted if I run into it again. Thanks for all your help with this.
						
					
			
    
	
		
		
10-22-2018 08:57 PM

@Soumitra Sulav While fetching logs from the YARN ResourceManager web UI on port 8088 in a Kerberized cluster, it fails with an authentication error (HTTP 401, Unauthorized). I am using Chrome and am not sure how to make the web UI validate my Kerberos ticket. Any suggestions?
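A command-line sanity check (a sketch; the realm and ResourceManager host below are placeholders, not my real values) to confirm SPNEGO works with the ticket outside the browser:

```
# Obtain a Kerberos ticket, then have curl perform SPNEGO (--negotiate)
# against the RM REST API; "-u :" tells curl to take credentials from the ticket cache.
kinit myuser@EXAMPLE.REALM
curl --negotiate -u : "http://rm-host.example.com:8088/ws/v1/cluster/info"
```

If this returns cluster info while Chrome still gets a 401, the ticket itself is fine and what remains is browser configuration (Chrome has to be told which servers it is allowed to negotiate Kerberos with).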
						
					
		
			
    
	
		
		
10-09-2018 11:38 PM

@Soumitra Sulav I tried both Method #1 and Method #2 today and have attached the logs below; the logs are the same for both methods. It no longer complains about AWS, but it is still failing.

I verified the JCEKS setup by simply running an -ls command as that user, and also ran what you suggested in the comment above; both worked. Just to add: the cluster is Kerberized.

Logs:

```
0: jdbc:hive2://nile3-vm7.centera.lab.test.com> CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1';
INFO  : Compiling command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f): CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1'
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO  : Completed compiling command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f); Time taken: 0.055 seconds
INFO  : Executing command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f): CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1'
INFO  : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException)
INFO  : Completed executing command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f); Time taken: 0.318 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException) (state=08S01,code=1)
```
 
						
					
		
			
    
	
		
		
10-08-2018 03:38 PM

@Soumitra Sulav I tried Method 1, i.e. I added

```
fs.s3a.bucket.s3aTestBucket.security.credential.provider.path=jceks://hdfs@nile3-vm6.centra.lab.test.com:8020/user/test/s3a.jceks
```

and restarted HDFS from Ambari, but it doesn't seem to have worked. Any suggestions? Please find the logs below.

I didn't try Method 2, as it would expose my credentials in the Ambari UI.

Logs:

```
0: jdbc:hive2://nile3-vm7.centra.lab.test.com> CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3';
INFO : Compiling command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173): CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173); Time taken: 230.585 seconds
INFO : Executing command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173): CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3'
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
INFO : Completed executing command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173); Time taken: 115.487 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint (state=08S01,code=1)
```
						
					
		
			
    
	
		
		
10-08-2018 08:14 AM

I am using a non-AWS endpoint for S3A and have a basic issue: Hive is not honoring the s3a endpoint when it isn't AWS. distcp, hadoop fs, Spark, and MapReduce jobs all find my s3a endpoint and complete successfully, but Hive ignores it and expects AWS S3 credentials, as seen in the example below.

I tried three options, and the error was the same for all three:

```
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
INFO : Completed executing command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6); Time taken: 116.608 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint (state=08S01,code=1)
```

Option 1: Ran the CREATE DATABASE command, passing my S3 credentials via JCEKS configured in the HDFS core-site.xml:

```
hadoop.security.credential.provider.path=jceks://hdfs@nile3-vm6.centera.lab.emc.com:8020/user/test/s3a.jceks
```

Running a Hive query:

```
0: jdbc:hive2://nile3-vm7.centera.lab.emc.com> CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1';
INFO : Compiling command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6): CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6); Time taken: 230.907 seconds
INFO : Executing command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6): CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1'
INFO : Starting task [Stage-0:DDL] in serial mode
```

Option 2: Passing user:s3-key in the URL while creating a database. I also tried CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3-user:s3-secret-key@s3aTestBucket/user/table1'; but it didn't work.

Option 3: Added the property below to hive-site:

```
hive.security.authorization.sqlstd.confwhitelist.append=hive\.mapred\.supports\.subdirectories|fs\.s3a\.access\.key|fs\.s3a\.secret\.key
```

Then, in a Hive shell from Ambari, ran:

```
set fs.s3a.access.key=s3-access-key;
set fs.s3a.secret.key=s3-secret-key;
CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1';
```

I saw a similar post from the past but am not sure whether the issue was ever solved: https://community.hortonworks.com/questions/71891/hdp-250-hive-doesnt-seem-to-honor-an-s3a-endpoint.html
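For completeness, the S3A settings I would expect to matter for a non-AWS store can also be pinned per bucket in core-site.xml. A sketch of those properties (the endpoint host and JCEKS path here are placeholders, not my real values):

```
fs.s3a.bucket.s3aTestBucket.endpoint=https://my-object-store.example.com
fs.s3a.bucket.s3aTestBucket.path.style.access=true
fs.s3a.bucket.s3aTestBucket.security.credential.provider.path=jceks://hdfs@namenode-host:8020/user/test/s3a.jceks
```

The other tools (distcp, Spark, MapReduce) pick this configuration up from core-site.xml and reach the endpoint fine, so the open question is why Hive's DDLTask alone does not see the same settings.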
						
					
Labels: Apache Hadoop, Apache Hive
			
    
	
		
		
10-06-2018 01:32 AM

```
[spark@vm1 spark2-client]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/jars/spark-examples_2.11-2.3.1.3.0.1.0-187.jar 10
```

Logs:

```
Traceback (most recent call last):
  File "/usr/bin/hdp-select", line 448, in <module>
    listPackages(getPackages("all"))
  File "/usr/bin/hdp-select", line 266, in listPackages
    os.path.basename(os.path.dirname(os.readlink(linkname))))
OSError: [Errno 22] Invalid argument: '/usr/hdp/current/oozie-client'
ls: cannot access /usr/hdp//hadoop/lib: No such file or directory
Exception in thread "main" java.lang.IllegalStateException: hdp.version is not set while running Spark under HDP, please set through HDP_VERSION in spark-env.sh or add a java-opts file in conf with -Dhdp.version=xxx
	at org.apache.spark.launcher.Main.main(Main.java:118)
```
 
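The exception message itself names the fix: set HDP_VERSION in spark-env.sh or provide a conf/java-opts file. A sketch of both, assuming the HDP version from the example jar name (3.0.1.0-187; the real value should come from `hdp-select versions`, and the conf path is the usual HDP client location):

```
# Option A: export the version for this shell before running spark-submit.
export HDP_VERSION=3.0.1.0-187

# Option B: persist it as a JVM option in the Spark client conf directory.
echo "-Dhdp.version=3.0.1.0-187" > /usr/hdp/current/spark2-client/conf/java-opts
```

Note that the hdp-select traceback above (OSError on /usr/hdp/current/oozie-client) suggests a broken /usr/hdp/current symlink, which would explain why hdp-select could not report a version automatically in the first place.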
						
					
Labels: Apache Hadoop, Apache Spark
 
        



