Created 10-08-2018 08:14 AM
I am using a non-AWS endpoint for S3A and have a basic issue: Hive is not honoring the s3a endpoint when it is not AWS. distcp, hadoop fs, Spark, and MapReduce jobs all find my s3a endpoint and complete successfully without any issues, but Hive ignores it and expects AWS S3 credentials, as seen in the example below.
I tried three options, and the error was the same with all three, as shown below:
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
INFO : Completed executing command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6); Time taken: 116.608 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint (state=08S01,code=1)
Option 1: Ran the CREATE DATABASE command as shown above, but passed my S3 credentials using JCEKS via the HDFS core-site.xml setting:
hadoop.security.credential.provider.path=jceks://hdfs@nile3-vm6.centera.lab.emc.com:8020/user/test/s3a.jceks
Running a Hive query:
0: jdbc:hive2://nile3-vm7.centera.lab.emc.com> CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1';
INFO : Compiling command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6): CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6); Time taken: 230.907 seconds
INFO : Executing command(queryId=hive_20181007232623_f38e7fac-5aed-4d4a-b08a-9cbfc950d7a6): CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1'
INFO : Starting task [Stage-0:DDL] in serial mode
Option 2: Passed user:s3-key in the URL while creating the database. I even tried
CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3-user:s3-secret-key@s3aTestBucket/user/table1';
but it didn't work.
Option 3: Even added the below property in hive-site:
hive.security.authorization.sqlstd.confwhitelist.append=hive\.mapred\.supports\.subdirectories|fs\.s3a\.access\.key|fs\.s3a\.secret\.key
Then, on the Hive shell from Ambari, ran the following:
set fs.s3a.access.key=s3-access-key;
set fs.s3a.secret.key=s3-secret-key;
CREATE DATABASE IF NOT EXISTS table1 LOCATION 's3a://s3aTestBucket/user/table1';
I saw a similar post in the past, but I am not sure whether that issue was ever solved.
Created 10-08-2018 08:36 AM
The above issue is observed because of https://issues.apache.org/jira/browse/HIVE-20386. Refer to the bug for more details.
As a workaround, you can try the methods below:
Method 1: Set the below config in core-site.xml:
fs.s3a.bucket.<bucket_name>.security.credential.provider.path = <jceks_file_path>
# Replace <bucket_name> and <jceks_file_path> accordingly.
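A quick way to confirm, outside of Hive, that the per-bucket provider path is picked up is to list the bucket with the same property set on the command line. This is a minimal sketch: the bucket name is the one from this thread, while the namenode host and jceks path are placeholders to replace with yours.
# Listing the bucket exercises the same S3A credential lookup Hive will use
hadoop fs \
  -D fs.s3a.bucket.s3aTestBucket.security.credential.provider.path=jceks://hdfs@<namenode>:8020/user/test/s3a.jceks \
  -ls s3a://s3aTestBucket/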
Method 2: Set the below configs in core-site.xml:
fs.s3a.bucket.<bucket_name>.access.key = <s3a access key>
fs.s3a.bucket.<bucket_name>.secret.key = <s3a secret key>
# Replace <bucket_name> accordingly.
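Since the endpoint in question is not AWS, it may also be worth confirming the endpoint itself is configured for the bucket. These are standard S3A options following the same per-bucket pattern; the values are placeholders, and path-style access is only an assumption (it is commonly required by non-AWS object stores, but verify against your store's documentation):
fs.s3a.bucket.<bucket_name>.endpoint = <non-AWS endpoint host:port>
fs.s3a.bucket.<bucket_name>.path.style.access = true
# Replace <bucket_name> and the endpoint accordingly.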
Let us know if the resolution works.
Created 10-08-2018 03:38 PM
@Soumitra Sulav I tried Method 1, i.e. added
fs.s3a.bucket.s3aTestBucket.security.credential.provider.path=jceks://hdfs@nile3-vm6.centra.lab.test.com:8020/user/test/s3a.jceks
restarted HDFS from Ambari, but it seems it didn't work. Any suggestions? Please find the logs below.
Didn't try Method 2, as it would expose my credentials in the Ambari UI.
Logs:
0: jdbc:hive2://nile3-vm7.centra.lab.test.com> CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3';
INFO : Compiling command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173): CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173); Time taken: 230.585 seconds
INFO : Executing command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173): CREATE DATABASE IF NOT EXISTS table3 LOCATION 's3a://s3aTestBucket/user/table3'
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint
INFO : Completed executing command(queryId=hive_20181008105923_0324b26a-64b7-4c8f-91e3-635c62442173); Time taken: 115.487 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.net.SocketTimeoutException: doesBucketExist on s3aTestBucket: com.amazonaws.AmazonClientException: No AWS Credentials provided by BasicAWSCredentialsProvider EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider : com.amazonaws.SdkClientException: Unable to load credentials from service endpoint (state=08S01,code=1)
Created 10-09-2018 07:26 AM
I believe you are following the proper commands to create the jceks file:
hadoop credential create fs.s3a.access.key -value <ACCESS_KEY> -provider jceks://hdfs@<namenode>/tmp/s3a.jceks
hadoop credential create fs.s3a.secret.key -value <SECRET_KEY> -provider jceks://hdfs@<namenode>/tmp/s3a.jceks
# Verify by running the below command
hadoop credential list -provider jceks://hdfs@<namenode>/tmp/s3a.jceks
Make sure the hive user can access the jceks file [check its permissions and ownership; see the sketch below].
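For example, a minimal sketch of checking and fixing this; the jceks path is the one from this thread, while the owner, group, and mode are illustrative assumptions to adjust for your environment:
# Check who owns the jceks file and what its mode is
hdfs dfs -ls /user/test/s3a.jceks
# Illustrative fix: make the file readable by the hive user
hdfs dfs -chown hive:hadoop /user/test/s3a.jceks
hdfs dfs -chmod 640 /user/test/s3a.jceks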
And then you are adding the mentioned configuration in Ambari UI > HDFS > Configs > Custom core-site.
I was able to run Hive jobs in the same scenario as yours [the underlying storage was not AWS].
If it still doesn't work, can you try Method 2 once, just to make sure there isn't any other issue?
Created 10-09-2018 11:38 PM
@Soumitra Sulav Tried both Method 1 and Method 2 today and have attached the logs below; the logs are the same for both methods. Now it seems it is no longer complaining about AWS, but it is still failing.
I verified the JCEKS operation by simply running an -ls command as that user, and also ran what you suggested in the comment above, and it worked. Just to add: the cluster is Kerberized.
Logs:
0: jdbc:hive2://nile3-vm7.centera.lab.test.com> CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1';
INFO : Compiling command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f): CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1'
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f); Time taken: 0.055 seconds
INFO : Executing command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f): CREATE DATABASE IF NOT EXISTS datab LOCATION 's3a://s3aTestBucket/db1'
INFO : Starting task [Stage-0:DDL] in serial mode
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException)
INFO : Completed executing command(queryId=hive_20181009185046_b19ccbf4-1cfd-4148-96cd-e20a6fe45b1f); Time taken: 0.318 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:java.lang.reflect.UndeclaredThrowableException) (state=08S01,code=1)
Created 10-11-2018 07:18 AM
Can you please provide the complete stack trace?
If you aren't sure where to find the logs, refer to the link.
The exception you encountered has been reported in secure clusters like yours.
Refer to the solution provided: https://community.hortonworks.com/content/supportkb/151796/error-orgapachehadoopsecurityauthenticati...
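In case it helps while gathering the stack trace: on HDP, the HiveServer2 log usually lives under /var/log/hive (an assumption about your install; adjust the path if your layout differs). A quick way to pull the surrounding trace:
# Search the HiveServer2 log for the failing exception and print context around it
grep -B 5 -A 40 "UndeclaredThrowableException" /var/log/hive/hiveserver2.log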
Created 10-22-2018 08:57 PM
While getting logs from the YARN ResourceManager Web UI on port 8088 in a Kerberized cluster, it fails with an authentication error (HTTP Error Code 401, Unauthorized access). I am using Chrome and am not sure how to make the Web UI validate my Kerberos ticket. Any suggestions?
Created 10-23-2018 07:42 AM
@Sahil Kaw You can follow these steps to get the logs from the UI:
Or, an easier way is to get them directly from the nodes/servers.
Just go to /var/log/hadoop-yarn/yarn/yarn*resourcemanager*log
The above file will be log-rotated; you can find the relevant file that contains the error stack trace.
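If it helps, here is a sketch of both routes; the ResourceManager host is a placeholder, curl's SPNEGO support needs a valid Kerberos ticket on the client, and the grep string is the error from this thread:
# SPNEGO-authenticated access to the RM REST API from a machine with a ticket
kinit <your_principal>
curl --negotiate -u : "http://<rm-host>:8088/ws/v1/cluster/info"
# Or search the local ResourceManager logs (including rotated files) directly
grep -i "UndeclaredThrowableException" /var/log/hadoop-yarn/yarn/yarn*resourcemanager*log*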
Created 10-23-2018 07:50 PM
@Soumitra Sulav Today I redeployed my HDP cluster, and it seems to be working with both of the methods you shared above. I am not sure why it wasn't working with the previous setup; it looks like an intermittent issue. I will keep you posted in case I run into it again. Thanks for all your help with this.
Created 10-25-2018 08:49 AM
Good to know. If the answer helped you, please upvote so that it can help others.