accessing s3guard with hadoop cli

Explorer

I'm trying to learn more about s3guard and was attempting to follow along with some of the CLI examples in the CDP documentation. Any command I try results in a warning:

WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties

followed by an error:

java.lang.IllegalStateException: Authentication with IDBroker failed.  Please ensure you have a Kerberos token by using kinit.

When I try running kinit, I see:

Client 'cloudbreak@[FQDN]' not found in Kerberos database while getting initial credentials

Here "FQDN" is the hostname of the VM I am SSH'd into. I've tried a couple of machines in both my Data Hub and Data Lake clusters. Does anyone have insight into how to properly interact with my environment's S3Guard setup?

1 ACCEPTED SOLUTION

Master Collaborator

These operations fail when run as the cloudbreak user because it is a service user, meant only for accessing the cluster's machines and performing admin tasks on them. This user does not have access to the data (no Kerberos principal and no IDBroker mapping).

 

Instead, SSH to your cluster's EC2 machines with your own username and workload password. That way you will have a working Kerberos principal. Also make sure your user has an IDBroker mapping to access S3 resources, and potentially DynamoDB resources as well, since S3Guard relies on DynamoDB.
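
As a rough illustration of the checks described above, here is a small pre-flight helper you could paste into your shell before trying any S3Guard command. It is only a sketch: the function names `have_kerberos_ticket` and `s3guard_preflight` and the bucket name are made up for this example, and it assumes `klist` from MIT Kerberos is on the PATH. It does not verify the IDBroker mapping, which still has to be checked in your environment's settings.

```shell
# Sketch only: function names and the bucket argument are hypothetical.

# Returns 0 if the current user holds a valid (non-expired) Kerberos ticket.
# "klist -s" prints nothing and reports the result through its exit status.
have_kerberos_ticket() {
    klist -s
}

# Prints a hint about what to fix, or the command to try next.
# It does not contact S3, DynamoDB, or IDBroker itself.
s3guard_preflight() {
    bucket="$1"
    if [ "$(id -un)" = "cloudbreak" ]; then
        echo "Logged in as cloudbreak: reconnect over SSH with your workload user instead."
        return 1
    fi
    if ! have_kerberos_ticket; then
        echo "No valid Kerberos ticket: run kinit as your workload user."
        return 1
    fi
    echo "Ticket found. Next: hadoop s3guard bucket-info s3a://${bucket}"
}
```

After logging in with your workload user, you would call something like `s3guard_preflight my-bucket` (with your own bucket name in place of the placeholder). If it reports a missing ticket, `kinit` should prompt for your workload password. If the Hadoop command still fails with the IDBroker error after that, the IDBroker mapping for your user is the next thing to look at.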

 

Hope this helps,

Alex


Explorer

Those two things together did the trick. I added a proper IAM role to the IDBroker mapping and logged in with my workload user. Thanks for the helpful insight!