accessing s3guard with hadoop cli
Labels: Apache Hadoop, Kerberos
Created 12-08-2020 10:35 AM
I'm trying to learn more about s3guard and was attempting to follow along with some of the CLI examples in the CDP documentation. Any command I try results in a warning:
WARN impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-s3a-file-system.properties,hadoop-metrics2.properties
followed by an error:
java.lang.IllegalStateException: Authentication with IDBroker failed. Please ensure you have a Kerberos token by using kinit.
When I try running kinit I see:
Client 'cloudbreak@[FQDN]' not found in Kerberos database while getting initial credentials
Here "FQDN" corresponds to the VM I am SSHed into. I've tried a couple of machines in both my Data Hub and Data Lake clusters. Does anyone have insight on how to properly interact with my environment's S3Guard setup?
Created 12-08-2020 01:53 PM
These operations fail as the cloudbreak user because it is a service user intended only for accessing the cluster's machines and performing admin tasks on them. This user does not have access to the data (no Kerberos principal and no IDBroker mapping).
Instead, you can SSH to your cluster's EC2 machines with your username and workload password. That way you will have a working Kerberos principal. Also make sure your user has an IDBroker mapping to access S3 resources, and potentially DynamoDB resources as well, since S3Guard relies on DynamoDB.
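As a rough sketch of that check sequence, something like the following could be run after logging in with your workload user. The bucket name is a placeholder, the helper function names are made up for illustration, and it assumes `kinit`, `klist`, and a `hadoop` CLI that ships the `s3guard` subcommand are on the PATH; adjust to your environment.

```shell
# Sketch only: helper names and the bucket are hypothetical.

# cloudbreak is the machine-admin service user; it has no Kerberos
# principal and no IDBroker mapping, so data access will fail for it.
is_service_user() {
  [ "$1" = "cloudbreak" ]
}

# Usage: check_access s3a://my-example-bucket
check_access() {
  bucket="$1"
  if is_service_user "$(whoami)"; then
    echo "Logged in as cloudbreak; SSH in with your workload user instead." >&2
    return 1
  fi
  kinit || return 1                      # prompts for the workload password
  klist || return 1                      # confirm a ticket is present
  hadoop fs -ls "$bucket/" || return 1   # S3 access through IDBroker
  hadoop s3guard bucket-info "$bucket/"  # report S3Guard status for the bucket
}
```

If `kinit` succeeds but the `hadoop` commands still fail, the IDBroker mapping for your user (S3 and possibly DynamoDB permissions on the mapped IAM role) is the next thing to look at.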
Hope this helps,
Alex
Created 12-09-2020 08:13 AM
Those two things together did the trick. I added a proper IAM role to the IDBroker mapping and logged in with my workload user. Thanks for the helpful insight!
