Community Manager

In this video, we'll review how to access data in S3 from the command line of a Data Hub cluster host using IDBroker. Some components in CDP work with IDBroker out of the box. However, most command-line tools, such as the Hadoop file system commands, require a couple of additional steps before they can access data in S3. We'll demonstrate retrieving a keytab file for a workload user and using it to kinit on the Data Hub cluster host, which enables data access via IDBroker.

 


Many command-line tools in CDP Public Cloud Data Hub clusters require a Kerberos ticket-granting ticket (TGT) for a workload user in order to obtain a short-term access token for S3 or ADLS Gen2 via IDBroker.

This video demonstrates the following steps: 

  • Granting a data access role to a workload user
  • Retrieving a keytab file for the workload user
  • Copying the keytab file to a host in the Data Hub cluster
  • Using the keytab file to kinit
  • Confirming the TGT using klist
  • Accessing data in S3 via IDBroker 
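
The last three steps can be sketched as a short script to run on the Data Hub cluster host. This is a minimal illustration, not the exact commands shown in the video: it assumes the data access role has already been granted and the keytab has already been copied to the host. The keytab path, the workload user name (jdoe), and the bucket (example-bucket) are hypothetical placeholders, and the principal is assumed to resolve in the host's default Kerberos realm. The script only prints the commands unless dry_run is set to False.

```python
import shlex
import subprocess
from typing import List


def build_commands(keytab_path: str, workload_user: str, s3_path: str) -> List[List[str]]:
    """Return the commands for the kinit, klist, and S3 listing steps.

    All three argument values are supplied by the caller; the ones used
    below in the example are hypothetical placeholders.
    """
    return [
        # Obtain a TGT for the workload user from the keytab file.
        ["kinit", "-kt", keytab_path, workload_user],
        # Confirm that the ticket cache now holds a TGT.
        ["klist"],
        # List an S3 path with the Hadoop file system command.
        ["hdfs", "dfs", "-ls", s3_path],
    ]


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops at the first failing step, e.g. a bad keytab.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmds = build_commands(
        keytab_path="/home/jdoe/jdoe.keytab",   # hypothetical path
        workload_user="jdoe",                    # hypothetical workload user
        s3_path="s3a://example-bucket/data/",    # hypothetical bucket and prefix
    )
    run_commands(cmds, dry_run=True)
```

If the listing fails with an access error even though klist shows a valid TGT, the first thing to check is the role mapping from the first step, since IDBroker issues cloud credentials based on that mapping.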

The video mentions, but does not demonstrate, retrieving a keytab file via the cdp command-line tool. Instructions for doing so are available in the CDP documentation.


