In this video, we'll review how to access data in S3 from the command line of a Data Hub cluster host using IDBroker. Some components in CDP work with IDBroker out of the box. However, most command-line tools, such as the Hadoop file system commands, require a couple of additional steps to access data in S3. We'll demonstrate retrieving a keytab file for a workload user and using it to run kinit on the Data Hub cluster host, which enables data access via IDBroker.


Open the video on YouTube here


Many command-line tools in CDP Public Cloud Data Hub clusters require a Kerberos ticket-granting ticket (TGT) for a workload user in order to obtain a short-term access token for S3 or ADLS Gen2 via IDBroker.

This video demonstrates the following steps: 

  • Granting a data access role to a workload user
  • Retrieving a keytab file for the workload user
  • Copying the keytab file to a host in the Data Hub cluster
  • Using the keytab file to kinit
  • Confirming the TGT using klist
  • Accessing data in S3 via IDBroker 
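
The command-line portion of the steps above (copying the keytab, kinit, klist, and listing S3 data) might look roughly like the following sketch. It is not taken from the video: the user name, Kerberos realm, host name, keytab file name, and bucket path are all hypothetical placeholders, and it assumes the workload user can SSH to the cluster host and that the keytab has already been downloaded. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/usr/bin/env bash
# Sketch only: every value below is a hypothetical placeholder, not from the video.
set -euo pipefail

WORKLOAD_USER="jdoe"                      # hypothetical workload user
REALM="EXAMPLE.SITE"                      # hypothetical Kerberos realm of the environment
KEYTAB="jdoe.keytab"                      # keytab file already retrieved for the user
CLUSTER_HOST="datahub-host.example.com"   # hypothetical Data Hub cluster host
S3_PATH="s3a://example-bucket/data/"      # hypothetical S3 location

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

access_s3_via_idbroker() {
  local remote="${WORKLOAD_USER}@${CLUSTER_HOST}"
  # Copy the keytab file to the workload user's home directory on the host.
  run scp "$KEYTAB" "${remote}:"
  # Obtain a TGT on the host using the keytab.
  run ssh "$remote" kinit -kt "$KEYTAB" "${WORKLOAD_USER}@${REALM}"
  # Confirm that the TGT is present.
  run ssh "$remote" klist
  # List data in S3; the Hadoop client obtains credentials via IDBroker.
  run ssh "$remote" hdfs dfs -ls "$S3_PATH"
}

access_s3_via_idbroker
```

Running the same commands interactively in a single SSH session on the cluster host works equally well; the sketch uses separate ssh invocations only so that each step can be shown on its own line, and it assumes the ticket cache on the host persists between those sessions.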

The video mentions, but does not demonstrate, retrieving a keytab file via the cdp command-line tool. Instructions for doing so are available in the CDP documentation.
