Created on 09-27-2020 11:14 PM - edited on 12-22-2020 11:25 PM by VidyaSargur
In this video, we'll review how to access data in S3 from the command line of a Data Hub cluster host using IDBroker. Some components in CDP work out of the box with IDBroker. However, most command-line tools like the Hadoop file system commands require a couple of additional steps to access data in S3. We'll demonstrate retrieving a keytab file for a workload user and using it to kinit on the Data Hub cluster host, enabling data access via IDBroker.
Many command-line tools in CDP Public Cloud Data Hub clusters require a Kerberos ticket-granting ticket (TGT) for a workload user in order to obtain a short-term access token for S3 or ADLS Gen2 via IDBroker.
This video demonstrates the following steps:
Granting a data access role to a workload user
Retrieving a keytab file for the workload user
Copying the keytab file to a host in the Data Hub cluster
Using the keytab file to kinit
Confirming the TGT using klist
Accessing data in S3 via IDBroker
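The on-host portion of these steps (kinit, klist, and the S3 access) can be sketched as a small shell helper. This is a minimal illustration, not the exact commands from the video: the workload user name `jdoe`, the keytab path, and the bucket `s3a://example-bucket/data/` are hypothetical placeholders, and the helper names `run` and `access_s3_via_idbroker` are invented for this sketch. It assumes the keytab has already been retrieved and copied to the host, and that the user has been granted a data access role.

```shell
#!/usr/bin/env bash
# Sketch only. All three values below are hypothetical placeholders;
# substitute your own workload user, keytab location, and S3 path.
WORKLOAD_USER="${WORKLOAD_USER:-jdoe}"
KEYTAB="${KEYTAB:-$HOME/jdoe.keytab}"
S3_PATH="${S3_PATH:-s3a://example-bucket/data/}"

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper that chains the three on-host steps.
access_s3_via_idbroker() {
  # Obtain a Kerberos TGT for the workload user from the keytab file.
  run kinit -kt "$KEYTAB" "$WORKLOAD_USER" || return 1
  # Confirm that the TGT is present in the credential cache.
  run klist || return 1
  # List the S3 location with the Hadoop file system command.
  run hdfs dfs -ls "$S3_PATH"
}
```

After sourcing the file on a Data Hub cluster host, calling `access_s3_via_idbroker` runs the three commands in order and stops at the first failure. Setting `DRY_RUN=1` first prints the commands without executing them, which is a convenient way to check the substituted values.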
The video mentions, but does not demonstrate, retrieving a keytab file via the cdp command-line tool. Instructions for doing so are available in the CDP documentation.