Introduction

When working with CDP Public Cloud, you may need to access other AWS services from Apache Spark, for example to retrieve a secret (such as a database password) from AWS Secrets Manager. One approach is to use AWS access keys, but embedding long-term security credentials in a program may not be feasible or desirable. Within CDP, you can instead obtain temporary AWS credentials from ID Broker and use them with the AWS Java SDK to access AWS Secrets Manager.

Steps

Here are the steps to try this in a Spark shell first:

  • Ensure that the IAM role mapped to the user has read access to AWS Secrets Manager (at minimum, the secretsmanager:GetSecretValue action on the secret used below).
  • To test the program, launch a Spark shell with the following command:
    spark-shell --master=yarn \
    --conf "spark.jars.packages=com.amazonaws:aws-java-sdk:1.11.984,org.scalaj:scalaj-http_2.11:0.3.15"
  • Run the commands below interactively in the shell to see the results.

 

// Change variables here
val id_broker_host = "ps-sandbox-aws-dl-idbroker0.ps-sandb.a465-9q4k.cloudera.site"
val secretName = "cde-cloudera-repo"
val region = "us-west-2"    

// Retrieve temporary AWS credentials from ID Broker
import scalaj.http.{Http, HttpOptions}
import org.json4s.jackson.JsonMethods._
// Step 1: request a Knox token from ID Broker
val id_broker_request = Http("https://"+id_broker_host+":8444/gateway/dt/knoxtoken/api/v1/token")
val id_broker_token = (parse(id_broker_request.asString) \ "access_token").values.toString
// Step 2: present the token as a Bearer header to obtain the AWS credentials
val auth_header = Map("Authorization" -> s"Bearer $id_broker_token", "cache-control" ->  "no-cache")
val id_broker_credentials_request = Http("https://"+id_broker_host+":8444/gateway/aws-cab/cab/api/v1/credentials").headers(auth_header)
val id_broker_credentials = parse(id_broker_credentials_request.asString) \\ "Credentials"
// Step 3: extract the three parts of the temporary credentials
val aws_access_key = (id_broker_credentials \ "AccessKeyId").values.toString
val aws_secret_key = (id_broker_credentials \ "SecretAccessKey").values.toString
val aws_session_token = (id_broker_credentials \ "SessionToken").values.toString

// Wrap the retrieved credentials in a provider for the AWS SDK
import com.amazonaws.auth.BasicSessionCredentials
import com.amazonaws.auth.AWSStaticCredentialsProvider
val aws_session_credentials = new BasicSessionCredentials(aws_access_key, aws_secret_key, aws_session_token)
val aws_credentials = new AWSStaticCredentialsProvider(aws_session_credentials)

// Access Secrets Manager service using AWS Java SDK with the temporary credentials
import com.amazonaws.services.secretsmanager.AWSSecretsManager
import com.amazonaws.services.secretsmanager.AWSSecretsManagerClient
import com.amazonaws.services.secretsmanager.model._
val secretsmanager_client = AWSSecretsManagerClient.builder.withCredentials(aws_credentials).withRegion(region).build
val getSecretValueRequest = new GetSecretValueRequest().withSecretId(secretName)
val getSecretValueResult = secretsmanager_client.getSecretValue(getSecretValueRequest)
val secret = getSecretValueResult.getSecretString()
print(secret)
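
Secrets created as key/value pairs are commonly returned by getSecretString() as a JSON document, and the call returns null when the secret was stored as binary. The following is a minimal sketch, not part of the original sample, of pulling one field out of such a string with the json4s library already used above. The helper name secretField, the sample payload, and the key names username and password are assumptions made for illustration only.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods.parse
import scala.util.Try

// Hypothetical payload for illustration; a real value would come from
// getSecretValueResult.getSecretString() in the steps above.
val sampleSecret = """{"username":"etl_user","password":"s3cr3t"}"""

// Returns the string value stored under `key`, or None when the input is
// null, is not valid JSON, or has no string field with that name.
def secretField(secretJson: String, key: String): Option[String] = {
  val parsed = Option(secretJson).flatMap(s => Try(parse(s)).toOption)
  parsed.flatMap { json =>
    (json \ key) match {
      case JString(value) => Some(value)
      case _              => None
    }
  }
}

val dbUser = secretField(sampleSecret, "username")
val dbPassword = secretField(sampleSecret, "password")
```

Returning an Option keeps the caller in charge of what to do when a field is missing, instead of failing later with a null or the string "None".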

 

  • You can embed these steps in an Apache Spark Scala program to retrieve the secret before creating the Spark session. When submitting the job, include the dependent jars (the same packages passed to spark-shell above).
  • The code sample can be downloaded with:
    wget https://raw.githubusercontent.com/karthikeyanvijay/cdp-publiccloud/main/aws/scripts/getAWSCredentials.scala
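
The idea of resolving the secret before the Spark session is created could be laid out roughly as follows. This is only a structural sketch under stated assumptions: the object name SecretBeforeSession, the application name, and the DEMO_SECRET environment variable are hypothetical, and fetchSecret is a stand-in that should be replaced with the ID Broker and Secrets Manager calls from the spark-shell steps.

```scala
import org.apache.spark.sql.SparkSession

object SecretBeforeSession {

  // Stand-in (assumption): reads a hypothetical DEMO_SECRET environment
  // variable so the sketch runs offline. In a real job, replace the body
  // with the ID Broker and Secrets Manager calls shown in the steps above.
  def fetchSecret(secretName: String): String =
    sys.env.get("DEMO_SECRET").filter(_.nonEmpty)
      .getOrElse(s"placeholder-for-$secretName")

  def main(args: Array[String]): Unit = {
    // 1. Resolve the secret first, on the driver.
    val secret = fetchSecret("cde-cloudera-repo")

    // 2. Only then create the Spark session.
    val spark = SparkSession.builder.appName("secret-before-session").getOrCreate()
    try {
      // Use `secret` here (for example as a connection password).
      // Only its length is printed, so the value does not end up in logs.
      println(s"Secret resolved (${secret.length} characters)")
    } finally {
      spark.stop()
    }
  }
}
```

When the real calls are plugged in, the job needs the same two packages on its classpath that the spark-shell command above passes through spark.jars.packages.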

Conclusion

This post showed how to access AWS Secrets Manager from Apache Spark using temporary credentials from ID Broker. The same approach can be used to access other AWS services.

-------------

Vijay Anand Karthikeyan
