Code Repositories
Repo Description

This project demonstrates how to build, deploy, and run a Spark application written in Scala that accesses HBase in a Kerberized cluster. It includes scripts to create and populate an HBase table and to run the test. The README has more details on how to run it.

Notes:

  • The HBase conf dir (the directory containing hbase-site.xml) should be on SPARK_CLASSPATH
  • This approach does not work for long-running jobs: the job needs to complete before the Kerberos token expires
  • The example illustrates the use of the HBase InputFormat for obtaining an RDD. It also demonstrates using the HBase API for a Get operation for a row key.
  • The repo includes a Maven project that builds a tar archive containing a jar and scripts to help run the test in your cluster
  • The script (run_example.sh) uses two executors to demonstrate that the Kerberos token gets sent to the executors
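
The two access patterns mentioned in the notes (an RDD obtained through the HBase InputFormat, and a Get for a single row key) might look roughly like the sketch below. It is not taken from the repo. The object name, the table name example_table, and the row key row1 are hypothetical placeholders, and the sketch assumes Spark and HBase 1.x client jars plus hbase-site.xml on the classpath. Kerberos login and token handling are left to the cluster setup and the submit script, and are not shown.

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name; the repo's actual class may differ.
object HBaseReadSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("hbase-read-sketch"))

    // Reads hbase-site.xml from the classpath (hence the conf dir requirement).
    val hbaseConf = HBaseConfiguration.create()
    // "example_table" is a placeholder table name, not the one used in the repo.
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "example_table")

    // Pattern 1: scan the table as an RDD through the HBase InputFormat.
    val rdd = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
    println(s"Row count via InputFormat: ${rdd.count()}")

    // Pattern 2: fetch a single row with the HBase client API (driver side).
    val connection = ConnectionFactory.createConnection(hbaseConf)
    try {
      val table = connection.getTable(TableName.valueOf("example_table"))
      try {
        // "row1" is a placeholder row key.
        val result = table.get(new Get(Bytes.toBytes("row1")))
        println(s"Get returned an empty result: ${result.isEmpty}")
      } finally table.close()
    } finally connection.close()

    sc.stop()
  }
}
```

To run the actual example, use the repo's run_example.sh as described in its README. The sketch above only illustrates the shape of the two HBase calls.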
Repo Info
Github Repo URL https://github.com/clukasikhw/kerberized-spark-hbase-hdp2.4-example
Github account name clukasikhw
Repo name kerberized-spark-hbase-hdp2.4-example
Version history
Revision #: 1 of 1
Last update: 11-17-2016 08:31 PM