Created on 01-08-2021 04:28 AM - edited on 01-11-2021 01:05 AM by subratadas
COD - CDE using Phoenix
In this article, we will walk through the steps required to read from and write to COD (Cloudera Operational Database) over Phoenix JDBC from Spark on CDE (Cloudera Data Engineering).
Assumptions
COD is already provisioned and the database cod-db is created. Refer to this link for the same.
CDE is already provisioned and a virtual cluster is already created. Refer to this link for the same.
COD
Configuration
For Spark on CDE to be able to talk to COD, it needs the hbase-site.xml configuration of the COD cluster. Follow these steps to retrieve it:
Go to the COD control plane UI and click the cod-db database.
Under the Connect tab of the database, look for the HBase Client Configuration URL field. The client configuration archive can be downloaded from that URL with a curl call; make sure to supply your workload password for the call.
Extract the downloaded zip file to obtain the hbase-site.xml file.
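As a quick sanity check that you extracted the right file, the ZooKeeper quorum can be read out of hbase-site.xml. A minimal sketch (the sample XML below is illustrative only; real values come from the downloaded archive):

```python
import xml.etree.ElementTree as ET

def read_property(xml_text, name):
    """Return the value of a named property from a Hadoop-style site XML."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None

# Illustrative fragment of an hbase-site.xml; host names are placeholders.
sample = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk-host1.example.site,zk-host2.example.site</value>
  </property>
</configuration>"""

print(read_property(sample, "hbase.zookeeper.quorum"))
# zk-host1.example.site,zk-host2.example.site
```

The same helper works for any other property you need from the client configuration, such as the ZooKeeper client port.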
Note down the Phoenix JDBC URL from the Phoenix (Thick) tab.
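The thick-client URL noted here generally follows the pattern jdbc:phoenix:&lt;zookeeper quorum&gt;:&lt;port&gt;:&lt;root znode&gt;. A small sketch of assembling it (the host names are hypothetical; use the values from your cluster):

```python
def phoenix_jdbc_url(quorum, port=2181, znode="/hbase"):
    """Assemble a Phoenix thick-client JDBC URL from ZooKeeper details."""
    return f"jdbc:phoenix:{quorum}:{port}:{znode}"

print(phoenix_jdbc_url("zk-host1.example.site,zk-host2.example.site"))
# jdbc:phoenix:zk-host1.example.site,zk-host2.example.site:2181:/hbase
```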
Create table in COD
For this demo we need to create a table in COD using Phoenix. To do so, log in to a gateway node of the COD cluster and run the phoenix-sqlline command to create the table as follows.
CREATE TABLE OUTPUT_TEST_TABLE (id BIGINT NOT NULL PRIMARY KEY, col1 VARCHAR, col2 INTEGER);
The gateway node details can be obtained from the Data Hub Hardware tab in the control plane UI.
Build phoenix-spark project
Build the following Spark Phoenix demo Maven project.
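At its core, the demo writes a DataFrame to OUTPUT_TEST_TABLE and reads it back through the phoenix-spark connector. A PySpark sketch of the equivalent logic (the ZooKeeper URL is a placeholder, and a Spark runtime with the Phoenix connector on the classpath is assumed):

```python
def phoenix_options(table, zk_url):
    # Options understood by the phoenix-spark DataSource.
    return {"table": table, "zkUrl": zk_url}

def run(zk_url):
    # Requires a Spark runtime with the phoenix-spark connector on the classpath.
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("phoenix-demo").getOrCreate()

    # Columns mirror the OUTPUT_TEST_TABLE schema created above.
    df = spark.createDataFrame([(1, "Hello", 10), (2, "World", 20)],
                               ["ID", "COL1", "COL2"])

    # The phoenix-spark connector supports overwrite mode; rows are upserted.
    df.write.format("phoenix") \
        .options(**phoenix_options("OUTPUT_TEST_TABLE", zk_url)) \
        .mode("overwrite") \
        .save()

    # Read the rows back through the same connector.
    spark.read.format("phoenix") \
        .options(**phoenix_options("OUTPUT_TEST_TABLE", zk_url)) \
        .load() \
        .show()

# Invoke run(...) from a CDE Spark job, e.g.:
# run("zk-host1.example.site:2181")  # placeholder quorum
```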
Upload the demo app jar that was built earlier:
cde resource upload --name odx-spark-resource --local-path ./spark-hbase/target/spark-hbase-1.0-SNAPSHOT.jar --resource-path spark-hbase-1.0-SNAPSHOT.jar
Create the CDE job using the following JSON and the import command.
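The job definition consumed by the import command looks roughly like the following sketch; the job name and main class are hypothetical, and the mounted resource is the one the jar was uploaded to:

```json
{
  "name": "phoenix-spark-demo",
  "type": "spark",
  "mounts": [{ "resourceName": "odx-spark-resource" }],
  "spark": {
    "file": "spark-hbase-1.0-SNAPSHOT.jar",
    "className": "com.cloudera.SparkHBaseDemo"
  }
}
```

It can then be imported with something like cde job import --file cde-job.json; exact flag names may vary across CDE CLI versions, so check cde job --help on your version.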