What is Cloudera Data Warehouse?
Cloudera Data Warehouse is an auto-scaling, highly concurrent and cost-effective analytics service that ingests data at scale from structured, unstructured and edge sources. It supports hybrid and multi-cloud infrastructure models by moving workloads between on-premises and any cloud for reports, dashboards, ad-hoc and advanced analytics, including AI, with consistent security and governance. Cloudera Data Warehouse offers zero query wait times, reduced IT costs and agile delivery.
 
See more information here
 
Key Concepts:

In the Cloudera Data Warehouse service, your data is stored in an object store in a data lake that resides in your specific cloud environment. The service is composed of:

  • Database Catalogs: a metadata service associated with a CDP Data Lake, which provides the data context for your defined tables and databases within the CDP Enterprise Data Cloud.
  • Virtual Warehouses: compute resources running Hive or Impala on Kubernetes, which allow you to query data stored in the cloud object store via the Database Catalog.
Please see Cloudera Documentation for further information. 
 
How do I monitor Virtual Warehouse usage?
Cloudera Data Warehouse environments come with a pre-built Grafana dashboard that lets you monitor usage of all Virtual Warehouses within that environment. To log in to the Grafana dashboard, you first need to connect to the environment's Kubernetes cluster and extract the Grafana password from a Kubernetes secret.
 
Pre-requisites:
  • This article assumes you have already configured Cloudera Data Platform and the Cloudera Data Warehouse service with at least one Environment, at least one Database Catalog and at least one Virtual Warehouse. Please see Cloudera's Getting Started Instructions.
  • Install the kubectl command-line interface, or your favourite Kubernetes UI or CLI. The command in step 5 also uses jq, base64 and pbcopy.
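Before starting, it can help to confirm that the command-line tools used later in this article are on your PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of any Cloudera tooling.

```shell
# missing_tools: hypothetical helper (not from the original article).
# Prints "missing: <name>" for each command that is not found on PATH
# and returns non-zero if at least one command is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For the command in step 5 you could run: missing_tools kubectl jq base64 pbcopy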

 

How To:

1. On the Cloudera Data Platform Home Page, open the Data Warehouse service:

unnamed-2.png

 

2. Expand the Environments menu:

unnamed-4.png
 
3. Click the hamburger menu on your desired Environment:
unnamed-5.png
 
4. Click Show Kubeconfig and copy the text to your clipboard:
unnamed-3.png
 
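5. Paste the kubeconfig into a file (~/dwx.config in this example) and run the following command against the Kubernetes cluster for that Environment. The command reads the Grafana password, which is stored base64-encoded in a Kubernetes secret, decodes it and copies it to your clipboard. As written it targets macOS (base64 -D and pbcopy); on Linux, the decode flag is base64 -d: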
 
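# Paste the copied kubeconfig into this file and save it
vi ~/dwx.config
# Read the grafana secret, decode the passphrase and copy it to the clipboard
kubectl --kubeconfig ~/dwx.config get secret grafana -n istio-system -o json | jq -r .data.passphrase | base64 -D | pbcopy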
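If jq or pbcopy is not available, the same lookup can be sketched with kubectl's built-in jsonpath output. This is a hedged variant, not the author's command: the function name grafana_password is hypothetical, and it assumes the secret name (grafana), namespace (istio-system) and key (passphrase) shown above. It uses base64 --decode, which both GNU and macOS base64 accept, and prints the password to the terminal instead of the clipboard.

```shell
# grafana_password: hypothetical helper (not from the original article).
# Usage: grafana_password [path-to-kubeconfig]
# Assumes the secret name, namespace and key used in the article's command.
grafana_password() {
  kubeconfig="${1:-$HOME/dwx.config}"
  # jsonpath extracts the base64-encoded value directly, so jq is not needed
  kubectl --kubeconfig "$kubeconfig" get secret grafana -n istio-system \
    -o jsonpath='{.data.passphrase}' | base64 --decode
}
```

Running grafana_password ~/dwx.config would then print the decoded password, which you can paste into the Grafana login screen.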
 
6. Go back to your Cloudera Data Warehouse environment and click Open Grafana:
unnamed-6.png
 
7. You should see the Grafana login screen open in a new tab:
unnamed-7.png
Username = admin
Password = the password that is now on your clipboard
 
8. Once logged in, expand the istio menu and choose the Compute Autoscaling dashboard:
unnamed-8.png
 
The Compute Autoscaling dashboard will show you total node usage for your environment, as well as individual node counts for each of your Virtual Warehouses:
unnamed-9.png
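
As a rough cross-check of the total node count shown on the dashboard, you can count the nodes in the environment's Kubernetes cluster with the same kubeconfig. This is only a sketch: the function name count_nodes is hypothetical, and the number may not match the dashboard exactly, since the cluster can include nodes that are not attributed to any Virtual Warehouse.

```shell
# count_nodes: hypothetical helper (not from the original article).
# Usage: count_nodes [path-to-kubeconfig]
# Prints the number of nodes currently registered in the cluster.
count_nodes() {
  kubeconfig="${1:-$HOME/dwx.config}"
  # --no-headers drops the column header so each output line is one node
  kubectl --kubeconfig "$kubeconfig" get nodes --no-headers | awk 'END { print NR }'
}
```

For example: count_nodes ~/dwx.config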
Comments

Hi agillan,

On executing the following command:

kubectl --kubeconfig ~/dwx.config get secret grafana -n istio-system -o json | jq -r .data.passphrase | base64 -D | pbcopy

 

I get the following response:

error: You must be logged in to the server (Unauthorized)

Hi agillan,

Sorry to bother you. I figured out the issue: I had to grant access to my IAM user through the grant access option.

Thanks.