This article contains Questions & Answers on Cloudera Operational Database (COD).

 

What is the relationship between Data Hub & Data Lake?

The Data Lake houses SDX, which provides governance and authorization. A Data Hub is the service that actually hosts the workload, in this case the Operational DB.

 

When should I use Cloudera OpDB vs a template in Data Hub?

For new applications, use Cloudera OpDB (COD), which is self-tuning and improves performance automatically over time. To replicate on-premises environments to the cloud through lift and shift, or for disaster recovery, use Data Hub templates.

 

How does Apache Phoenix relate to Apache HBase?

Phoenix is an OLTP SQL engine for OpDB that adds relational capabilities on top of HBase. It provides a much more familiar programming paradigm, which helps our customers reach production faster. Think of Phoenix as the SQL persona and HBase as the NoSQL persona.
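To make the "SQL persona on top of a NoSQL persona" idea concrete, here is a minimal sketch of how a relational row can be laid out as HBase-style cells. It is a simplified model, not Phoenix's actual storage code: the function name `row_to_cells`, the `|` separator, and the string encoding are assumptions made for illustration, and real Phoenix uses its own binary encoding for keys and values.

```python
# Simplified model of how a SQL row could be laid out as HBase cells.
# Assumption: the "|" separator and plain-string values are illustrative only;
# Phoenix itself uses its own binary encoding for row keys and column values.

def row_to_cells(row, primary_key, family="0"):
    """Turn a relational row (dict) into (row_key, family, qualifier, value) cells."""
    # The primary key columns are concatenated to form the HBase row key.
    row_key = "|".join(str(row[col]) for col in primary_key)
    cells = []
    for column, value in row.items():
        if column in primary_key:
            continue  # key columns live in the row key, not in separate cells
        cells.append((row_key, family, column, str(value)))
    return cells


order = {"customer_id": 42, "order_id": 7, "status": "SHIPPED", "total": 19.99}
cells = row_to_cells(order, primary_key=["customer_id", "order_id"])
for cell in cells:
    print(cell)
```

The point of the sketch is that an application written against Phoenix works with tables, columns, and primary keys, while the same data remains reachable underneath as row keys and column qualifiers.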

 

Should I use HBase/Phoenix or Apache Kudu for an operational data store (ODS) / Operational database?

The Cloudera Operational Database is powered by HBase and Phoenix. Kudu is part of our data warehouse offering and lets you do real-time analytics on streaming data. Just as you have to choose between an operational database and a data warehouse based on what you want to do, you need to choose between OpDB and Kudu. Both systems support real-time ingest of streaming and time-series data. OpDB is the platform to use if you are building applications. If you are building dashboards or doing ad hoc analytics, Kudu is the better choice.

 

What’s the frequency for replication? What’s the granularity?

Replication with Replication Manager for OpDB is near-real-time and eventually consistent. There is no waiting period and no scheduling is required. Granularity is configurable: you can choose to replicate a single table or an entire namespace (akin to a database in a more traditional DB).
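The two granularity levels can be illustrated with a small sketch. It does not call Replication Manager; `select_tables` and the sample table names are hypothetical and only model the selection logic, relying on the HBase convention that a table is named `namespace:table` and that a table without a prefix belongs to the `default` namespace.

```python
# Illustrative only: models choosing what to replicate by table or by namespace.
# This is not the Replication Manager API; all names here are hypothetical.

def namespace_of(full_name):
    """Return the namespace part of an HBase table name ("namespace:table")."""
    if ":" in full_name:
        return full_name.split(":", 1)[0]
    return "default"  # tables without a prefix live in the default namespace


def select_tables(tables, namespaces=(), names=()):
    """Pick tables that match a whole namespace or are listed individually."""
    selected = []
    for full_name in tables:
        if full_name in names or namespace_of(full_name) in namespaces:
            selected.append(full_name)
    return selected


tables = ["sales:orders", "sales:customers", "ops:alerts", "audit_log"]
# Replicate everything in the "sales" namespace plus one individual table.
chosen = select_tables(tables, namespaces={"sales"}, names={"ops:alerts"})
print(chosen)
```

Selecting by namespace picks up every table under it, whereas selecting by table name keeps the scope to exactly the tables listed.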
