Advantages of keeping data in both Parquet and HBase


When I introduced Cloudera to my client, they asked me:

 

Why not keep the data only in HBase (and not in Parquet as well)?

 

So far, I have failed to find any arguments for also keeping the data in Parquet files. Are there advantages to storing the data in both HBase and Parquet?

 

Will ETL with Spark perform better reading from and writing to HDFS/Parquet, or reading from and writing to HBase?

 

Thanks!
