Where to store a really wide table?

Super Guru

If you have 600+ columns and need to access 20-30 columns at a time, what is the optimal type of storage?

  • Hive table stored as ORC with compression, vectorization and optimization; accessed with Tez and properly partitioned and bucketed
  • HBase
  • HBase in Phoenix Table
  • Parquet File
  • AVRO File
  • Accumulo

5 REPLIES

Re: Where to store a really wide table?

Super Collaborator

You can choose HBase as storage.

HBase can easily handle hundreds of columns. Consider grouping the columns normally accessed together in the same column family.
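
To make the column family suggestion concrete, here is a minimal Python sketch of how columns might be assigned to families so that one bundle can be read together. The column names, family names and helper functions are all hypothetical and only illustrate the grouping idea; this is not HBase client code.

```python
from collections import defaultdict

# Hypothetical column-to-bundle assignment; the real table has 600+ columns.
COLUMN_BUNDLES = {
    "age": "demo",
    "gender": "demo",
    "street": "addr",
    "city": "addr",
    "last_order_date": "orders",
}

def group_by_family(column_bundles):
    """Return {family: [qualifiers]} so each bundle maps to one column family."""
    families = defaultdict(list)
    for column, family in column_bundles.items():
        families[family].append(column)
    return dict(families)

def qualified_names(families, family):
    """Build the 'family:qualifier' names a read of one bundle would request."""
    return [f"{family}:{qualifier}" for qualifier in families[family]]

families = group_by_family(COLUMN_BUNDLES)
print(qualified_names(families, "addr"))
```

Note that HBase guidance generally favors a small number of column families per table, so with 600 columns it may be necessary to merge several 20-30 column bundles into one family rather than create one family per bundle.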

Re: Where to store a really wide table?

Accumulo would work for the same reasons that HBase does.

Re: Where to store a really wide table?

Super Collaborator

If you can share your use case more, we would be able to provide more advice.

Re: Where to store a really wide table?

Super Guru

The table holds 600 columns of detailed customer information covering many kinds of attributes. The data needs to be accessed interactively in reports and through web applications. A typical request reads a few hundred to a few thousand rows (plus summary information), based on known 20-30 column chunks of related information.

The goal is interactive exploration of the data and extraction of these lists for use elsewhere.

Re: Where to store a really wide table?

Super Collaborator

HBase is a viable solution.

For querying, consider Phoenix.
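
As a rough illustration of what querying one bundle through Phoenix could look like, the sketch below builds a SELECT statement that uses family-qualified column names. The table name, family, columns and the `phoenix_select` helper are assumptions made up for this example; running the statement would still require a Phoenix client, which is not shown.

```python
def phoenix_select(table, family, columns, key_column, limit):
    """Build a Phoenix SELECT that reads one bundle via family-qualified columns."""
    cols = ", ".join(f"{family}.{column}" for column in columns)
    return f"SELECT {key_column}, {cols} FROM {table} LIMIT {limit}"

# Hypothetical table and column names, for illustration only.
query = phoenix_select("customers", "addr", ["street", "city"], "customer_id", 100)
print(query)
```

This only assembles identifiers; any filter values would be better passed as bound parameters than interpolated into the string.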
