Created 06-11-2016 02:43 PM
If you have 600+ columns and need to access 20-30 columns at a time, what is the optimal type of storage?
Created 06-11-2016 04:02 PM
The dataset has 600 columns of detailed information on customers, covering many kinds of attributes. The data needs to be accessed interactively in reports and through web applications. A typical request reads a few hundred to a few thousand rows (plus summary information), using known 20-30 column chunks of related information.
The goal is interactive exploration of the data and extraction of these lists for use elsewhere.
Created 06-11-2016 03:11 PM
You can choose HBase as storage.
HBase can easily handle hundreds of columns. Consider grouping the columns normally accessed together in the same column family.
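To make the grouping idea concrete, here is a minimal Python sketch that assigns columns to column families and prints the matching HBase shell `create` statement. The table name `customers`, the family names `demo` and `billing`, and all column names are hypothetical placeholders, not taken from the actual dataset.

```python
from collections import defaultdict

# Hypothetical mapping of column -> column family (one family per bundle
# of columns that is normally read together). All names are illustrative.
COLUMN_BUNDLES = {
    "age": "demo",
    "gender": "demo",
    "postcode": "demo",
    "plan": "billing",
    "balance": "billing",
}


def group_by_family(column_bundles):
    """Return {family: [columns]} so each bundle can be reviewed at a glance."""
    families = defaultdict(list)
    for column, family in column_bundles.items():
        families[family].append(column)
    return dict(families)


def qualified_names(column_bundles, family):
    """Return 'family:qualifier' names, the form HBase uses to address a column."""
    return [f"{family}:{col}" for col, fam in column_bundles.items() if fam == family]


def hbase_shell_create(table, column_bundles):
    """Build an HBase shell 'create' statement listing each family once."""
    families = sorted(set(column_bundles.values()))
    quoted = ", ".join(f"'{f}'" for f in families)
    return f"create '{table}', {quoted}"


if __name__ == "__main__":
    print(group_by_family(COLUMN_BUNDLES))
    print(qualified_names(COLUMN_BUNDLES, "billing"))
    print(hbase_shell_create("customers", COLUMN_BUNDLES))
```

Only the families are declared when the table is created; the individual column qualifiers are written at insert time. The HBase documentation generally advises keeping the number of column families small, so with many 20-30 column bundles it may be worth merging related bundles rather than creating one family per bundle.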
Created 06-13-2016 02:22 PM
Accumulo would work for the same reasons that HBase does.
Created 06-11-2016 03:11 PM
If you can share more about your use case, we can offer more specific advice.
Created 06-12-2016 01:42 AM
HBase is a viable solution.
For querying, consider Apache Phoenix, which provides a SQL layer on top of HBase.
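As a rough illustration of what a Phoenix query for one bundle might look like, the Python sketch below builds a parameterized SELECT that reads only the columns of a single column family. The table `CUSTOMERS`, the family `DEMO`, the key column `ID`, and the column names are all assumptions made for the example; the helper `phoenix_select` is hypothetical and only assembles the SQL text.

```python
def phoenix_select(table, family, columns, key_column="ID"):
    """Build a Phoenix SELECT for one bundle of columns.

    Columns are referenced as FAMILY.COLUMN so the query reads a single
    column family. The '?' is a JDBC-style bind parameter for the row key.
    All identifiers passed in are assumed to be trusted, fixed names.
    """
    cols = ", ".join(f"{family}.{c}" for c in columns)
    return f"SELECT {key_column}, {cols} FROM {table} WHERE {key_column} = ?"


if __name__ == "__main__":
    # Hypothetical table, family, and column names.
    print(phoenix_select("CUSTOMERS", "DEMO", ["AGE", "GENDER"]))
```

The resulting statement can be run through any Phoenix client, with the customer key bound to the `?` placeholder.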