<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Optimal way of defining  HBASE column family in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Optimal-way-of-defining-HBASE-column-family/m-p/149270#M28378</link>
    <description>&lt;P&gt;I have a use case that stores data in an HBase table, and I would like to understand the optimal way to define column families so as to reduce the number of get calls.&lt;/P&gt;&lt;P&gt;The scenario is:&lt;/P&gt;&lt;P&gt;I will receive an account number and need to retrieve the customer details and the other account numbers associated with that customer. I’m thinking of defining one row per account, with a row key that combines account and customer (acct|cust) and a column family holding the account details, plus one more row with the customer ID as the row key and a column family holding an array of account details.&lt;/P&gt;&lt;P&gt;Ex:&lt;/P&gt;&lt;P&gt;Row | Rowkey | Column + cell&lt;/P&gt;&lt;P&gt;1 | acct1|cust1 | acct1 values&lt;/P&gt;&lt;P&gt;2 | acct2|cust1 | acct2 values&lt;/P&gt;&lt;P&gt;3 | acct3|cust1 | acct3 values&lt;/P&gt;&lt;P&gt;4 | cust1 | column family with array of accounts [acct1, acct2, acct3]&lt;/P&gt;&lt;P&gt;Please advise on the optimal way to define the data model for this scenario.&lt;/P&gt;</description>
    <pubDate>Mon, 16 May 2016 01:38:28 GMT</pubDate>
    <dc:creator>ss_seetharaman</dc:creator>
    <dc:date>2016-05-16T01:38:28Z</dc:date>
    <item>
      <title>Optimal way of defining  HBASE column family</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Optimal-way-of-defining-HBASE-column-family/m-p/149270#M28378</link>
      <description>&lt;P&gt;I have a use case that stores data in an HBase table, and I would like to understand the optimal way to define column families so as to reduce the number of get calls.&lt;/P&gt;&lt;P&gt;The scenario is:&lt;/P&gt;&lt;P&gt;I will receive an account number and need to retrieve the customer details and the other account numbers associated with that customer. I’m thinking of defining one row per account, with a row key that combines account and customer (acct|cust) and a column family holding the account details, plus one more row with the customer ID as the row key and a column family holding an array of account details.&lt;/P&gt;&lt;P&gt;Ex:&lt;/P&gt;&lt;P&gt;Row | Rowkey | Column + cell&lt;/P&gt;&lt;P&gt;1 | acct1|cust1 | acct1 values&lt;/P&gt;&lt;P&gt;2 | acct2|cust1 | acct2 values&lt;/P&gt;&lt;P&gt;3 | acct3|cust1 | acct3 values&lt;/P&gt;&lt;P&gt;4 | cust1 | column family with array of accounts [acct1, acct2, acct3]&lt;/P&gt;&lt;P&gt;Please advise on the optimal way to define the data model for this scenario.&lt;/P&gt;</description>
      <pubDate>Mon, 16 May 2016 01:38:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Optimal-way-of-defining-HBASE-column-family/m-p/149270#M28378</guid>
      <dc:creator>ss_seetharaman</dc:creator>
      <dc:date>2016-05-16T01:38:28Z</dc:date>
    </item>
    <item>
      <title>Re: Optimal way of defining  HBASE column family</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Optimal-way-of-defining-HBASE-column-family/m-p/149271#M28379</link>
      <description>&lt;P&gt;Essentially, all column families in a table share the same row key. If you want to use two different keys, you need two tables.&lt;/P&gt;&lt;P&gt;So I think you should have two tables: one keyed by account|cust, as you say, to find the customer info for an account, and a separate table keyed by cust|account so you can easily drill down to a customer and find all the accounts associated with it.&lt;/P&gt;&lt;P&gt;You can also key the second table by cust alone and store an array of accounts, as you say, but then you always need to update the whole list of accounts at once. If you key the second table by cust|account, you can freely add and delete account rows for a customer and do a scan to get all of that customer's accounts.&lt;/P&gt;</description>
      <pubDate>Mon, 16 May 2016 16:53:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Optimal-way-of-defining-HBASE-column-family/m-p/149271#M28379</guid>
      <dc:creator>bleonhardi</dc:creator>
      <dc:date>2016-05-16T16:53:05Z</dc:date>
    </item>
  </channel>
</rss>

