06-06-2018
02:51 PM
1 Kudo
Hi @Thiago Charchar, you can use the HBase REST service that comes by default in the package; you only have to start it. The init script is located under /usr/hdp/current/hbase-master/etc/rc.d/hbase-rest. These are the endpoints it offers: https://hbase.apache.org/1.1/apidocs/org/apache/hadoop/hbase/rest/package-summary.html You can start it on the HBase Master nodes (usually 2 of them), but if you need it to scale, you can start it on as many nodes as required; it is just a Java app that serves the REST API and connects to HBase in the backend. You can also tune it a little, for example by setting the number of threads (in Custom hbase-site):
hbase.rest.threads.max=200
hbase.rest.threads.min=10
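A quick sketch of using the gateway once it is up, assuming the default REST port 8080 and a hypothetical table "t1"; one thing to note is that the gateway base64-encodes row keys, column names, and cell values in its JSON responses:

```shell
# Start the REST gateway via the init script mentioned above (requires root):
#   /usr/hdp/current/hbase-master/etc/rc.d/hbase-rest start
#
# Then read a row over HTTP ("t1" and "row1" are hypothetical names):
#   curl -s -H "Accept: application/json" http://localhost:8080/t1/row1
#
# Keys, columns, and values in the JSON come back base64-encoded;
# decode them locally, e.g.:
echo 'dmFsdWU=' | base64 -d    # prints: value
```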
04-25-2018
08:37 PM
1 Kudo
@Thiago Charchar HBase replication might not be the best approach to synchronize the data in the initial phase of the migration. I would have recommended snapshots, but since you are upgrading to a higher version, that may not work either. So here is a multi-step approach to migrate your HBase data over:

1. Bulk HBase export to HDFS (a point-in-time recovery approach).
2. Hadoop DistCp the sequence files to the remote cluster, where the HBase tables are already created.
3. Set up replication and let the tables stay current.
4. Choose a date-time and plan a staged cut-over of the applications.

Starting replication only once you have the majority of your data copied over will put far less stress on your cluster bandwidth, and you should easily be able to handle the migration while leaving bandwidth available for other operations. As far as the migration of "Hive structures" is concerned, do you mean the metadata or the underlying data? If you are talking about the underlying data, DistCp is of course the best option available. For metadata migration there are multiple options, and mapping the metastore to the new cluster is one of them. Let me know if this answer helped resolve your query.
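The steps above can be sketched with the stock HBase/Hadoop tooling. Table, path, and ZooKeeper quorum names below are hypothetical, and the exact replication shell syntax can vary slightly between HBase versions:

```shell
# Run on the SOURCE cluster unless noted otherwise.

# 1. Bulk export of a table to HDFS sequence files (point-in-time copy):
hbase org.apache.hadoop.hbase.mapreduce.Export 'mytable' /backups/mytable

# 2. DistCp the sequence files to the remote cluster:
hadoop distcp /backups/mytable hdfs://dest-nn:8020/backups/mytable

# 3. On the DESTINATION cluster (table must already exist), load the data:
hbase org.apache.hadoop.hbase.mapreduce.Import 'mytable' /backups/mytable

# 4. Set up replication so the table stays current (hbase shell, source side):
#    hbase> add_peer '1', 'dest-zk1,dest-zk2,dest-zk3:2181:/hbase-unsecure'
#    hbase> alter 'mytable', {NAME => 'cf', REPLICATION_SCOPE => '1'}
```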