Created on 08-06-2024 08:29 AM - edited 08-06-2024 08:30 AM
Hello,
We are in the process of adding new servers to an existing CDP PB cluster. These servers will have master and worker roles, along with Kafka brokers, assigned to them. As part of the process, we will also decommission a few servers, which will be replaced by the new ones.
Questions are:
1. What will be the functional impact of this change on the data engineers / analysts / developers of the team?
2. What changes will they need to make in their code / applications / integrations / connections, etc.? I am aware of a few:
- ZooKeeper ensemble will need to be modified
- Teams will need to be informed of new brokers
- Custom Knox topologies will need to be modified
What other changes will need to be made by the users / engineers group?
3. I am aware that connection strings will need to be modified. Which ones: just Hive, or others as well?
The services that will be impacted by this change are:
HDFS
YARN
HIVE
IMPALA
KAFKA
HBASE
Also, we have Tableau and other applications that connect to the cluster to fetch reports / data.
Kindly advise.
Thanks
Snm1523
Created 08-06-2024 11:17 PM
Apart from the ZK ensemble and the new broker details, you need to have namespace updated for NN for RM. Otherwise, nothing much will change for developers and data engineers. The Hive and Impala connection strings will change.
Tableau connection string with Hive will also change.
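To illustrate why the Hive string changes: when clients use ZooKeeper service discovery, the HiveServer2 JDBC URL lists the ZooKeeper ensemble rather than a single HiveServer2 host, so the host list in the URL has to follow the ensemble. Below is a minimal Python sketch of building such a URL. The hostnames are hypothetical, the `hiveserver2` namespace is only an assumed value (check what your cluster uses), and security parameters such as SSL or Kerberos settings are left out.

```python
def hive_zk_jdbc_url(zk_hosts, zk_port=2181, database="default",
                     zk_namespace="hiveserver2"):
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

    zk_namespace is an assumption; use the value configured in your cluster.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    # The URL carries the whole ZooKeeper quorum as a comma-separated list.
    quorum = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return (f"jdbc:hive2://{quorum}/{database};"
            f"serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={zk_namespace}")


# Hypothetical hostnames: the ensemble after the old masters are replaced.
new_url = hive_zk_jdbc_url([
    "new-master1.example.com",
    "new-master2.example.com",
    "new-master3.example.com",
])
print(new_url)
```

Any client that stores this string (Tableau data sources, JDBC/ODBC configs, scheduled jobs) would need the updated host list.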
Created 08-07-2024 01:31 AM
Thank you for the response @AyazHussain.
Could you please elaborate on "you need to have namespace updated for NN for RM"?
Thanks
Snm1523
Created 08-07-2024 03:49 AM
You need to use the nameservice, @snm1523, in most of the NN URLs so that requests do not go to one particular NameNode. So, from the application side, please check whether any app connects to a specific hardcoded NN instead of using the nameservice.
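One way to do that check is to scan application configs and scripts for `hdfs://` URIs that name a host (or host and port) rather than the nameservice. The following is a rough Python sketch, not a complete audit tool. The nameservice name `nameservice1` and the sample hostnames are assumptions for illustration; the heuristic is that a nameservice URI has no port and matches a known nameservice name.

```python
import re

# Matches the scheme and authority part of an HDFS URI, e.g. hdfs://host:8020
HDFS_URI = re.compile(r"hdfs://([A-Za-z0-9._-]+)(?::(\d+))?")


def find_hardcoded_namenodes(text, nameservices):
    """Return (line_number, uri) pairs that look like direct NameNode references.

    A URI is flagged when it carries an explicit port or when its authority
    is not one of the known nameservice names.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in HDFS_URI.finditer(line):
            authority, port = match.group(1), match.group(2)
            if port is not None or authority not in nameservices:
                findings.append((line_no, match.group(0)))
    return findings


# Hypothetical config content and nameservice name.
sample = (
    "warehouse.dir=hdfs://nameservice1/warehouse\n"
    "landing.dir=hdfs://old-master1.example.com:8020/landing\n"
)
for line_no, uri in find_hardcoded_namenodes(sample, {"nameservice1"}):
    print(f"line {line_no}: {uri}")
```

Anything it flags is worth reviewing by hand before the old NameNode host is decommissioned.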
Created 08-07-2024 04:21 AM
Got it, @AyazHussain. I was unclear about the statement "namespace updated for NN for RM". In our cluster we already have namespaces updated, and apps also connect to the namespace instead of to a NN directly. So that is okay.
Lastly, would you be able to comment on how to move the roles below from one server to another, and what precautions are needed? The goal is to decommission the old server.
Atlas Server
HBase REST Server
HBase Thrift Server
HDFS Balancer
HDFS HttpFS
Hive on Tez HiveServer2
Hue Server
Hue Kerberos Ticket Renewer
Impala Daemon
Livy Server
As per my understanding, we just need to add the new hosts and assign them the relevant roles; however, a few of these might also need data migration. Any comments on that?
Thanks
Snm1523
Created 08-11-2024 11:38 PM
Hi @snm1523 ,
Most of the services mentioned above do not need data migration.
To be on the safe side, just take a backup of the Hive backend DB and the Hue backend DB.
Created 08-13-2024 03:12 AM
So do you mean that there is no need for migration: just bring up the new server, assign the required roles, and then decommission the old one?
Created 08-12-2024 11:11 PM
Hi @snm1523 ,
If you are satisfied with the solution, please click on the option "Accept as Solution".