Member since: 02-01-2022
Posts: 5
Kudos Received: 2
Solutions: 0
02-14-2024 09:32 PM (1 Kudo)
@PabitraDas Any thoughts? Btw, when I just stop the ZK server from Cloudera Manager and attempt to start it from the command line like this ... the script prints "STARTED", but the start attempt actually fails and I see nothing when I run "ps -ef | grep -i zookeeper". A "zookeeper.out" file was created in my current directory, and it shows the SASL error seen below in the attached screenshot.

It looks like the attempt to start the ZooKeeper server is failing because of a SASL authentication failure, as seen in the error snippet above.

This is what my jaas.conf looks like:

root@fqdn:/tmp/zk4mcli# cat jaas.conf
Server {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="zookeeper.keytab"
    storeKey=true
    useTicketCache=false
    principal="zookeeper/FQDN@REALM";
};
QuorumServer {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="zookeeper.keytab"
    storeKey=true
    useTicketCache=false
    principal="zookeeper/FQDN@REALM";
};
QuorumLearner {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="zookeeper.keytab"
    storeKey=true
    useTicketCache=false
    principal="zookeeper/FQDN@REALM";
...
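One thing that may be worth ruling out with a jaas.conf like the one above is the relative keyTab value: as far as I understand, a relative path is resolved against the working directory of the JVM, so the keytab would only be found if the server is launched from the directory that holds it. This is only one possible cause of a SASL failure, not a diagnosis. The sketch below is a minimal, hypothetical helper (the function names parse_jaas and relative_keytabs are made up for illustration) that lists the sections of a JAAS file and flags keyTab entries that are not absolute paths.

```python
import os
import re


def parse_jaas(text):
    """Return {section_name: {option: value}} for a JAAS config string.

    Only complete sections of the form `Name { ... };` are picked up;
    a section that is cut off before its closing brace is ignored.
    """
    sections = {}
    for name, body in re.findall(r"(\w+)\s*\{(.*?)\}\s*;", text, flags=re.S):
        # Collect key=value options; values may be quoted or bare.
        options = {
            key: value.strip('"')
            for key, value in re.findall(r'(\w+)\s*=\s*("[^"]*"|[^\s;]+)', body)
        }
        sections[name] = options
    return sections


def relative_keytabs(sections):
    """List (section, keyTab) pairs whose keyTab path is not absolute."""
    return [
        (name, opts["keyTab"])
        for name, opts in sections.items()
        if "keyTab" in opts and not os.path.isabs(opts["keyTab"])
    ]


# Sample text mirroring the Server section from the post.
SAMPLE = """
Server {
    com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="zookeeper.keytab"
    storeKey=true
    useTicketCache=false
    principal="zookeeper/FQDN@REALM";
};
"""

for section, keytab in relative_keytabs(parse_jaas(SAMPLE)):
    print(f"{section}: keyTab '{keytab}' is a relative path")
```

If any section is flagged, pointing keyTab at the full path of the keytab file and confirming that the user starting ZooKeeper can read it would remove that variable from the investigation.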
02-12-2024 06:30 AM
@PabitraDas Here are the steps that I followed today to convert one of the ZK nodes into an unmanaged node. I am running into the issue described below.

1) Stopped the ZK server from Cloudera Manager and put the host under maintenance.
2) Updated the hosts file on all 5 ZK server nodes, mapping each IP address to the host FQDN.
3) Backed up the data and config files.
4) From CM, decommissioned one of the hosts.
5) Stopped the CM agent on the node that was decommissioned.
6) Removed the host from CM.
7) The zoo.cfg file was reset to default values in this process, so I restored it from the backup that had the ZK quorum as follows:
    server.1=zk1-fqdn:3181:4181
    server.2=zk2-fqdn:3181:4181
    server.3=zk3-fqdn:3181:4181
    server.4=zk4-fqdn:3181:4181
    server.5=zk5-fqdn:3181:4181
8) Verified that the myid file on each host has a unique ID, so that there is no conflict and they can participate in the ensemble, and that the ZK data was intact.
9) Started ZK on the decommissioned node from the command line. As seen in the attached screenshot, the startup output shows that it is reading the configuration from the zoo.cfg that has the quorum addresses.

The ZooKeeper server does start from the command line, but when I run the following command to get the "Mode" of this ZK server, it reports "standalone" instead of "follower":

echo "stat" | nc zk3-fqdn 2181 | grep Mode
Mode: standalone

Please refer to the attached screenshot below, which shows the nohup.out confirming that ZooKeeper was started successfully and that the corresponding zoo.cfg has the quorum details (similar to the one on the other members of the quorum).

It looks like the startup command reads the configuration from the zoo.cfg file but the server still fails to become a member of the ensemble, and I am not sure why.
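As far as I know, ZooKeeper falls back to standalone mode when the configuration the process actually loads does not define a multi-server ensemble, so it can help to check mechanically what the restored zoo.cfg and myid contain and what the server reports. The sketch below is a minimal illustration under that assumption; the helper names (quorum_servers, myid_is_listed, parse_mode) are hypothetical, and the sample text simply mirrors the values quoted in the post.

```python
import re


def quorum_servers(zoo_cfg_text):
    """Return {server_id: host} from uncommented server.N=host:port:port lines."""
    servers = {}
    for line in zoo_cfg_text.splitlines():
        match = re.match(r"server\.(\d+)\s*=\s*([^:\s]+):", line.strip())
        if match:
            servers[int(match.group(1))] = match.group(2)
    return servers


def myid_is_listed(zoo_cfg_text, myid_text):
    """True if the ID stored in the myid file has a matching server.N entry."""
    return int(myid_text.strip()) in quorum_servers(zoo_cfg_text)


def parse_mode(stat_output):
    """Extract the value of the 'Mode:' line from `stat` output, or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


# Sample values copied from the post.
ZOO_CFG = """
server.1=zk1-fqdn:3181:4181
server.2=zk2-fqdn:3181:4181
server.3=zk3-fqdn:3181:4181
server.4=zk4-fqdn:3181:4181
server.5=zk5-fqdn:3181:4181
"""

servers = quorum_servers(ZOO_CFG)
print(f"{len(servers)} server entries: {sorted(servers)}")
print("myid 3 listed:", myid_is_listed(ZOO_CFG, "3\n"))
print("reported mode:", parse_mode("Mode: standalone\n"))
```

If the file parses to five entries and the myid matches one of them, yet the server still answers "standalone", it may be worth confirming that the process answering on the queried client port is the one that was just started, and that it loaded this zoo.cfg rather than another copy.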
02-08-2024 09:57 PM
Thanks for your reply, PabitraDas. Let me add some more clarity here.

We have a CM that manages multiple CDH and CDP clusters. One of those clusters is a standalone 5-node ZooKeeper cluster that is not used by any service of any cluster managed by this CM. It was created a long time ago using Cloudera Manager by an admin who is no longer working with us; I assume he built the 5-node ensemble with CM simply because that was easier than doing it manually from the CLI.

We now want to decommission this 5-node ZooKeeper cluster completely and remove it from Cloudera Manager, but we do not want to lose the ZooKeeper data. The idea is to decommission these 5 nodes one by one and remove them from Cloudera Manager, then reinstall the ZooKeeper service on them using the CLI, since the ensemble will still be used by an external app.

So the question is: after we decommission 1 ZooKeeper node (starting with a current follower) from the 5-node ensemble, will it be an issue that we temporarily have 4 nodes still managed by Cloudera Manager and one unmanaged? The ID in the myid file will be the same as before, and we are hoping that after the unmanaged ZK node comes back up, it will sync the data from the leader ZK instance.

We plan to keep this setup (4 managed and 1 unmanaged ZK node in the same ensemble) running for a couple of days and, if no issues are reported, proceed with decommissioning the remaining 4 nodes one after another.

Once again, thanks for taking the time to respond to my question. I genuinely appreciate it.
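For the one-node-at-a-time plan described above, the quorum arithmetic is worth keeping in view: an ensemble needs a strict majority of its configured servers to stay available, and the configured size remains 5 while a node is temporarily down for conversion. The following is a small illustrative sketch of that arithmetic only (the function names are hypothetical); it says nothing about whether a mixed managed/unmanaged ensemble behaves well in practice.

```python
def majority(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size, nodes_down=0):
    """How many more servers can fail before quorum is lost.

    A negative result means quorum is already lost.
    """
    return (ensemble_size - nodes_down) - majority(ensemble_size)


# 5-node ensemble with one node stopped for conversion.
print("majority of 5:", majority(5))
print("extra failures tolerated with 1 node down:", tolerated_failures(5, 1))
```

With one of five nodes stopped, the ensemble can survive one additional unexpected failure, which is an argument for converting a single node at a time and waiting for it to rejoin before touching the next.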
02-01-2024 05:29 PM (1 Kudo)
@Vidya I am still waiting for someone to respond to my query.
01-12-2024 02:35 AM
We would like to convert 3 nodes (these nodes have only ZooKeeper roles deployed on them) to unmanaged nodes by purging the Cloudera software. These 3 nodes form a ZooKeeper ensemble that was created a long time ago; it is not used by any of our existing CDP cluster services, only by some external edge querying applications. We now need to take them out of Cloudera Manager so that they are not unnecessarily included in the licensing cost.

How should I go about completing this task? I don't want to break the external application, but there is no reason for these nodes to be managed by CM. Do I have to rebuild the ensemble using the open source software from Apache after purging the CM agent software?
Labels: Apache Zookeeper