Move zookeeper datadir and datalogdir to new disk

Contributor

I am looking for guidance on moving the ZooKeeper dataDir and dataLogDir to a dedicated disk, as best practice recommends. I have not been able to find documentation that covers this. Currently the disk holding dataDir and dataLogDir is shared with other processes.

Any assistance is appreciated.

1 ACCEPTED SOLUTION

Contributor

Let's say your dataDir and old dataLogDir are both /var/lib/zookeeper and you are now moving dataLogDir to /var/lib/zookeeper-log. First, change this in the service-wide configuration, which will make the stale configuration icon appear. Then stop zk1, ssh into zk1, and run the following commands:

 

$ mkdir -p /var/lib/zookeeper-log/version-2

$ cp /var/lib/zookeeper/version-2/log.* /var/lib/zookeeper-log/version-2/

$ chown -R zookeeper:zookeeper /var/lib/zookeeper-log
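The three commands above can also be wrapped in a small helper that checks the copy before the server is restarted. This is only a sketch: the function name `copy_txn_logs` is hypothetical, it assumes the logs live under a `version-2` subdirectory as in the example paths, and it deliberately leaves the `chown` step to be run separately.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZooKeeper or Cloudera Manager):
# copy only the transaction logs (log.*) from the old directory to the
# new dataLogDir, then verify each copy byte for byte.
copy_txn_logs() {
    src="$1/version-2"
    dst="$2/version-2"

    mkdir -p "$dst" || return 1

    # -p preserves mode and timestamps; snapshots and epoch files stay put.
    cp -p "$src"/log.* "$dst"/ || return 1

    # Fail if any copied log differs from its source.
    for f in "$src"/log.*; do
        cmp -s "$f" "$dst/$(basename "$f")" || return 1
    done
    return 0
}
```

With the paths from this thread it could be called as `copy_txn_logs /var/lib/zookeeper /var/lib/zookeeper-log`, followed by the `chown -R zookeeper:zookeeper /var/lib/zookeeper-log` shown above.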

 

Then start zk1 and wait until it is running and shows as either leader or follower on the Cloudera Manager service page. Once that is done, repeat the same steps on zk2 and finally on zk3. At that point the stale configuration alert should disappear and everything should be fine cluster-wide.
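If you prefer to confirm the role from the command line as well, the `srvr` four-letter command reports a `Mode:` line. The sketch below only parses that output; the function names `zk_mode` and `zk_is_serving` are hypothetical, and it assumes `nc` is installed, the client port is 2181, and `srvr` is allowed by the server's four-letter-word whitelist.

```shell
#!/bin/sh
# Hypothetical helpers: read `srvr` output on stdin and report the mode.

# Prints the value of the "Mode:" line, e.g. "leader" or "follower".
zk_mode() {
    awk -F': ' '/^Mode:/ { print $2 }'
}

# Succeeds only if the server reports itself as leader or follower.
zk_is_serving() {
    case "$(zk_mode)" in
        leader|follower) return 0 ;;
        *) return 1 ;;
    esac
}
```

On zk1 this could be used as `echo srvr | nc localhost 2181 | zk_is_serving && echo "zk1 is back in the quorum"` before moving on to zk2.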

 

As you said, only the log.* files need to be copied.


REPLIES

Mentor
The move itself is as trivial as an mv/cp to the new disk, while making sure the ownership and permissions stay intact.

For a dedicated disk, the dataLogDir matters more than the dataDir. ZooKeeper calls fsync on the transaction logs it writes to the dataLogDir, and that call can block for a long time when other processes share the disk.

You can and should keep the dataDir (where snapshots are stored) separate from the dataLogDir. That way large snapshot writes do not affect transaction logging performance either. The dataDir can stay on a shared disk because its writes are not synchronous.
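A quick way to check whether two directories currently sit on the same filesystem is to compare what `df` reports for each. This is a rough sketch with hypothetical function names (`backing_fs`, `shares_disk`); note that it compares filesystems, so two partitions on one physical disk would still look separate.

```shell
#!/bin/sh
# Hypothetical helpers: compare the filesystem backing two paths.

# Prints the filesystem (first df column) that holds the given path.
backing_fs() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# Succeeds if both paths are on the same filesystem.
shares_disk() {
    [ "$(backing_fs "$1")" = "$(backing_fs "$2")" ]
}
```

For example, `shares_disk /var/lib/zookeeper /var/lib/zookeeper-log && echo "still sharing a filesystem"` would flag a dataLogDir that has not actually been moved to its own mount.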

Does this help?

Contributor

Thank you for the reply. At the moment the dataDir and dataLogDir are in the same location. The directory contains the following files:

 

acceptedEpoch
currentEpoch
log.xxxxxx
snapshot.xxxx

 

If I only need to move the dataLogDir, I just need to copy the log.xxx files to the new location on all ZooKeeper servers, update the directory in the configuration, and restart only the ZooKeeper instances. Could you please confirm whether that is correct?

The epoch files and snapshot.xxx belong in the dataDir, correct?
