Greetings dear community people,
I am about to set up a small development cluster in the cloud, and I am researching whether I can use some automation scripts to shut the cluster down overnight to cut cloud costs.
Does this make sense to you and can it be done?
I plan to size the cluster as follows: 3 master nodes, 4 worker nodes, a WebApi node, and an access/Postgres node.
The applications to be developed will use YARN, Spark, Kudu, Hue, and Impala.
I have read that after a shutdown Kudu needs to repair its filesystem, and that this can sometimes take a long time. Is that true?
Could I shut down some nodes every night and automatically boot them again in the morning without running into issues?
Thanks in advance.
I am evaluating a few cloud providers at the moment and have not decided on one yet.
Is it possible in practice to shut down Kudu/Impala/Spark nodes with scripts overnight and start them automatically in the morning?
Thanks for the fast feedback. As I understand it, you are saying that this can be done without any major issues.
Could you point me in the right direction and answer a few questions?
1. Can I write a script, run from a cron job, that does something like this: for x in `cd /etc/init.d ; ls hadoop-*` ; do sudo service $x stop ; done and then starts the same services again (say, stop after 17:00 and start at 06:00) without any errors on startup? I also need to shut down and start the hosts themselves (virtual machines).
2. Where I expect issues is with the Kudu master and tablet servers. At least one needs to be running to preserve the data, and I am interested in how long the Kudu filesystem check needs to run after a host shutdown.
3. Are there any GitHub code examples or guidelines for doing this?
Kind regards and thanks in advance.
You should look into the Cloudera Manager REST API. You can call the API to stop all services and let CM handle the ordering properly, instead of iterating through the services yourself.
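
To make that a bit more concrete, here is a rough Python sketch of what such a call could look like, using only the standard library. Treat it as a starting point rather than a tested solution: the host name, port 7180, API version v19, cluster name, and credentials are all placeholders I made up, and the helper function names are my own. It assumes the cluster-level stop/start command endpoints and the command status endpoint of the CM API, so please check them against the API documentation for your CM version.

```python
import base64
import json
import time
import urllib.parse
import urllib.request

# Placeholder values (assumptions, not taken from this thread).
CM_BASE = "http://cm-host.example.com:7180/api/v19"
CLUSTER = "Cluster 1"


def cluster_command_url(base, cluster, action):
    """Build the URL of the cluster-wide 'stop' or 'start' command."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    name = urllib.parse.quote(cluster, safe="")  # cluster names may contain spaces
    return f"{base.rstrip('/')}/clusters/{name}/commands/{action}"


def build_request(url, user, password, method="POST"):
    """Create a request that authenticates with HTTP basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    data = b"" if method == "POST" else None  # POST with an empty body
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    return req


def run_cluster_command(base, cluster, action, user, password,
                        poll_seconds=10, timeout_seconds=1800):
    """Issue the command, then poll until CM reports it is no longer active.

    Returns True if CM reports success, False otherwise.
    """
    req = build_request(cluster_command_url(base, cluster, action), user, password)
    with urllib.request.urlopen(req) as resp:
        command = json.load(resp)

    deadline = time.monotonic() + timeout_seconds
    while command.get("active"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"cluster {action} did not finish in time")
        time.sleep(poll_seconds)
        status_url = f"{base.rstrip('/')}/commands/{command['id']}"
        poll = build_request(status_url, user, password, method="GET")
        with urllib.request.urlopen(poll) as resp:
            command = json.load(resp)
    return bool(command.get("success"))
```

A cron job could then call something like `run_cluster_command(CM_BASE, CLUSTER, "stop", "admin", "admin")` in the evening and the same with `"start"` in the morning. Powering the virtual machines themselves off and on is a separate step that depends on the cloud provider, so it is not covered here; the stop command should have finished before the hosts go down, and the hosts should be up before the start command is sent.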