I am evaluating HDC and am looking to spin up EDW-Analytics: Apache Hive 2 LLAP, Apache Zeppelin 0.7.0 in HDP 2.6 (Cloud).
I would like to know the difference in configuration between worker and compute nodes. I ask because I want to take advantage of spot pricing, and I am not too concerned if I lose the nodes during my testing phase. However, I would like to understand whether these nodes are configured differently.
Worker nodes and compute nodes contain the same services. The main advantage of compute nodes is that, if you want to use spot-priced instances, you don't have to worry about losing any data, because those nodes are used only for compute purposes. You can also shrink your compute group down to 0 instances after the cluster has been created.