Must the computation and serving layers run on the same instance?
If not, how do I tell the serving layer which IP address to take the data from?
Furthermore, could you please elaborate on how to create multiple serving layers to improve scalability, and on how that improves performance if they do in fact all have to run on the same instance?