We are on HDP 2.6.5 but would like to use a more recent version of Phoenix, without upgrading it cluster-wide.
HBase coprocessors can be dynamically deployed against specific tables (for instance, by picking up the coprocessor jar from HDFS). We are wondering whether this would be a route to using a newer version of Phoenix against a set of tables. We are unsure whether there would be unwanted side effects.
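For readers unfamiliar with the mechanism being asked about, here is a minimal sketch of how a table-level coprocessor is usually declared: a table attribute of the form `jar path|class name|priority|arguments`. The jar path, class name, and table name below are hypothetical placeholders, and the helper functions are mine, not part of HBase. This only illustrates the general HBase mechanism; it says nothing about whether Phoenix's coprocessors can safely be loaded this way.

```python
# Sketch: build the HBase table-attribute string for a dynamically loaded
# coprocessor, plus the matching `alter` statement for the HBase shell.
# The jar path, class name and table name are hypothetical placeholders.

USER_PRIORITY = 1073741823  # HBase's default priority for user coprocessors


def coprocessor_spec(jar_path, class_name, priority=USER_PRIORITY, args=None):
    """Return 'jar|class|priority|k1=v1,k2=v2' as expected in the table attribute."""
    arg_text = ",".join(f"{key}={value}" for key, value in (args or {}).items())
    return f"{jar_path}|{class_name}|{priority}|{arg_text}"


def alter_statement(table, spec):
    """Return an HBase shell statement that attaches the coprocessor to one table."""
    return f"alter '{table}', METHOD => 'table_att', 'coprocessor' => '{spec}'"


spec = coprocessor_spec(
    "hdfs:///apps/example/coprocessor.jar",  # hypothetical jar location
    "com.example.MyRegionObserver",          # hypothetical coprocessor class
)
statement = alter_statement("MY_TABLE", spec)
print(statement)
```

The statement would be pasted into `hbase shell`; only the named table is affected, which is what makes the per-table idea look tempting in the first place.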
I'd be really interested to know if anyone has attempted this with success or otherwise.
For the benefit of anyone else interested, this is the thread from Josh Elser on the Phoenix User mailing list:
There would be significant "unwanted side-effects". You would be taking on a very large burden trying to come up with a corresponding client version of Phoenix which would still work against the newer coprocessors that you are trying to deploy. Phoenix doesn't provide any guarantee of compatibility for more than a few versions between client and server.
Would suggest that you move to HDP 3.1.0 if you want a newer version of Phoenix.
Hey Josh, thanks for your thoughts.
Based on your advice we will almost certainly not pursue this direction. But just to clarify, in terms of the client version are you referring to the Query server, JDBC clients or both?
I imagine from the JDBC perspective that a client would only be accessing tables with the same Phoenix version. But it may be that my take has a lot of erroneous assumptions in it, as I haven't looked at the internals of the JDBC driver code.
I was referring to the JDBC (thick) client and the coprocessors inside HBase. The thin JDBC client does not talk to HBase directly, only to PQS.
PQS and the thin-client would actually be the exception to what I said in my last message. You could (with a high degree of confidence) deploy a new version of Phoenix using an old thin JDBC driver. However, this isn't really any different than just upgrading Phoenix wholesale.
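To make the thick/thin distinction concrete, here is a small sketch of the two JDBC URL shapes. The thick driver is pointed at the ZooKeeper quorum and talks to HBase itself, while the thin driver is pointed at the Phoenix Query Server over HTTP. The host names are hypothetical, and the znode and port defaults are assumptions to adjust for your cluster (HDP commonly uses `/hbase-unsecure` or `/hbase-secure` as the znode).

```python
# Sketch: the two kinds of Phoenix JDBC URL. Host names are hypothetical;
# ports and znode are assumed defaults that may differ on your cluster.


def thick_url(zk_quorum, zk_port=2181, znode="/hbase"):
    """Thick driver: connects via ZooKeeper and sends RPCs to RegionServers."""
    return f"jdbc:phoenix:{zk_quorum}:{zk_port}:{znode}"


def thin_url(pqs_host, pqs_port=8765, serialization="PROTOBUF"):
    """Thin driver: speaks HTTP to the Phoenix Query Server only."""
    return (
        f"jdbc:phoenix:thin:url=http://{pqs_host}:{pqs_port};"
        f"serialization={serialization}"
    )


print(thick_url("zk1.example.com,zk2.example.com", znode="/hbase-unsecure"))
print(thin_url("pqs.example.com"))
```

With the thin URL, the only component that has to match the server-side coprocessors is PQS, since PQS embeds the thick client.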
Yup, in short, you're glossing over quite a bit. One example: the (thick) JDBC client must construct and send RPC messages to the appropriate RegionServers to execute certain operations. The deployed coprocessors in HBase must not only know how to parse those RPC messages, but also interpret them correctly (e.g. an older CP might be able to parse a newer client's message, but could miss an important field that was added to that message).
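The failure mode Josh describes can be shown with a toy model. This is not Phoenix's actual wire format: the field numbers, field names, and the dict-based "message" are all made up for illustration. The point is that a parser which skips unknown fields will accept a newer message without any error while silently dropping the part it does not understand.

```python
# Toy model of an older server-side parser receiving a newer client's message.
# Field numbers and names are hypothetical; this is not Phoenix's real protocol.

# Fields the older coprocessor version knows about.
OLD_SERVER_FIELDS = {1: "table", 2: "start_row"}


def old_server_parse(wire_message):
    """Keep the fields this version knows; silently skip everything else."""
    known = {}
    for field_number, value in wire_message.items():
        name = OLD_SERVER_FIELDS.get(field_number)
        if name is not None:
            known[name] = value
        # Unknown field numbers are ignored, so parsing never fails.
    return known


# A newer client adds field 3 (say, a stop row that bounds the operation).
new_client_message = {1: "ORDERS", 2: "row-0100", 3: "row-0200"}

request = old_server_parse(new_client_message)
print(request)
```

Here the older parser succeeds, but the bound carried in field 3 is gone, so the server would act on a different request from the one the client intended, with no error raised on either side.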