DISTRIBUTE BY and CLUSTER BY not supported in Spark SQL 1.6 (CDH 5.7.0)

I am using Spark 1.6 and trying to optimize my joins by following these blogs,%20DataFrames%20&%20Datasets... and using DISTRIBUTE BY and CLUSTER BY, but unfortunately they are not supported.

My Spark SQL query is:


      """select b.*, count(*) AS CNT  from tableb b
         GROUP BY b.Key,b.KeyVal
         CLUSTER BY b.Key,b.KeyVal"""

The error is:

Exception in thread "main" java.lang.RuntimeException: [5.7] failure: ``union'' expected but identifier CLUSTER found

      CLUSTER BY b.Key