Hadoop Setup :-8 node Cluster with CDH 5.9Hive Version :- Hive 1.1.0-cdh5.9.0Compiled by jenkins on Fri Oct 21 00:54:46 PDT 2016From source with checksum 9c5d0bee25fab27d28098c3080f8aedc
Impala Version :-Impala v2.7.0-cdh5.9.0 (4b4cf19) built on Fri Oct 21 01:07:22 PDT 2016
Issue:-I ran same query on Hive and Impala (through Hue, Screenshot attached)
SELECT (AVG(cost_of_liquidity_provision*risk_of_liquidity_provision)- AVG(cost_of_liquidity_provision)*AVG(risk_of_liquidity_provision))/ (1.00000000*STDDEV_POP(cost_of_liquidity_provision)*STDDEV_POP(risk_of_liquidity_provision)) AS corr_coeff
The table is in parquet format (non partitioned)
I got different output in two different runsHive :- 0.8465 (correct as verified by external application e.g. R)Impala :- 0.0636
Similar thing happened with another query too.
I have created a StackoverFlow Question and Cloudera Impala Issue [IMPALA-4841] for the same.