03-13-2018 07:36 AM
i have a query which does several aggregations on different columns in a 24 rollup job, but when it does stddev_samp on decimal columns which contain both nulls and values it is giving out NaN values. Is there anyway we could ignore the NULL's and do a std deviation on only the numbers?
column1 has 24 values(from 24 different partitions, rollup). Of the 24 values if 20 has decimal values and 4 have NULL's, is there a way to consider only those 20 values and so a standard deviation on them instead of 24?
i mean if i need to calculate mean for this 20 values, it would be (sum of 20 values)/20
how could i achieve this in hive?
Thanks in advance.