This seems like a strange problem. I am using HiveQL within Ambari, which is very helpful for querying my dataset. It is time series data, but the date has already been parsed into three integers: year, month and day. The queries work fine, aggregating and ordering the data nicely.
SELECT nuts1 AS Region,
       count(DISTINCT geodata.anonid) AS Households,
       kwh_year AS Year,
       kwh_month AS Month,
       avg(kwh_day) AS daily_KWH
FROM kwh_daily
INNER JOIN geodata ON kwh_daily.anonid = geodata.anonid
GROUP BY nuts1, kwh_year, kwh_month
ORDER BY nuts1, kwh_year, kwh_month;
However, when I use the advanced visualisation tools, month 12 is missing from all of my visualisations! Although month 12 shows up in the SELECT statement results, it never appears in the graphs. I have even tried isolating month 12 in a separate SELECT and using UNION to add it to the rest. Again, the query works, but the visualisation doesn't!
Any ideas much appreciated. Thanks, Aidan
Hi @Aidan Condron, just a wild guess, but I remember that some Java date functions keep months in the 0-11 range. If that's the case with one of your processing steps, you are missing January, not December! Though "month" in Hive returns 1-12.
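To make the two conventions concrete, here is a small HiveQL sketch. The first statement calls Hive's built-in month() on a made-up date literal; the second reuses the kwh_daily table and kwh_month column from the query above to list which month values are actually stored. It only checks the data and assumes nothing about how the visualisation handles the column.

```sql
-- Hive's month() is 1-based: December comes back as 12, not 11.
-- The date literal is only an illustrative value.
SELECT month('2014-12-25');

-- List the distinct month values stored in the table, with row counts.
-- If the data were 0-based, this would show 0 to 11 instead of 1 to 12.
SELECT kwh_month, count(*) AS n_rows
FROM kwh_daily
GROUP BY kwh_month
ORDER BY kwh_month;
```

If the second query returns 1 to 12, the off-by-one guess is ruled out on the data side and the problem is more likely in the charting step.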
Thanks Predrag, but I already tried this one! It's definitely 1-11, not 0-11. Plus 12 shows up in the data table, just not in the visualisation. But thanks for the suggestion!
I can select month 12 on its own and make a chart for that. In fact, that was my dirty fix: I doctored a BMP of the 1-11 graph in Paint to add the month 12 graph! But it still only shows up on its own in Hive...