Created on 01-31-2021 01:07 PM - edited on 02-02-2021 09:05 PM by subratadas
Recently, the weekly number of COVID-19 cases in the Netherlands has been dropping steadily. Underneath this decline, however, lies a hidden upward trend in cases of the British COVID variant.
In this article, I explain how I made my first visual in Cloudera Data Visualization with just a few clicks.
Step 1: The Data
The data combines several sources that shed light on the percentage of new infections made up of this variant, as well as the total number of cases. As we are here for the tech more than the science, I will only highlight the official report of cases per week and an official article that contains the latest percentage, as well as an earlier percentage. Additional explanation of the numbers is given in the post below this article.
From here I could have chosen from many data sources, including a CSV file, but for reproducibility, I decided to upload it to Hive with a simple query:
-- One SELECT per week; only the first SELECT needs column aliases.
CREATE TABLE covid_cases AS
SELECT 431 AS british_cases, DATE '2020-12-08' AS week_end_date
UNION ALL SELECT 1168, DATE '2020-12-15'
UNION ALL SELECT 2470, DATE '2020-12-22'
UNION ALL SELECT 3369, DATE '2020-12-29'
UNION ALL SELECT 4854, DATE '2021-01-05'
UNION ALL SELECT 5434, DATE '2021-01-12'
UNION ALL SELECT 7755, DATE '2021-01-19'
UNION ALL SELECT 11760, DATE '2021-01-26'
UNION ALL SELECT 19085, DATE '2021-02-02';
People have different opinions on the best way to mock up data, but when I only need a few rows, I like to build this kind of query with the Excel CONCAT function.
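For readers who prefer a script over a spreadsheet formula, the same kind of query can be generated programmatically. The following is a minimal sketch in Python, not the method used for this article: the function name `build_ctas` is hypothetical, and the rows are copied from the query above. It emits `UNION ALL`, which skips the deduplication pass that a plain `UNION` performs; with distinct rows the resulting table is the same.

```python
from datetime import date

# Rows copied from the query above: (british_cases, week_end_date).
ROWS = [
    (431, date(2020, 12, 8)),
    (1168, date(2020, 12, 15)),
    (2470, date(2020, 12, 22)),
    (3369, date(2020, 12, 29)),
    (4854, date(2021, 1, 5)),
    (5434, date(2021, 1, 12)),
    (7755, date(2021, 1, 19)),
    (11760, date(2021, 1, 26)),
    (19085, date(2021, 2, 2)),
]


def build_ctas(table, rows):
    """Return a CREATE TABLE ... AS statement with one SELECT per row.

    Hypothetical helper for illustration; values are formatted directly
    into the SQL text, so only use it with trusted, hand-written rows.
    """
    if not rows:
        raise ValueError("at least one row is required")
    selects = []
    for i, (cases, week_end) in enumerate(rows):
        day = week_end.isoformat()
        if i == 0:
            # Only the first SELECT needs column aliases.
            selects.append(
                f"SELECT {cases} AS british_cases, DATE '{day}' AS week_end_date"
            )
        else:
            selects.append(f"UNION ALL SELECT {cases}, DATE '{day}'")
    return f"CREATE TABLE {table} AS\n" + "\n".join(selects) + ";"


if __name__ == "__main__":
    print(build_ctas("covid_cases", ROWS))
```

Adding a new week then only requires appending one tuple to `ROWS` and re-running the script.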
Step 2: The Connection
As I used Cloudera Data Visualization within a Data Warehouse, the connection to that Data Warehouse is available out of the box. As such, I only needed to select the database and table.
Step 3: The Visualization
In order to minimize the effort, I decided to stick to the default settings where possible. This has the additional benefit that it is very easy to reproduce what I have done.
Create a Dashboard
Add a visualization for the table
Select type: Bars
Y-axis: british_cases (it automatically understands that we want the sum)
X-axis: week_end_date (it already recognizes that it is a date)
Change the type of week_end_date on the X-axis to timestamp
Labels: british_cases (it automatically understands that we want the sum)
Give your X-axis and Y-axis a nice alias, and add a title and subtitle to the chart
Now your chart should look just like the picture at the top of this article.
Hopefully, this helps everyone gain more insight into how COVID is developing in the Netherlands and, of course, into how to visualize data with just a few clicks.
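As a sanity check on the chart, the bar heights can be reproduced outside the tool. The sketch below assumes, as noted in the steps above, that the Bars visual sums british_cases per week_end_date. The function name `sum_per_week` is hypothetical, and the rows are copied from the query in Step 1.

```python
from collections import defaultdict
from datetime import date

# Rows copied from the query in Step 1: (british_cases, week_end_date).
ROWS = [
    (431, date(2020, 12, 8)),
    (1168, date(2020, 12, 15)),
    (2470, date(2020, 12, 22)),
    (3369, date(2020, 12, 29)),
    (4854, date(2021, 1, 5)),
    (5434, date(2021, 1, 12)),
    (7755, date(2021, 1, 19)),
    (11760, date(2021, 1, 26)),
    (19085, date(2021, 2, 2)),
]


def sum_per_week(rows):
    """Sum british_cases per week_end_date and return pairs in date order."""
    totals = defaultdict(int)
    for cases, week_end in rows:
        totals[week_end] += cases
    return sorted(totals.items())


if __name__ == "__main__":
    # Each printed line corresponds to one bar: X value, then bar height.
    for week_end, total in sum_per_week(ROWS):
        print(week_end.isoformat(), total)
```

Because the table holds one row per week, each sum equals the single value for that week, so the bars should rise from 431 on the left to 19085 on the right.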
This additional data source indicates the cases for the week ending on 2nd Feb: