Hi all,
I have a query that I've found here thanks to @mszurap which I've tweaked and am trying to get working. I have a field called "RunNumber" that identifies if\when data is appended- it can have a value of 1 or 2 usually, but is not limited to 2. I can't use Group By, as you'll see in my query- there are hundreds and thousands of CASE statements. I've tried and cannot seem to get it working in Hive. I get a "..Expression not in GROUP BY key '0'..."
Query:
SELECT MaxRun = MAX(RunNumber)
,FirstName = SUM(CASE WHEN FirstName IS NULL OR FirstName = 'NULL' OR FirstName = ' ' THEN 1 ELSE 0 END) AS FirstName
,LastName = SUM(CASE WHEN LastName IS NULL OR LastName = 'NULL' OR LastName = ' ' THEN 1 ELSE 0 END) AS LastName
,State = SUM(CASE WHEN State IS NULL OR State = 'NULL' OR State = ' ' THEN 1 ELSE 0 END) AS State
FROM Table.Customers
WHERE DateOfRecord = '2023-05-01'
The output that I'd like to get:
MaxRun | FirstName | LastName | State |
1 | 500 | 750 | 1000 |