Created on 04-23-2024 12:22 PM - edited 04-23-2024 12:34 PM
Hi all,
I'm looking to create a query to check and validate the math on certain fields from different tables. I've found examples where the SUM function was used- but I struggled putting it together, and am just trying to find the best way to do these calcuations.
Sample data:
Customer table:
Customer ID | Money Spent |
0001 | 500 |
0002 | 250 |
0003 | 100 |
Sales table:
Order ID | Amount |
5005 | 500 |
5006 | 250 |
5007 | 50 |
Check sales math work in progress query(the query doesn't seem to work, returns a "delta" column with a {null} result:
SELECT
CAST(SUM(Customer.Money Spent - Sales.Amount) AS INT) AS DELTA
The desired output that I'd like:
Check table
Rule Name | Delta |
Sales Check | 0 |
Sales Check | 0 |
Sales Check | 50 |
Thanks in advance,
Created 05-09-2024 02:50 PM
Hive can't understand (20) as negative value. You should explicitly mention -20 in column value.
ie.,
+----------------+---------------+---------------+---------------+---------------+
| test.currency | test.spend_a | test.spend_b | test.spend_c | test.spend_d |
+----------------+---------------+---------------+---------------+---------------+
| GBP | 50 | 25 | -20 | 5 |
| JPY | 50 | 25 | -20 | 5 |
| AUD | 50 | 25 | -20 | 5 |
+----------------+---------------+---------------+---------------+---------------+
select Currency, (spend_a + spend_b + spend_c + spend_d) as total_spend from test;
+-----------+--------------+
| currency | total_spend |
+-----------+--------------+
| GBP | 60.0 |
| JPY | 60.0 |
| AUD | 60.0 |
+-----------+--------------+
Created 04-25-2024 07:51 AM
If you want to join the tables, you should have primary-foreign key relation.
ie., sales table should have customer_id column or customer table should have order_id column to map rows between 2 tables. Otherwise this is not achievable.
Created 04-25-2024 10:24 AM
Understood- so okay, added Customer ID to Sales table;
Customer table:
Customer ID | Money Spent |
0001 | 500 |
0002 | 250 |
0003 | 100 |
Sales table:
Customer ID | Order ID | Amount |
0001 | 5005 | 500 |
0002 | 5006 | 250 |
0003 | 5007 | 50 |
Check sales math work in progress query(the query doesn't seem to work, returns a "delta" column with a {null} result:
SELECT
CAST(SUM(Customer.Money Spent - Sales.Amount) AS INT) AS DELTA
The desired output that I'd like:
Check table
Rule Name | Delta |
Sales Check | 0 |
Sales Check | 0 |
Sales Check | 50 |
Created on 04-26-2024 08:41 AM - edited 04-26-2024 08:46 AM
I dint see the full query in your description. Are you aggregating (SUM) delta column based on customer_id ?
I did the following simple repro without aggregates & able to get the results as expected.
create external table Customer(customer_id string, money_spent bigint) stored as textfile;
insert into customer values("0001", 500), ("0002", 250), ("0003", 100);
create external table sales(customer_id string, order_id string, amount bigint) stored as textfile;
insert into sales values("0001", "5005", 500), ("0002", "5006", 250), ("0003", "5007", 50);
select "Sales Check" as rule, (a.money_spent - b.amount) as delta from customer a join sales b on a.customer_id = b.customer_id;
+--------------+--------+
| rule | delta |
+--------------+--------+
| Sales Check | 0 |
| Sales Check | 0 |
| Sales Check | 50 |
+--------------+--------+
3 rows selected (15.186 seconds)
Let me know if this helps.
Created on 05-09-2024 12:02 PM - edited 05-09-2024 01:57 PM
Thanks @nramanaiah. I realized that I gave a bad example before. Apologies.
Table CustCurr:
Currency | Spend_A | Spend_B | Spend_C | Spend_D |
GBP | 50 | 25 | (20) | 5 |
JPY | 50 | 25 | (20) | 5 |
AUD | 50 | 25 | (20) | 5 |
The Output that I'd like to get(as a check- can't create temp\tables as i only have read access) is:
Currency | total_spend_check | A_and_C | B_C_D |
GBP | 80 | 30 | 10 |
JPY | 80 | 30 | 10 |
AUD | 80 | 30 | 10 |
The query that I'm trying is:
SELECT Currency, SUM(Spend_A + Spend_B, + Spend_C + Spend_D) AS total_spend_check
FROM
CustCurr
WHERE
DateofAction = '2024-05-01'
GROUP BY Currency
Here's the incorrect result that I get with above query:
currency | total_spend_check |
GBP | null |
JPY | null |
AUD | null |
I've tried coalesce, cast(field as int) with no avail.
Created 05-09-2024 02:50 PM
Hive can't understand (20) as negative value. You should explicitly mention -20 in column value.
ie.,
+----------------+---------------+---------------+---------------+---------------+
| test.currency | test.spend_a | test.spend_b | test.spend_c | test.spend_d |
+----------------+---------------+---------------+---------------+---------------+
| GBP | 50 | 25 | -20 | 5 |
| JPY | 50 | 25 | -20 | 5 |
| AUD | 50 | 25 | -20 | 5 |
+----------------+---------------+---------------+---------------+---------------+
select Currency, (spend_a + spend_b + spend_c + spend_d) as total_spend from test;
+-----------+--------------+
| currency | total_spend |
+-----------+--------------+
| GBP | 60.0 |
| JPY | 60.0 |
| AUD | 60.0 |
+-----------+--------------+
Created on 05-10-2024 10:39 AM - edited 05-10-2024 11:01 AM
Thanks! because i have null values in my data set as well, i used colaesce, and it worked! Your query was the basis though, so thanks again!
Query by @nramanaiah that worked for me, as I have null records in the dataset:
select Currency, (coalesce(spend_a,0)) + (colaesce(spend_b,0)) + coalesce(spend_c,0)) + coalesce(spend_d,0)) as total_spend from test