Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

NOT IN operator in hive with a subquery

avatar
Expert Contributor

I have a subquery like this:

select SF.customer_email , SF.created_at
FROM 
Table1 SF
WHERE
YEAR(TO_DATE(SF.created_at)) = '2016'
AND 
MONTH(TO_DATE(SF.created_at)) = '6'
AND 
SF.customer_email 
NOT IN (
select SFO.customer_email FROM Table1 SFO 
WHERE
TO_DATE(SFO.created_at) < '2016-05-31'
)

I have checked manually and I should get results for the query but it returns empty resultset.

Note: I am using same table in the subquery as well. Just a different condition on date column.

1 ACCEPTED SOLUTION

avatar

Hi @Simran Kaur. This query 'should' work. 😉 Quick things to double-check...

1. These 2 queries return counts > 0

select count(*) 
FROM Table1 SF 
WHERE YEAR(TO_DATE(SF.created_at)) = '2016'
AND   MONTH(TO_DATE(SF.created_at)) = '6'
select count(*) FROM Table1 SFO 
WHERE TO_DATE(SFO.created_at) < '2016-05-31'

2. based on the counts returned - is it possible that there are no customers from June 2016 that did not receive an email prior to 2016 June? (i.e. the correct result is in fact zero)?

View solution in original post

1 REPLY 1

avatar

Hi @Simran Kaur. This query 'should' work. 😉 Quick things to double-check...

1. These 2 queries return counts > 0

select count(*) 
FROM Table1 SF 
WHERE YEAR(TO_DATE(SF.created_at)) = '2016'
AND   MONTH(TO_DATE(SF.created_at)) = '6'
select count(*) FROM Table1 SFO 
WHERE TO_DATE(SFO.created_at) < '2016-05-31'

2. based on the counts returned - is it possible that there are no customers from June 2016 that did not receive an email prior to 2016 June? (i.e. the correct result is in fact zero)?