Created 10-11-2017 12:54 PM
Below query when executed in Hive CLI produces result but not through spark using Hivecontext.
val s = """select cast(date as String) as date_key,
field1,
field2,
field3,
field4,
COUNT(CASE WHEN m.field5 in ('Core','A') THEN a.strRXNUM END) AS result1,
COUNT(CASE WHEN m.field6 in ('Advanced','D') THEN a.strRXNUM END) AS resul2,
COUNT(CASE WHEN m.field7 in ('Custom Core','g') THEN a.strRXNUM END) AS result3,
COUNT(CASE WHEN m.field8 in ('Unassigned') THEN a.strRXNUM END) AS result4,
COUNT(a.strRXNUM) AS result4
from
$targetdb.table1 a left join $sourcedb.table2 m
ON
a.strTHERAPY_TYPE = m.strTherapyType
inner join
$sourcedb.table3 b
on
cast(a.dtshipment_date as string)=cast(b.thedate as string)
where a.date between '2015-01-01' and date_sub(current_date(),1)
group by
cast(date as String),
field1,
field2,
field3,
field4 """
val df=Hivectx.sql(s)
Created 10-11-2017 03:40 PM
What does Hivectx.sql(s).show() give?
Does Hivectx.sql('show tables').show() gives expected output?
How are you giving values to $targetdb,$sourcedb?
Created 10-14-2017 12:37 PM
Hive ctx..sql(s).show was not giving expected output i.e a blank table.
Passed values to source and target db through arguments with spark submit.
I was able to find the issue-Worked after I Converted dtshipment date and thedate both to timestamp for camparison.
While running the query in hive cli it was able to compare dtshipment_date(date type) with thedate(timestamp) and give the records on join but through spark sql it was not giving same output until I converted both to timestamp.