Support Questions
Find answers, ask questions, and share your expertise

Hive, Lateral view and nulls return no rows

Hive, Lateral view and nulls return no rows

Expert Contributor

Since the upgrade from hdp 3.1.0 to 3.1.4, I have some issue in Hive I do not understand. Note that I am only using ORC transactional tables.

For instance this (simplified) query:





with cte as (
    , c.json
  from event e
  join contact as c on
  , id
  , lv.customfield
from cte
lateral view outer
  json_tuple(cte.json, 'customfield') cv AS `customfield`





It worked perfectly before the upgrade.


Now, even if the CTE returns a certain number of rows, using the lateral view will just drop rows from

the resultset, without any error, whereas there is no extra where clause outside the CTE (in my real example, the query returns 66 rows without the lateral view, but only 19 with).


Another extremely surprising thing is that inside the CTE there is an if statement, for instance: `if(contact.is_deleted is null, 'true', 'false')`.
If I replace the `is null` with `is not distinct from null`, which should be perfectly valid, no rows are returned by the CTE.


I tried quite a few variations:

  • extract one row from the contact table to create a second table with only one contact (`create table contact2 as select * from contact where id=42`). I get the exact same behaviour, so I ruled out corrupted data
  • if I replace the event table by a static CTE (`select stack(1, ...)`) I have the result I expect
  • if I remove the lateral view, I have the number of rows I expect (as long as I do not use is distinct from)
  • if instead of a CTE I create and use a temporary table, the outcome does not change.

I am completely at loss and I have no idea why this happens and how I can trust hive. 

I cannot replicate the error by generating manual data so I cannot give a (not) working example.