Explorer
Posts: 15
Registered: ‎03-03-2017

Hive and Impala results are differing

Dear All,

 

Below is my problem statement:

I am trying to create an external table through Hive. I run into problems when I query it:

1) When I run count(*) in Hive and in Impala, the results differ.

2) When I query a column in Hive with an IS NULL condition, I get a result set of 6 rows.

    In these 6 rows the first 3 columns have values, and from the 4th column onward the values are NULL.

    But when I run the same query in Impala, I do not get any rows.

 

What could be the issue? Please help me. Below are the last lines of the DDL:

 

ROW FORMAT DELIMITED FIELDS TERMINATED BY ' |' LOCATION '/x/f/w/'

 

Thanks and Regards,

Naveen Srikanth D

 

Cloudera Employee
Posts: 34
Registered: ‎08-16-2016

Re: Hive and impala results are differing

There is not enough information for us to provide any guidance on what could be wrong. Do you believe that the Hive results are inaccurate, or the Impala results?

Could you provide the following items for us to look at?

1) Full table definition in Hive.

2) Full or sample data to help reproduce the issue.

3) Query results from both Hive and Impala for the sample data above.

Thanks

Explorer
Posts: 15
Registered: ‎03-03-2017

Re: Hive and impala results are differing


Dear Naveen,

 

Thanks for picking up the issue.

 

Below is the data:

$NToI_$NToJ,LTB,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
$NWVa,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
BENTU,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
$NXTe_B,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
$MoaR,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL
LER,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL

 

Query:

select * from tbl_name where c is null;

Problem: the above query ran against 2 million records. In Hive it returns the 6 records shown above (attached file), while the same query returns no rows in Impala.

 

What I have done now is take this result set alone and create a new external table from it. Against that table the query returns no rows in either Hive or Impala. The actual files are delimited with a space.

 

 

Below is the DDL of the table:

 

CREATE TABLE `working.tbl_test`(
`a` string,
`b` string,
`c` string,
`d` string,
`e` string,
`f` string,
`g` smallint,
`h` smallint,
`i` tinyint,
`j` string,
`k` string,
`l` string,
`m` float,
`n` float,
`o` float,
`p` float,
`q` int,
`r` smallint,
`s` float,
`t` tinyint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/tmp/tbl_test';

 

I will be reachable over +919791081769 and +919008482114. We could also have webex call if it is feasible for you

 

Thanks and Regards,

Naveen Srikanth D

Explorer
Posts: 15
Registered: ‎03-03-2017

Re: Hive and impala results are differing

For the table from my other post in this thread, a query like the one below works fine:

 

select * from working.ddr2_raw_actual_test where dest_icao='NULL'

 

But with the actual file it does not work. I get a result set in Hive, which is correct, but the same query returns nothing in Impala.
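
A possible explanation for why `='NULL'` matches on the small test table while `is null` does not: the sample file contains the four-character text NULL, and column `c` is declared as `string` in the posted DDL. As far as I know, Hive's default NULL marker for text files is `\N`, so the text NULL in a string column is just an ordinary string. The Python sketch below is only a simplified model of that behaviour, not Hive or Impala code. The names `COLUMN_TYPES`, `parse_field` and `parse_row` are made up for illustration, and range checks for `tinyint`/`smallint` are left out.

```python
from typing import List, Optional, Union

Value = Optional[Union[str, int, float]]

# Column types copied from the posted DDL for working.tbl_test (columns a..t).
COLUMN_TYPES: List[str] = (
    ["string"] * 6                        # a..f
    + ["smallint", "smallint", "tinyint"]  # g, h, i
    + ["string"] * 3                      # j, k, l
    + ["float"] * 4                       # m..p
    + ["int", "smallint", "float", "tinyint"]  # q, r, s, t
)

# Assumed default NULL marker for text files (serialization.null.format).
NULL_MARKER = r"\N"


def parse_field(raw: str, col_type: str) -> Value:
    """Model how one text field is read for a given column type."""
    if raw == NULL_MARKER:
        return None              # the only text that means NULL for every type
    if col_type == "string":
        return raw               # the text NULL stays the string 'NULL'
    try:
        return float(raw) if col_type == "float" else int(raw)
    except ValueError:
        return None              # non-numeric text in a numeric column reads as NULL


def parse_row(line: str, delimiter: str = ",") -> List[Value]:
    """Split one line on the delimiter and convert each field."""
    fields = line.split(delimiter)
    return [parse_field(raw, t) for raw, t in zip(fields, COLUMN_TYPES)]


if __name__ == "__main__":
    # Same shape as the BENTU row in the sample data: one value, then NULL text.
    row = parse_row("BENTU," + ",".join(["NULL"] * 19))
    c, g = row[2], row[6]
    print(c is None, c == "NULL", g is None)  # False True True
```

In this model `c is null` finds nothing and `c = 'NULL'` finds the row, which matches what happens on the small table. It does not explain why Hive and Impala disagree on the actual file. For that, the full DDL, including the real field delimiter, and a few raw lines from the actual data would still be needed.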

 

Thanks and Regards,

Naveen Srikanth D
