Created on 09-02-2016 08:18 AM - edited 09-16-2022 03:37 AM
I have an AVRO schema which works and querries fine on HIVE, but when we query the same on Impala it thorws an error saying
"Your query has the following error(s):
Could not connect to <HOST NAME>:21050"
Which is weird and i dont see any log information on Hue as i am able to browse other tables that are defined on TEXT format within same database.
we are on impala-2.1.3+cdh5.3.3+0
One thing which looks wired on my schema when i decsribe it on Impala is it has the first row as NULL like below and i am not sure why!
My AVRO schema file is defined like below:
Its just sample and all the fields are defined as STRING
PS: Not sure why my pics have these wired color.
Created 09-12-2016 05:20 PM
Thanks for following up!
I'm pretty sure your table shoud work on more recent versions of Impala since we've fixed several Avro issues related to how schemas are defined.
As a workaround, you could try the following things:
1. In your .avsc file make all fields nullable by specifying the types a a union of null and the type like this:
type:["null", "int"]
2. Also specify corresponding matching column definitions in your CREATE TABLE, i.e.
CREATE TABLE MI_FULL (col1 INT, col2 STRING, )
ROW FORMAT SERDE
(the rest is exactly the same as before)
Let me know if you have questions and whether those workarounds helped!
Created 10-07-2016 06:12 PM
I agree completely. The workaround is not great. Dealing with Avro schemas in a more sensible fashion is definitely on our radar, but I cannot promise an ETA yet.