Impala AVRO schema throws error while Querying

avatar
Contributor

I have an Avro schema that works and queries fine in Hive, but when we query the same table in Impala it throws this error:

"Your query has the following error(s):

Could not connect to <HOST NAME>:21050"

This is weird, and I don't see any log information in Hue. I am able to browse other tables in the same database that are defined in TEXT format.

We are on impala-2.1.3+cdh5.3.3+0.

One thing that looks weird when I describe the schema in Impala is that the first row is NULL, as shown below, and I am not sure why!

 

Impala1.jpeg

 

 

My Avro schema file is defined as below.

It is just a sample, and all the fields are defined as STRING.

 

{ "namespace": "HDFS", "name": "sample", "type": "record", "fields":[{"name":"recipient_identification_number", "type":"string"}, {"name":"validation_digit", "type":"string"}, {"name":"eligibility_status_code", "type":"string"}, {"name":"recipient_last_name", "type":"string"}, {"name":"recipient_first_name", "type":"string"}, {"name":"recipient_middle_name", "type":"string"}, {"name":"recipient_name_appel", "type":"string"}, {"name":"recipient_dob", "type":"string"}, {"name":"recipient_ssn", "type":"string"}, {"name":"recipient_ssn_prefix", "type":"string"}, {"name":"recipient_race_code", "type":"string"},
Imapa.jpeg


PS: Not sure why my pictures have these weird colors.
10 REPLIES

avatar
Cloudera Employee

I would suggest looking in the log directory to see if there is any crash information in impalad.INFO or impalad.FATAL. If so, can you please share it?
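
For anyone following along, the check suggested above could be sketched in Python roughly as follows. This is only an illustration: the log directory `/var/log/impalad` and the marker strings are assumptions that are not stated anywhere in this thread, and the function name `find_crash_lines` is hypothetical. Adjust all of them to your deployment.

```python
from pathlib import Path

# Assumed marker strings that often accompany a crash; not an exhaustive list.
CRASH_MARKERS = ("FATAL", "SIGSEGV", "Check failed", "hs_err_pid")


def find_crash_lines(log_dir, names=("impalad.INFO", "impalad.FATAL"), max_lines=20):
    """Return {file name: matching lines} for the given log files.

    Files that do not exist are skipped; at most `max_lines` of the most
    recent matching lines are kept per file.
    """
    hits = {}
    for name in names:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # nothing to inspect for this file
        lines = path.read_text(errors="replace").splitlines()
        matched = [ln for ln in lines if any(m in ln for m in CRASH_MARKERS)]
        if matched:
            hits[name] = matched[-max_lines:]
    return hits


if __name__ == "__main__":
    # "/var/log/impalad" is an assumed location; pass your own log directory.
    for name, lines in find_crash_lines("/var/log/impalad").items():
        print(f"== {name} ==")
        for line in lines:
            print(line)
```

An empty result only means that none of the assumed markers appeared in those two files; it does not rule out a crash that was logged elsewhere.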

avatar
Contributor

I did, but I don't see any entries in the files you listed.

avatar

It looks like your table metadata is in a strange state. How did you create the table exactly? Did you alter the table (e.g. add/remove columns)?

 

avatar
Contributor

Sorry for the late reply.

Alex,

I didn't alter the table or its columns. This is how I created the table:

 

CREATE TABLE MI_FULL
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES
('avro.schema.url'='hdfs://path/filename.avsc')
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES
('avro.schema.url'='hdfs://path/filename.avsc')
;

 

I then did an insert into this table. Please let me know if there is something wrong with creating it like this.

avatar

Thanks for following up!

 

I'm pretty sure your table should work on more recent versions of Impala, since we've fixed several Avro issues related to how schemas are defined.

 

As a workaround, you could try the following things:

1. In your .avsc file, make all fields nullable by specifying each type as a union of null and the original type, like this:

"type": ["null", "int"]

2. Also specify matching column definitions in your CREATE TABLE, e.g.

CREATE TABLE MI_FULL (col1 INT, col2 STRING, ...)

ROW FORMAT SERDE

(the rest is exactly the same as before)

 

Let me know if you have questions and whether those workarounds helped!
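
Workaround 1 above can be scripted instead of edited by hand. The following is a minimal Python sketch, not an official tool: the function name `make_fields_nullable` is hypothetical, and the sample record uses only the first two fields of the schema posted in the question. The one design assumption is that a field carrying a non-null default keeps its original type first in the union, because Avro expects a default to match the first branch.

```python
import json


def make_fields_nullable(schema):
    """Return a copy of an Avro record schema whose field types all allow null.

    Fields that are already nullable are left unchanged. The input schema
    is not modified.
    """
    result = dict(schema)
    fields = []
    for field in schema["fields"]:
        field = dict(field)
        ftype = field["type"]
        branches = list(ftype) if isinstance(ftype, list) else [ftype]
        if "null" not in branches:
            if field.get("default") is not None:
                # Keep the original type first so the default stays valid.
                branches.append("null")
            else:
                branches.insert(0, "null")
            field["type"] = branches
        fields.append(field)
    result["fields"] = fields
    return result


if __name__ == "__main__":
    # First two fields of the schema from the question, for illustration only.
    sample = {
        "namespace": "HDFS",
        "name": "sample",
        "type": "record",
        "fields": [
            {"name": "recipient_identification_number", "type": "string"},
            {"name": "validation_digit", "type": "string"},
        ],
    }
    print(json.dumps(make_fields_nullable(sample), indent=2))
```

The rewritten schema would then replace the .avsc file that `avro.schema.url` points to.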

avatar
Contributor

I am trying out the options you suggested, Alex. I should have my results by the end of today.

avatar
Contributor

Voila, it works fine when I define the tables the way you suggested, and I can now view the data in Impala.

Thanks much Alex.

 

 

 

 

avatar

Thanks for following up and confirming that it works!

avatar
Contributor

One note, though: this defeats the purpose of using an Avro schema, because for any schema change we have to update the Avro schema file and also drop and recreate the table with the new schema.
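
One way to reduce that maintenance cost is to generate the DROP and CREATE statements from the .avsc content, so the schema file stays the single source of truth. The following Python sketch works under stated assumptions: the helper names `column_type` and `recreate_statements` are hypothetical, the type mapping covers only a few Avro primitive types, and the SERDE and format clauses are copied from the CREATE TABLE statement posted earlier in this thread.

```python
# Assumed mapping from Avro primitive types to column types.
AVRO_TO_SQL = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def column_type(avro_type):
    """Map an Avro primitive type, or a union of null and one primitive, to a column type."""
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) != 1:
            raise ValueError(f"unsupported union: {avro_type!r}")
        avro_type = non_null[0]
    if not isinstance(avro_type, str) or avro_type not in AVRO_TO_SQL:
        raise ValueError(f"unsupported Avro type: {avro_type!r}")
    return AVRO_TO_SQL[avro_type]


def recreate_statements(table, schema, schema_url):
    """Return [DROP statement, CREATE statement] with explicit column definitions."""
    columns = ",\n".join(
        f"  {field['name']} {column_type(field['type'])}"
        for field in schema["fields"]
    )
    create = "\n".join([
        f"CREATE TABLE {table} (",
        columns,
        ")",
        "ROW FORMAT SERDE",
        "'org.apache.hadoop.hive.serde2.avro.AvroSerDe'",
        "WITH SERDEPROPERTIES",
        f"('avro.schema.url'='{schema_url}')",
        "STORED AS INPUTFORMAT",
        "'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'",
        "OUTPUTFORMAT",
        "'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'",
        "TBLPROPERTIES",
        f"('avro.schema.url'='{schema_url}')",
        ";",
    ])
    return [f"DROP TABLE IF EXISTS {table};", create]


if __name__ == "__main__":
    # Two nullable string fields from the question's schema, for illustration.
    schema = {
        "fields": [
            {"name": "recipient_identification_number", "type": ["null", "string"]},
            {"name": "validation_digit", "type": ["null", "string"]},
        ]
    }
    for statement in recreate_statements("MI_FULL", schema, "hdfs://path/filename.avsc"):
        print(statement)
```

Note that dropping a managed (non-EXTERNAL) table also deletes its data, so the generated statements should be reviewed before they are run.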