Impala AVRO schema throws error while Querying

Explorer

I have an Avro schema that works and queries fine in Hive, but when we query the same table in Impala it throws an error saying:

"Your query has the following error(s):

Could not connect to <HOST NAME>:21050"

 

This is weird, and I don't see any log information in Hue. I am able to browse other tables in the same database that are defined in TEXT format.

 

We are on impala-2.1.3+cdh5.3.3+0.

 

One thing that looks weird about my schema when I describe it in Impala is that the first row is NULL, as shown below, and I am not sure why!

 

Impala1.jpeg

 

 

My Avro schema file is defined as shown below.

It's just a sample, and all the fields are defined as STRING.

 

{ "namespace": "HDFS", "name": "sample", "type": "record", "fields": [
  {"name": "recipient_identification_number", "type": "string"},
  {"name": "validation_digit", "type": "string"},
  {"name": "eligibility_status_code", "type": "string"},
  {"name": "recipient_last_name", "type": "string"},
  {"name": "recipient_first_name", "type": "string"},
  {"name": "recipient_middle_name", "type": "string"},
  {"name": "recipient_name_appel", "type": "string"},
  {"name": "recipient_dob", "type": "string"},
  {"name": "recipient_ssn", "type": "string"},
  {"name": "recipient_ssn_prefix", "type": "string"},
  {"name": "recipient_race_code", "type": "string"},
Imapa.jpeg


PS: Not sure why my pictures have these weird colors.

Re: Impala AVRO schema throws error while Querying

Cloudera Employee

I would suggest looking in the log directory to see whether there is any crash information in impalad.INFO or impalad.FATAL. If so, can you please share it?

Re: Impala AVRO schema throws error while Querying

Explorer

I did look; I don't see any entries in the files you listed.

Re: Impala AVRO schema throws error while Querying

Master Collaborator

It looks like your table metadata is in a strange state. How did you create the table exactly? Did you alter the table (e.g. add/remove columns)?

 

Re: Impala AVRO schema throws error while Querying

Explorer

Sorry for replying late.

 

Alex,

 

I didn't alter the table or its columns. This is how I created the table:

 

CREATE TABLE MI_FULL
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES
('avro.schema.url'='hdfs://path/filename.avsc')
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES
('avro.schema.url'='hdfs://path/filename.avsc')
;

 

And then I did an insert into this table. Please let me know if there is something wrong with creating it like this.

Re: Impala AVRO schema throws error while Querying

Master Collaborator

Thanks for following up!

 

I'm pretty sure your table should work on more recent versions of Impala, since we've fixed several Avro issues related to how schemas are defined.

 

As a workaround, you could try the following things:

1. In your .avsc file, make all fields nullable by specifying each type as a union of null and the actual type, like this:

"type": ["null", "int"]

2. Also specify corresponding matching column definitions in your CREATE TABLE, i.e.

 

CREATE TABLE MI_FULL (col1 INT, col2 STRING, ...)

ROW FORMAT SERDE

(the rest is exactly the same as before)
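
If the schema has many fields, both steps can be scripted. The following is only a minimal Python sketch, not anything shipped with Impala or Hive: the helper names make_nullable and column_definitions and the AVRO_TO_SQL mapping are hypothetical, the sample record reuses the first two fields from the schema posted above, and only simple primitive field types are handled.

```python
import json

# Assumed mapping from Avro primitive types to Hive/Impala column types.
# Complex Avro types (records, arrays, maps, enums) are not handled here.
AVRO_TO_SQL = {
    "string": "STRING",
    "int": "INT",
    "long": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}


def make_nullable(schema):
    """Return a copy of a record schema where every field type is a union with null."""
    fields = []
    for field in schema["fields"]:
        ftype = field["type"]
        if isinstance(ftype, list):
            # Already a union: add "null" in front only if it is missing.
            union = ftype if "null" in ftype else ["null"] + ftype
        else:
            union = ["null", ftype]
        fields.append({**field, "type": union})
    return {**schema, "fields": fields}


def column_definitions(schema):
    """Build the column list for CREATE TABLE from the record's fields."""
    cols = []
    for field in schema["fields"]:
        ftype = field["type"]
        branches = ftype if isinstance(ftype, list) else [ftype]
        # Use the first non-null branch as the column type.
        non_null = [t for t in branches if t != "null"]
        cols.append(f'{field["name"]} {AVRO_TO_SQL[non_null[0]]}')
    return ", ".join(cols)


# Sample record using the first two fields of the schema from the question.
sample = {
    "namespace": "HDFS", "name": "sample", "type": "record",
    "fields": [
        {"name": "recipient_identification_number", "type": "string"},
        {"name": "validation_digit", "type": "string"},
    ],
}

nullable = make_nullable(sample)
print(json.dumps(nullable, indent=2))  # contents for the updated .avsc file
print(f"CREATE TABLE MI_FULL ({column_definitions(nullable)})")
```

The printed JSON would replace the contents of the .avsc file, and the printed line is the start of the CREATE TABLE statement; the ROW FORMAT SERDE clause and everything after it stay as in the original statement.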

 

Let me know if you have questions and whether those workarounds helped!

Re: Impala AVRO schema throws error while Querying

Explorer

I am trying out the options you suggested, Alex. I should have my results most likely by today.

Re: Impala AVRO schema throws error while Querying

Explorer

Voila, it works fine when the tables are defined the way you suggested, and I can view the data in Impala.

Thanks much, Alex.

 

 

 

 

Re: Impala AVRO schema throws error while Querying

Master Collaborator

Thanks for following up and confirming that it works!

Re: Impala AVRO schema throws error while Querying

Explorer

But I just want to note that this defeats the purpose of using an Avro schema: for any schema change we will have to update the Avro schema file and also drop the table and recreate it with the new schema for it to work.