Explorer
Posts: 9
Registered: ‎04-06-2016

Exercise 2 : Impala doesn't support tables of this type,

I'm running Quickstart CDH 5.5.

Everything went well in the second exercise until I tried to query the created tables in Impala: they returned a count of 0. Looking into what went wrong, I found this error message: Failed to load metadata for table intermediate_access_logs.... REASON: SerDe library 'org.apache.hive.contrib.serde2.RegexSerDe' is not supported. I can query the tables in Hive and get the expected results. So how can this be done a different way in Impala? And can I expect not to run into such problems during the exam?

Thank you.

Cloudera Employee
Posts: 435
Registered: ‎07-12-2013

Re: Exercise 2 : Impala doesn't support tables of this type,

intermediate_access_logs is defined using a Hive serializer/deserializer
(SerDe) that Impala does not support. You should be able to query
tokenized_access_logs in Impala. The point of Exercise 2 is to demonstrate
the different strengths of Hive and Impala: Impala queries execute
*way* faster than Hive queries, but Hive has a bigger community and more
flexible support for user-defined functions, data formats, etc. This makes
Hive a good tool for cleaning or transforming your data in batches to
prepare it for analysis, while Impala is a much better tool for interactive
querying and analysis of prepared data.
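
A minimal sketch of the Impala side, assuming the Hive insert into tokenized_access_logs has already completed. INVALIDATE METADATA is a standard Impala statement that reloads table definitions created or changed outside Impala; the count query is only an illustrative check, not part of the exercise text.

```sql
-- Run in impala-shell or the Impala query editor, not in Hive.
-- Reload Impala's metadata for a table that was created or loaded through Hive.
INVALIDATE METADATA tokenized_access_logs;

-- Illustrative check: this table does not depend on the unsupported SerDe,
-- so Impala should be able to read it.
SELECT COUNT(*) FROM tokenized_access_logs;
```
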
Explorer
Posts: 9
Registered: ‎04-06-2016

Re: Exercise 2 : Impala doesn't support tables of this type,

Thank you. I understand now: we use Hive to prepare the data for Impala.

I found that the insert overwrite failed at the very last step, just after the map phase reached 100%.

I created an intermediate table:

drop table if exists tokenized_access_logs_1;

create table tokenized_access_logs_1 as select * from intermediate_access_logs;

and then issued:

insert overwrite table tokenized_access_logs select * from tokenized_access_logs_1;

That worked, and I was able to finish Exercise 2. I wonder what caused the original failure.

Thank you.