Can Impala be used to run all queries on a website?

New Contributor

I've moved a web application from MySQL to Impala. To avoid synchronization issues, I want to use Impala exclusively, even for very simple lookup queries. For the most part this has been working out great: a query that took 20 minutes in MySQL now runs in 2 seconds on Impala.

 

However, each page load can fire around 10-15 queries. This adds up, as simple Impala lookups take just under a second and other simple queries take 1-2 seconds. Adding that all up, each page takes around 20-30 seconds to load.
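To illustrate why the per-query latency adds up, here is a minimal Python sketch that compares issuing a page's queries one after another with issuing them concurrently. It rests on several assumptions: `run_query` is a hypothetical stand-in that only sleeps (it does not talk to Impala), the queries are independent of each other, and the cluster can actually serve them in parallel. It is not the PHP code used on the site, just a model of the arithmetic.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql, latency=0.05):
    """Hypothetical stand-in for an Impala call; sleeps to simulate latency."""
    time.sleep(latency)
    return f"result of: {sql}"


def load_page_sequential(queries, latency=0.05):
    """Run queries one by one; total time is roughly the sum of the latencies."""
    return [run_query(q, latency) for q in queries]


def load_page_concurrent(queries, latency=0.05, workers=15):
    """Run queries in parallel threads; total time approaches the slowest query."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so results line up with the queries.
        return list(pool.map(lambda q: run_query(q, latency), queries))


def timed(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    queries = [f"SELECT ... /* lookup {i} */" for i in range(12)]
    _, seq = timed(load_page_sequential, queries)
    _, par = timed(load_page_concurrent, queries)
    print(f"sequential: {seq:.2f}s, concurrent: {par:.2f}s")
```

With real queries, the concurrent figure would only improve if the client library and the cluster handle parallel requests well, so treat this as a way to reason about the page-load budget rather than a measured result.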

 

I'm running 4 nodes with 32 GB of memory each. Memory utilization stays under 100 MB during a page load. Is there any way to speed up Impala to make these queries even faster? Or is using Impala to run an entire website not a realistic application for it?

 

Thanks!

4 REPLIES
Re: Can Impala be used to run all queries on a website?

Contributor

What are the SLAs you need? Can you include the runtime profile for the query you're running, so we can see if there are some easy things you can try?

Re: Can Impala be used to run all queries on a website?

Hi,

 

Could you please share how you run Impala queries from the website, i.e., have you used the Impala ODBC connector from PHP or jQuery?

Thanks in advance!

 

Re: Can Impala be used to run all queries on a website?

New Contributor

Yogesh:

 

I used this PHP library: https://github.com/rmcfrazier/php_impala_phar

 

We ended up moving to a different technology than Impala, as we could not figure out a way to get query times below 1-2 seconds, even for simple queries.

Re: Can Impala be used to run all queries on a website?

Thanks for your reply. We are following the same approach.

 

Out of interest, which other technology stack are you evaluating? Based on reports from the Hadoop World conference, Impala seemed to be faster than any similar query tool on Hadoop, so I was wondering which other toolset could give better results.

 

Thanks!
