HDP performance

I'm trying to go through the tutorials to load the trucks and geolocation data, and then use Hive to create tables and run queries on them. When I first tried this with the Docker container-based approach, I could get things to run, but the queries against the derived tables would take a very long time and usually time out or fail.

So then -- thinking that my development laptop was the bottleneck -- I spun up a CloudFormation instance of Cloudbreak and tried to do the same steps. Unfortunately, even the Cloudbreak instance running in AWS had unacceptable query performance. I read about tuning, but I shouldn't have to do any tuning when using the tutorial.

I feel like I'm missing something -- any advice on what to do to make queries run better? I'm back to running on my local dev environment with the Docker instance.
