Impala Assignment Locality

Hi,

I receive some warnings about Impala Assignment Locality falling below the set threshold.

My understanding is that this happens when a query needs to use data from two different sources, which may be stored on different datanodes.

If I know two tables are regularly going to be used in a join clause, is there a way I can force Hadoop (HDFS) to store the data blocks for these tables on the same datanode? This does not seem like the best approach to me, though.

I guess my question is that I'm unsure how to improve assignment locality, as I have impalad running on each datanode.