
Impala Assignment Locality




I am receiving warnings about Impala Assignment Locality falling below the set threshold.

My understanding is that this happens when a query needs data from two different sources, which may be stored on different DataNodes.

If I know two tables are regularly going to be used in a join clause, is there a way I can force HDFS to store the data blocks for these tables on the same DataNode? That does not seem like the best approach to me, though.

I guess my question is that I'm unsure how to improve assignment locality, given that I already have impalad running on each DataNode.