Impala Assignment Locality


I am receiving warnings that Impala Assignment Locality has fallen below the configured threshold.

My understanding is that this happens when a query needs data from two different sources, which may be stored on different datanodes.

If I know two tables are regularly going to be used in a join clause, is there a way I can force HDFS to store the data blocks for these tables on the same datanode? This does not seem like the best approach to me, though.

I guess my question is that I'm unsure how to improve assignment locality, given that I already have impalad running on each datanode.