I would like to run a few Impala instances on the same cluster.
I was able to create 2 Hive instances (with 2 different hive metastore databases) on the same cluster and it works fine.
Now I would like to create 2 sets of Impala processes, one for each Impala instance (each one will have its own Hive instance).
Does Impala support it? What are the implications? What is the best way to implement it?
You can configure Impala to run like this. In a development environment you can run a minicluster which has 3 impalad instances on one machine. Looking at this setup may help you see what to do.
You'll have to configure each of the two Impala systems separately.
This table of ports used by Impala will give you some idea of what to configure. You'll also want to configure the memory used by each impalad; otherwise each of the two instances on a single machine will try to use 80% of the memory by default.
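To make the port and memory point concrete, here is a rough sketch in Python that generates startup flags for two impalads sharing one host. The port numbers are the defaults as I remember them from the ports table, the helper name `impalad_flags` is made up for illustration, and splitting the default 80% evenly between instances is just one possible policy. Check the flag names and ports against the documentation for your Impala version before relying on it.

```python
# Default impalad ports (assumed from the Impala ports table; verify for your version).
DEFAULT_PORTS = {
    "beeswax_port": 21000,
    "hs2_port": 21050,
    "be_port": 22000,
    "state_store_subscriber_port": 23000,
    "webserver_port": 25000,
}


def impalad_flags(instance_index, num_instances, total_mem_percent=80):
    """Build illustrative startup flags for one of several impalads on a host.

    Hypothetical helper: each instance gets the default ports shifted by its
    index, and an equal share of the memory budget.
    """
    if not 0 <= instance_index < num_instances:
        raise ValueError("instance_index must be in [0, num_instances)")
    # Shift every port by the instance index so the instances do not collide.
    flags = [f"-{name}={port + instance_index}" for name, port in DEFAULT_PORTS.items()]
    # Split the memory budget so the instances together stay within it.
    mem_percent = total_mem_percent // num_instances
    flags.append(f"-mem_limit={mem_percent}%")
    return flags


if __name__ == "__main__":
    for i in range(2):
        print("impalad " + " ".join(impalad_flags(i, 2)))
```

Each Impala system would also need its own statestored and catalogd on non-conflicting ports, pointed at its own Hive metastore; the sketch leaves that part out.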
The only question is why you want to do this. Impala is generally designed to run one impalad instance per machine, and it has features that can isolate workloads or sets of users from one another.