We have a 2-node cluster (1 master with 4 CPUs and 16 GB RAM, plus 1 data node with 8 CPUs and 30 GB RAM), and the estimated amount of data processed through Hive tables is 100 GB. We are using an Ambari Hive View 2.0 instance running on the master, and we expect around 15-20 support/analytics users. Each user accesses Hive in a separate session, and all Hive queries (using Tez) are submitted to the YARN default queue. We expect queries from different sessions to return results in parallel, but the Tez jobs are executed one after another, and performance is a major constraint. We don't want to add more nodes, since the data being processed is still in the GB range; we want to improve the parallelism of Hive query execution with the current hardware configuration. We have also applied the following Hive tuning parameters:
set hive.cbo.enable=true;
set hive.compute.query.using.stats=true;
set hive.stats.fetch.column.stats=true;
set hive.stats.fetch.partition.stats=true;
We also converted the table to ORC format. Even so, neither query response time nor parallelism has improved. Any help with this would be highly appreciated. Thanks!
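
In case it helps with diagnosis, here is a minimal sketch of how we could print our current queue and Tez session settings from a Hive session. The choice of properties is our assumption about what is relevant to queue and Tez session behavior, not a confirmed diagnosis. Running `set <property>;` without a value only displays the current setting and changes nothing.

```sql
-- Display (not change) the current values of settings that may affect concurrency.
-- Which of these matter for our case is an assumption on our side.
set hive.execution.engine;
set tez.queue.name;
set hive.server2.tez.default.queues;
set hive.server2.tez.sessions.per.default.queue;
set hive.server2.tez.initialize.default.sessions;
set hive.tez.container.size;
set tez.am.resource.memory.mb;
```

We can post the output of these if it is useful.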