How to monitor and provision resources for Phoenix queries

I need to map YARN queues to analytic Phoenix queries. That is, when user1 runs "select col1, col2 from table1", the query should run in yarn-queue1. That way we can make sure adequate resources are given to the queue and avoid contention with lower-priority jobs.

However, I have tested with the

yarn.scheduler.capacity.queue-mappings-override.enable=true

yarn.scheduler.capacity.queue-mappings=u:user1:queue1,g:group1:queue2

parameters, but I am not able to see the executed queries in YARN.

How do I make sure these queries are catered for?

N.B.: We use SQuirreL SQL to connect to Phoenix.

1 REPLY

@Joshua Adeleke

Unfortunately, Phoenix (HBase) is not integrated with YARN by default (in HDP 2.6.5 or lower; I don't know about 3.x), so the resources used by HBase workloads are managed outside of YARN. That is also why your Phoenix queries do not show up in YARN: they run inside the HBase RegionServers, not in YARN containers, so the queue mappings never apply to them. For the same reason, it is recommended that the YARN property yarn.nodemanager.resource.memory-mb should not be set to all of the memory on a node. There should be enough headroom left for HBase workloads and for things like the operating system. However, there are some workarounds to manage Phoenix/HBase resources.

  • Manage in YARN
    • Use Spark with the Phoenix JDBC connector - Spark runs on YARN, but I think it's a little bit difficult to run interactive queries this way
    • Use Apache Slider - Slider "slides" existing long-running services like Apache HBase onto YARN (I haven't used it yet, so I don't know how it works)
    • Use Hoya - https://de.hortonworks.com/blog/introducing-hoya-hbase-on-yarn/
  • Manage outside of YARN
    • Use cgroups in Linux
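
For the Spark option, here is a rough sketch of how a job could be pinned to a YARN queue. It only builds the Phoenix JDBC URL and the spark-submit command line; it does not talk to a cluster. The host name, znode, jar name and script name are hypothetical placeholders, so adjust them to your environment. Keep in mind that only the Spark driver and executors would run in the queue; the scans themselves are still served by the HBase RegionServers outside of YARN.

```python
import shlex


def phoenix_jdbc_url(zk_quorum, zk_port=2181, znode="/hbase"):
    """Build a Phoenix (thick client) JDBC URL: jdbc:phoenix:<quorum>:<port>:<znode>."""
    return f"jdbc:phoenix:{zk_quorum}:{zk_port}:{znode}"


def spark_submit_command(app, queue, jars):
    """Return the argument list for submitting `app` to YARN in the given queue."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--queue", queue,            # YARN queue the Spark application runs in
        "--jars", ",".join(jars),    # e.g. the Phoenix client jar (name is an assumption)
        app,
    ]


if __name__ == "__main__":
    # Hypothetical values: ZooKeeper host, znode, jar and script names.
    url = phoenix_jdbc_url("zk1.example.com", znode="/hbase-unsecure")
    cmd = spark_submit_command("phoenix_query.py", "queue1", ["phoenix-client.jar"])
    print(url)
    print(shlex.join(cmd))
    # Inside the hypothetical phoenix_query.py, the table could then be read
    # through Spark's generic JDBC source, for example:
    #   spark.read.format("jdbc")
    #        .option("url", url)
    #        .option("dbtable", "table1")
    #        .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
    #        .load()
```

With the queue given explicitly via --queue, the application should appear under queue1 in the ResourceManager UI, where it can be monitored like any other YARN job.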

More resources:

https://de.slideshare.net/Hadoop_Summit/multitenant-multicluster-and-multicontainer-apache-hbase-dep...

Regards,

Michael