How do I integrate a Presto cluster with a Hadoop cluster?

We have a Hadoop cluster managed by Ambari. Because the Thrift server performs poorly, we decided to replace it with Presto. Our current Hadoop cluster has 960 data node machines (running Red Hat 7).

A few words about Presto: Presto (or PrestoDB) is an open source, distributed SQL query engine, designed from the ground up for fast analytic queries against data of any size. It supports non-relational sources, such as the Hadoop Distributed File System (HDFS), as well as relational data sources.

We installed the new Presto cluster as follows. First we installed the OS (Red Hat 7) on 13 machines in total: 1 machine for the Presto coordinator and 12 machines for the Presto workers.

After installing the OS, we successfully installed Presto (coordinator and workers).

Now we are stuck on how to integrate the Presto cluster with the Hadoop cluster.

Here is a short example from the Hive connector configuration (hive.properties).

We have the following property: hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml

Since these files are located on the data node machines, and of course not on the Presto machines, I assume we need to copy them from one of the data node machines to the Presto worker machines.

Am I right?
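
To make the copy step concrete, here is a minimal sketch in Python that only builds the `scp` commands and does not run them. It assumes the script runs on a machine that already has the two files (for example a data node), that this machine has SSH access to the Presto machines, and that `/etc/hadoop/conf` already exists on the targets. The host names are hypothetical placeholders, not names from our cluster.

```python
import shlex

# The two files named in hive.config.resources above.
HADOOP_CONF_FILES = [
    "/etc/hadoop/conf/core-site.xml",
    "/etc/hadoop/conf/hdfs-site.xml",
]


def build_copy_commands(presto_hosts, files=HADOOP_CONF_FILES):
    """Return one scp argument list per (host, file) pair.

    Each file is copied to the same absolute path on the target host,
    so hive.config.resources can stay unchanged on every Presto node.
    """
    commands = []
    for host in presto_hosts:
        for path in files:
            commands.append(["scp", path, f"{host}:{path}"])
    return commands


if __name__ == "__main__":
    # Hypothetical host names: 1 coordinator and 12 workers.
    hosts = ["presto-coordinator"] + [f"presto-worker{i:02d}" for i in range(1, 13)]
    for argv in build_copy_commands(hosts):
        # Print the commands for review instead of executing them.
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before running anything against 13 machines. The coordinator is included in the host list on the assumption that it also needs the files.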

Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

Out-of-the-box configs are much easier, but the config you have implemented is the correct way to integrate Presto with Hadoop. These files must be present on all the Presto nodes 🙂
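
A quick way to confirm that on each node could look like the sketch below. It is an illustration only: it reads the text of a `hive.properties` file, extracts the paths listed in `hive.config.resources`, and reports any that are not present on the local machine. The function names are made up for this example, and the catalog file location in the usage note is an assumption that depends on where Presto was installed.

```python
import os


def parse_config_resources(properties_text):
    """Return the paths listed in hive.config.resources, or [] if unset."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "hive.config.resources":
            return [p.strip() for p in value.split(",") if p.strip()]
    return []


def missing_resources(properties_text):
    """Return the configured resource paths that do not exist on this node."""
    return [
        path
        for path in parse_config_resources(properties_text)
        if not os.path.isfile(path)
    ]
```

On a Presto node you would read the catalog file (assumed here to be `etc/catalog/hive.properties` under the Presto installation directory), pass its contents to `missing_resources`, and expect an empty list once the files have been copied.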
