Created on 01-19-2016 01:15 AM - edited 09-16-2022 01:33 AM
Presto is a tool designed to efficiently query vast amounts of data using distributed queries.
We will install Presto in single-server mode, query Hive, and then add a worker node.
Cross-source queries: RDBMS, Hive, NoSQL
Tutorial
**Java 8 is required.**
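Before installing, it can help to confirm which Java is on the PATH. The snippet below is a minimal sketch, not part of the original steps: the function names java_major_version and installed_java_version are my own, and it assumes `java -version` prints a quoted version string such as "1.8.0_66".

```shell
# Extract the major version from a Java version string.
# Handles the legacy "1.8.0_66" scheme and the newer "11.0.2" scheme.
java_major_version() {
  jv="$1"
  case "$jv" in
    1.*) jv="${jv#1.}" ;;      # "1.8.0_66" -> "8.0_66"
  esac
  echo "${jv%%[._-]*}"          # keep everything before the first '.', '_' or '-'
}

# Read the quoted version string from the installed JVM.
# `java -version` writes to stderr, e.g.: java version "1.8.0_66"
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

if command -v java >/dev/null 2>&1; then
  major="$(java_major_version "$(installed_java_version)")"
  if [ "$major" = "8" ]; then
    echo "Java 8 found"
  else
    echo "Warning: this tutorial expects Java 8, found major version: ${major:-unknown}" >&2
  fi
else
  echo "Warning: java not found on PATH" >&2
fi
```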
Install (check the Maven repository for the latest version):
wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.122/presto-server-0.122.tar.gz
tar xvfz presto-server-0.122.tar.gz
Let's start with a single-node setup (coordinator, i.e. master, and worker on the same node).
cd presto-server-0.122
mkdir etc
[root@ns2 presto-server-0.122]# cd etc/
Run mkdir catalog, then create the configuration files listed below (the contents of config.properties, log.properties, and node.properties are shown).
[root@ns2 etc]# ls
catalog config.properties jvm.config log.properties node.properties
[root@ns2 etc]# cat config.properties
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=9080
query.max-memory=10GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://ns2:9080
[root@ns2 etc]# cat log.properties
com.facebook.presto=INFO
[root@ns2 etc]# cat node.properties
node.environment=production
node.id=presto1
node.data-dir=/var/presto/data
Details on these properties are in the Presto deployment documentation.
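The files above can also be written in one step. The sketch below wraps them in a hypothetical helper, write_single_node_config, that takes the etc directory as its argument. The config.properties, log.properties, and node.properties contents are copied from the listings above. The jvm.config contents are my assumption, since that file appears in the ls output but is never shown: one JVM option per line, with a 4 GB heap chosen only so that it exceeds query.max-memory-per-node=1GB.

```shell
# Write the single-node Presto configuration into "$1" (the etc/ directory).
write_single_node_config() {
  etc_dir="$1"
  mkdir -p "$etc_dir/catalog"

  # Coordinator that also schedules work on itself (single-node mode).
  cat > "$etc_dir/config.properties" <<'EOF'
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=9080
query.max-memory=10GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
discovery.uri=http://ns2:9080
EOF

  cat > "$etc_dir/log.properties" <<'EOF'
com.facebook.presto=INFO
EOF

  cat > "$etc_dir/node.properties" <<'EOF'
node.environment=production
node.id=presto1
node.data-dir=/var/presto/data
EOF

  # ASSUMPTION: jvm.config is not shown in the article; these are ordinary
  # Java 8 options picked for illustration. Size -Xmx for your own host.
  cat > "$etc_dir/jvm.config" <<'EOF'
-server
-Xmx4G
-XX:+UseG1GC
-XX:+HeapDumpOnOutOfMemoryError
EOF
}
```

From the presto-server-0.122 directory this would be called as: write_single_node_config etc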
Now, let's create the Hive properties file (I have already created hive.properties).
cd catalog/
[root@ns2 catalog]# ls
hive.properties jmx.properties
[root@ns2 catalog]# cat hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://ns3:9083
hive.config.resources=/etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
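Since Presto talks to Hive through the metastore URI above, it can be worth checking that the host and port are reachable before starting the server. This is a rough sketch: get_property, metastore_endpoint, and check_metastore are hypothetical helper names, and check_metastore assumes the nc (netcat) utility is installed, which the article does not mention.

```shell
# Print the value of a key from a Java-style .properties file (first match).
# $1 = file, $2 = key
get_property() {
  awk -F= -v k="$2" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$1"
}

# Split hive.metastore.uri ("thrift://host:port") into "host port".
# $1 = path to hive.properties
metastore_endpoint() {
  uri="$(get_property "$1" hive.metastore.uri)"
  hostport="${uri#thrift://}"
  echo "${hostport%%:*} ${hostport##*:}"
}

# ASSUMPTION: nc is available. Exits 0 if the TCP port accepts connections
# within 5 seconds; this only tests connectivity, not the Thrift protocol.
check_metastore() {
  set -- $(metastore_endpoint "$1")
  nc -z -w 5 "$1" "$2"
}
```

For example, from the etc directory: check_metastore catalog/hive.properties && echo "metastore reachable"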
We are all set to start the Presto server.
[root@ns2 bin]# pwd
/root/presto-server-0.122/bin
[root@ns2 bin]# nohup ./launcher run &
[1] 11722
[root@ns2 bin]# nohup: ignoring input and appending output to `nohup.out'
[root@ns2 bin]# tail -f nohup.out
The last lines will be:
2015-10-18T16:49:49.935-0400 INFO main com.facebook.presto.metadata.CatalogManager -- Added catalog hive using connector hive-hadoop2 --
2015-10-18T16:49:50.005-0400 INFO main com.facebook.presto.server.PrestoServer ======== SERVER STARTED ========
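Instead of watching tail -f by hand, a script can poll the log for the SERVER STARTED line shown above. A minimal sketch follows; wait_for_presto is a hypothetical helper name, and the 120-second default timeout is an arbitrary choice of mine.

```shell
# Poll a log file until "SERVER STARTED" appears or the timeout expires.
# $1 = log file, $2 = timeout in seconds (default 120)
# Returns 0 if the line was seen, 1 on timeout.
wait_for_presto() {
  log_file="$1"
  remaining="${2:-120}"
  while :; do
    if [ -f "$log_file" ] && grep -q 'SERVER STARTED' "$log_file"; then
      return 0
    fi
    if [ "$remaining" -le 0 ]; then
      return 1
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
}
```

After nohup ./launcher run & this could be used as: wait_for_presto nohup.out 120 && echo "Presto is up"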
Then open http://host:9080 in a browser (replace host with the server's hostname).
Let's access Hive tables
Download the Presto CLI (see the link for the latest release), rename it to presto, and make it executable:
mv presto-cli-0.122-executable.jar presto
chmod +x presto
[root@ns2 bin]# ./presto --server ns2:9080 --catalog hive
presto> show tables from default;
Create a table in Hive
Presto UI
Click one of the queries to check its stats.
Click the Execution link to see the execution plan.
Let's add a worker node and stop the coordinator (master) from also acting as a worker.
Node name - ns4
Repeat the installation steps above on the new node, then make the following changes.
/root/presto-server-0.122/etc
[root@ns4 etc]# cat config.properties
coordinator=false
discovery.uri=http://ns2:9080 (It points to master server)
[root@ns4 etc]# cat node.properties (node.id needs to be unique)
node.id=presto2
[root@ns4 etc]# cd ..
[root@ns4 presto-server-0.122]# cd bin/
[root@ns4 bin]# nohup ./launcher run &
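The worker-side files can be generated the same way on each new node. In this sketch, write_worker_config is a hypothetical helper taking the etc directory, a unique node.id, and the discovery URI. The article only shows the two changed lines in config.properties; keeping the port and memory settings while dropping discovery-server.enabled and node-scheduler.include-coordinator is my assumption about what a worker needs. node.environment is kept identical to the coordinator's value because all nodes in a cluster must share it.

```shell
# Write worker-side configuration.
# $1 = etc directory, $2 = unique node.id, $3 = discovery URI of the coordinator
write_worker_config() {
  etc_dir="$1"
  node_id="$2"
  discovery_uri="$3"
  mkdir -p "$etc_dir"

  # ASSUMPTION: port and memory limits mirror the coordinator's settings.
  cat > "$etc_dir/config.properties" <<EOF
coordinator=false
http-server.http.port=9080
query.max-memory=10GB
query.max-memory-per-node=1GB
discovery.uri=$discovery_uri
EOF

  # node.id must differ on every node; node.environment must match.
  cat > "$etc_dir/node.properties" <<EOF
node.environment=production
node.id=$node_id
node.data-dir=/var/presto/data
EOF
}
```

On ns4 this would be called as: write_worker_config /root/presto-server-0.122/etc presto2 http://ns2:9080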
[root@ns2 bin]# ./presto --server ns2:9080 --catalog hive
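The transcript does not show the step that removes the master from the worker pool. One way to do it, assuming that means setting node-scheduler.include-coordinator=false in the coordinator's config.properties on ns2 and then restarting the coordinator, is sketched below. set_property is a hypothetical helper of mine, not a Presto tool.

```shell
# Set key=value in a .properties file: replace the existing line for the key,
# or append a new line if the key is absent.
# $1 = file, $2 = key, $3 = value
set_property() {
  file="$1"
  key="$2"
  value="$3"
  tmp="$(mktemp)"
  awk -F= -v k="$key" -v v="$value" '
    $1 == k { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" || return 1
  # Copy back instead of mv so the original file keeps its permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

On the coordinator: set_property /root/presto-server-0.122/etc/config.properties node-scheduler.include-coordinator false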
Happy Hadooping!!!
Read
Presto: Interacting with petabytes of data at Facebook
Created on 11-30-2018 12:21 AM
Hi Neeraj,
I was able to install Presto, but my queries are failing. Did you face a similar error before?
presto:default> show tables;
Query 20181130_001533_00002_5gf9c failed: 10.xxx.xx.xx: null
presto:default> exit
Caused by: org.apache.thrift.transport.TTransportException: 10.xxx.xx.xx:: null
at com.facebook.presto.hive.HiveMetastoreClientFactory.rewriteException(HiveMetastoreClientFactory.java:58)
at com.facebook.presto.hive.HiveMetastoreClientFactory.access$000(HiveMetastoreClientFactory.java:33)