
Apache Drill is an open source, low-latency query engine for Hadoop that delivers secure, interactive SQL analytics at petabyte scale. With the ability to discover schemas on-the-fly, Drill is a pioneer in delivering self-service data exploration capabilities on data stored in multiple formats in files or NoSQL databases. By adhering to ANSI SQL standards, Drill does not require a learning curve and integrates seamlessly with visualization tools. Source

Tutorial:

Grab the latest version of Drill

wget http://getdrill.org/drill/download/apache-drill-1.2.0.tar.gz

tar xvfz apache-drill-1.2.0.tar.gz

Install directory used in this walkthrough: /root/drill/apache-drill-1.2.0

[root@node1 apache-drill-1.2.0]# cd bin/

Start Drill in distributed mode (you can start it in embedded mode too)

[root@node1 conf]# ../bin/drillbit.sh start

starting drillbit, logging to /root/drill/apache-drill-1.2.0/log/drillbit.out

Drill Web Console - http://host:8047

Enable storage plugins

Click Storage --> Enable hbase, hive, mongo

Modify the storage plugins for Hive and HBase to match your Hadoop cluster setup

For example: click Update for hive under Storage Plugins

Modify hive.metastore.uris so that it points to your Hive metastore
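
As a rough sketch of what the edited Hive plugin might look like, the snippet below writes a candidate definition to a scratch file so it can be reviewed and pasted into the Update form in the web console. The key names follow the default Hive plugin that ships with Drill as far as I know, so compare them with what your Update form actually shows. The hosts metastore-host and namenode-host, the ports 9083 and 8020, and the file name hive-plugin.json are placeholders assumed for illustration, not values from this tutorial.

```shell
# Sketch only: write a candidate Hive storage plugin definition to a file.
# ASSUMPTIONS: metastore-host:9083 and namenode-host:8020 are placeholders
# for your cluster, and hive-plugin.json is a hypothetical scratch file name.
cat > hive-plugin.json <<'EOF'
{
  "type": "hive",
  "enabled": true,
  "configProps": {
    "hive.metastore.uris": "thrift://metastore-host:9083",
    "hive.metastore.sasl.enabled": "false",
    "fs.default.name": "hdfs://namenode-host:8020/"
  }
}
EOF

# Print the file so it can be copied into the plugin's Update form.
cat hive-plugin.json
```

After pasting the definition, click Update and then run the Hive test that follows.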

Hive test: launch the Hive shell and create a small table, then query it from Drill

hive> create table drill_hive ( info string);

hive> insert into table drill_hive values ('This is Hive and you are using Drill');

[root@node1 bin]# ./drill-conf

apache drill 1.2.0

"start your sql engine"

0: jdbc:drill:> use hive;

0: jdbc:drill:> select info from drill_hive;

HBase

Change the storage plugin properties if your HBase zookeeper.znode.parent points to /hbase-unsecure

add "zookeeper.znode.parent": "/hbase-unsecure"
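
For context, here is a hedged sketch of a complete HBase plugin definition with that property added. It writes the definition to a scratch file for review before you paste it into the Update form for hbase. The ZooKeeper host zk-host, the client port 2181, and the file name hbase-plugin.json are assumptions for illustration; only the zookeeper.znode.parent entry comes from this tutorial.

```shell
# Sketch only: write a candidate HBase storage plugin definition to a file.
# ASSUMPTIONS: zk-host and port 2181 are placeholders for your ZooKeeper
# quorum, and hbase-plugin.json is a hypothetical scratch file name.
cat > hbase-plugin.json <<'EOF'
{
  "type": "hbase",
  "config": {
    "hbase.zookeeper.quorum": "zk-host",
    "hbase.zookeeper.property.clientPort": "2181",
    "zookeeper.znode.parent": "/hbase-unsecure"
  },
  "size.calculator.enabled": false,
  "enabled": true
}
EOF

# Print the file so it can be copied into the plugin's Update form.
cat hbase-plugin.json
```

The znode path has to match the value of zookeeper.znode.parent in your cluster's hbase-site.xml, otherwise Drill will not find the HBase tables.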

Let's check the query plan and metrics

Click Profile

You will see your queries listed under Completed Queries.

Click a query to see its execution stats.

What is foreman? Link

More information

For ODBC/JDBC setup

Happy Hadoooping!!!!

Comments

Hi Neeraj - Thanks for the post. Since current HDP doesn't ship with Drill, would it be reasonable to expect to use the Drill ODBC driver for HBase connectivity from Excel and other BI tools (though it's more of a MapR focus)? I would rather maintain one ODBC driver than have Drill for HBase and the HDP ODBC driver for Hive.


@Davide Vergari just released a custom Ambari service to install Apache Drill thru Ambari on HDP 2.4. Feel free to take a look at it.