Member since: 09-18-2015
Posts: 3274
Kudos Received: 1159
Solutions: 426
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2625 | 11-01-2016 05:43 PM |
| | 8759 | 11-01-2016 05:36 PM |
| | 4925 | 07-01-2016 03:20 PM |
| | 8268 | 05-25-2016 11:36 AM |
| | 4434 | 05-24-2016 05:27 PM |
01-19-2016
12:36 PM
8 Kudos
Apache Drill is an open source, low-latency query engine for Hadoop that delivers secure, interactive SQL analytics at petabyte scale. With the ability to discover schemas on the fly, Drill is a pioneer in delivering self-service data exploration on data stored in multiple formats in files or NoSQL databases. Because it adheres to ANSI SQL standards, Drill has virtually no learning curve and integrates seamlessly with visualization tools. Source Tutorial:

Grab the latest version of Drill and unpack it:

    wget http://getdrill.org/drill/download/apache-drill-1.2.0.tar.gz
    tar xvfz apache-drill-1.2.0.tar.gz
    /root/drill/apache-drill-1.2.0
    [root@node1 apache-drill-1.2.0]# cd bin/

Start Drill in distributed mode (you can start in embedded mode too):

    [root@node1 conf]# ../bin/drillbit.sh start
    starting drillbit, logging to /root/drill/apache-drill-1.2.0/log/drillbit.out

Drill Web Console: http://host:8047

Enable storage plugins: click Storage --> Enable hbase, hive and mongo. Then modify the storage plugins for Hive and HBase as per your Hadoop cluster setup. For example, click Update for hive under Storage Plugins and modify hive.metastore.uris.

Hive test: launch the hive shell and seed a table:

    hive> create table drill_hive (info string);
    hive> insert into table drill_hive values ('This is Hive and you are using Drill');

Now query it from Drill:

    [root@node1 bin]# ./drill-conf
    apache drill 1.2.0
    "start your sql engine"
    0: jdbc:drill:> use hive;
    0: jdbc:drill:> select info from drill_hive;

HBase: change the storage plugin properties in case HBase's zookeeper.znode.parent points to /hbase-unsecure; add "zookeeper.znode.parent": "/hbase-unsecure".

Let's check the query plan and metrics. Click Profiles and you will see your queries under Completed Queries. Click a query to see its execution stats. What is the foreman? It is the Drillbit that receives the query from the client and drives its execution.
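Beyond the web console, you can also submit queries to Drill's REST API straight from the shell. A minimal sketch, assuming the web console is reachable at node1:8047 (the host name is an assumption) and the hive plugin is enabled:

```
# Submit a SQL query to the Drillbit REST endpoint and print the JSON result.
# node1:8047 is an assumed host:port; use your Drillbit's web console address.
curl -s -X POST -H "Content-Type: application/json" \
  -d '{"queryType": "SQL", "query": "select info from hive.drill_hive"}' \
  http://node1:8047/query.json
```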
Link: more information and ODBC/JDBC setup. Happy Hadooping!!!!
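For JDBC specifically, Drill ships with sqlline; a minimal sketch of connecting through the JDBC URL, assuming ZooKeeper runs on node1:2181 (adjust to your quorum):

```
# Connect to the Drill cluster via the JDBC URL (the ZK address is an assumption).
cd /root/drill/apache-drill-1.2.0
bin/sqlline -u "jdbc:drill:zk=node1:2181"
```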
01-19-2016
11:33 AM
@Mehdi TAZI Very good point. It goes back to ELT: the source of truth ("raw data") lands in HDFS, we run transformations on that data, and we load the results into Hive or HBase based on the use case. There is a significant cost difference between storing the source of truth in Hadoop and storing it on an expensive SAN or EDW. You don't have to store it in HDFS, though; you can load data directly into Hive or HBase tables. The most basic use case is data archival: you can "move" data from the EDW into Hive using Sqoop, and the data goes directly into Hive tables, as sketched below.
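To make that archival path concrete, here is a minimal Sqoop sketch; the connection string, credentials, and table names are assumptions for illustration, not from this thread:

```
# Import an EDW table straight into a Hive table (all names hypothetical).
sqoop import \
  --connect jdbc:oracle:thin:@//edw-host:1521/ORCL \
  --username etl_user -P \
  --table SALES_HISTORY \
  --hive-import \
  --hive-table sales_history_archive
```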
01-19-2016
11:16 AM
@Andrea Squizzato You will have to open a support ticket for this. Can you share entries from the log files so we can look further?
01-19-2016
11:14 AM
@Mehdi TAZI No, and I have never heard of duplicating data with Parquet. I hope you are not referring to the HDFS replication factor; if you are, please see this.
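In case the replication factor is what was meant, here is a quick way to inspect and change it from the shell; the warehouse path is hypothetical:

```
# The second column of -ls output is the replication factor of each file.
hdfs dfs -ls /apps/hive/warehouse/mytable
# Set it explicitly and wait (-w) until the blocks reach that replication.
hdfs dfs -setrep -w 3 /apps/hive/warehouse/mytable
```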
01-19-2016
11:10 AM
5 Kudos
@Zeev Lazarev The root user does not have permission to create a directory under /. You can copy and paste this into your ssh window:

    su - hdfs
    hdfs dfs -mkdir -p /mp2/links
    hdfs dfs -chown -R root:hdfs /mp2/links
    exit
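An optional follow-up check, if you want to confirm the directory landed with the right ownership:

```
# Should list /mp2/links owned by root:hdfs.
hdfs dfs -ls /mp2
```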
01-19-2016
10:55 AM
2 Kudos
@Mehdi TAZI
I am a big fan of ORC: http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
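In case it helps, a minimal sketch of creating an ORC-backed table from the shell; the table name and columns are made up for illustration:

```
# Create an ORC table with Zlib compression (Zlib is ORC's default codec).
hive -e "CREATE TABLE page_views_orc (user_id BIGINT, url STRING)
STORED AS ORC
TBLPROPERTIES ('orc.compress'='ZLIB');"
```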
01-19-2016
10:51 AM
1 Kudo
@John Smith Generally, you don't have to do anything except import the sandbox image. 1) Networks. 2) This is set up during the install.
01-19-2016
04:04 AM
@Kumar Ratan The system is running out of memory: Hive is trying to create a Tez container and the system does not have enough memory. Check the VM memory and see if you can increase it.
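If the VM cannot be given more memory, you can instead shrink the Tez container so it fits. A minimal sketch; the sizes below are assumptions to illustrate, so tune them to what your VM actually has:

```
# Ask Hive for smaller Tez containers (size in MB; heap ~80% of container).
hive --hiveconf hive.tez.container.size=512 \
     --hiveconf hive.tez.java.opts="-Xmx410m"
```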
01-19-2016
02:32 AM
@Benson Shih Check this: https://github.com/abajwa-hw/security-workshops/blob/master/Setup-ranger-23.md#setup-kafka-plugin-for-ranger