Date/Time: Wednesday, October 14th at 6pm
Location: BrightEdge, San Mateo
Please join us for an Impala deep dive, requested by our users at BrightEdge.
• 6 - 6:30 Networking
• 6:30 - 7:15 Tech talk: "Impala: A Modern, Open-Source SQL Engine for Hadoop" with Dimitris Tsirogiannis, Software Engineer, Impala Team
The Cloudera Impala project is pioneering the next generation of Hadoop capabilities: the convergence of fast SQL queries with the capacity, scalability, and flexibility of a Hadoop cluster. With Impala, the Hadoop community now has an open-source codebase that helps users query data stored in HDFS, Apache HBase, and even Amazon S3 in real time, using familiar SQL syntax. In contrast with other SQL-on-Hadoop initiatives, Impala's operations are fast enough to run interactively on native Hadoop data rather than in long-running batch jobs.
This talk presents a number of lessons and guidelines on how to get the best performance from Impala. It covers physical design, cluster sizing, and hardware recommendations, as well as the basics of query tuning. It also covers best practices for using Impala alongside other components such as Hive and Sentry.
• 7:15 - 8 Q & A / Community-proposed breakouts