on 04-29-2016 02:47 PM
Date and Time: May 5th at 6pm
Join us for a meetup during Philly Tech Week! This is your chance to ask deep technical questions and network with other users.
Talk Description: Historically, use cases such as time series and mutable-profile datasets have been possible but difficult to achieve efficiently using traditional HDFS storage engines. These solutions might involve complex ingestion paths, deep understanding of file types, and compaction strategies. With the introduction of Kudu, many of these difficulties are eliminated. At the same time, interest in streaming solutions and low-latency analytics has surged with the growing popularity of tools like Apache Kafka.
Ted Malaska will explain how to go from zero to full-on time series and mutable-profile systems in 40 minutes. Ted will walk through code examples of ingestion from Kafka and Spark Streaming and of access through SQL, Spark, and Spark SQL, using them to explore the underlying theories and design patterns common to most Kudu solutions.
Speaker: Ted Malaska is a solutions architect at Cloudera and has worked on close to 100 clusters for more than two dozen clients, covering hundreds of use cases. Ted has 18 years of professional experience working for startups, the US government, a number of the world’s largest banks, commercial firms, bio firms, retail firms, hardware appliance firms, and the largest nonprofit financial regulator in the US. He has architecture experience across topics such as Hadoop, Web 2.0, mobile, SOA (ESB, BPM), and big data. Ted is a regular contributor to the Hadoop, HBase, and Spark projects, a regular committer to Flume, Avro, Pig, and YARN, and the coauthor of O’Reilly Media’s Hadoop Application Architectures.