Created on 12-23-2017 06:48 PM - edited 08-17-2019 09:43 AM
2017 in Review
First off, this was an amazing year for Big Data, IoT, Streaming, Machine Learning and Deep Learning. So many cool events, updates, new products, new projects, new libraries and community growth. I've seen a lot of people adopt and grow Big Data and streaming projects from nothing. Using the power of Open Source and the tools made available by Apache, companies are growing with the help of trusted partners and a community of engineers and users.
We had three awesome DataWorks Summits (formerly Hadoop Summit, but now covering a lot more, including IoT, AI and Streaming).
I attended Munich and spoke at Sydney. I missed California, but all the videos and slides were online and I loved those.
I spoke at Oracle Code in NYC, which was a fun little event. I was surprised to learn that many people had never heard of Apache NiFi or how easily you could use it to build real-time dataflows including Deep Learning and Big Data.
I got to talk to a lot of interesting people while working the Hortonworks Booth at Strata NYC. Such a huge event; fidget spinners and streaming were the main takeaways there.
We had a lot of awesome meetups in Princeton and in the NYC and Philadelphia areas. The Princeton Future of Data Group grew to over 750 members! A great community of data scientists, engineers, students, analysts, techies and business thought leaders. I am really proud to be a part of this amazing group.
Meetups
I got to speak at most of the meetups except when we had special guests. I had some great NY/NJ/Philly teammates co-running the meetup: @milind pandit @Greg Keys. Greg and I also created a North Jersey meetup.
November 14th - Enterprise Data at Scale
I spoke on IBM DSX, Apache NiFi, Apache Spark, Python, Jupyter and Data Science. Fortunately, I had two excellent IBM resources assisting me.
October 5th - Deep Learning with DeepLearning4J (DL4J). A great talk by my friend from SkyMind. It's nice to see their project get accepted to Eclipse.
August 8th - Deep Dive into HDF 3.0 @ Honeywell
June 20th - Latest Innovation - Schema Registry and More @ TRAC Intermodal
May 16th - Hadoop Tools Overview
March 28th - Apache NiFi: Ingesting Enterprise Data at Scale
Libraries, SDKs, Tools, Frameworks
Devices
There was a lot of big news this year: https://hortonworks.com/blog/top-hortonworks-blogs-2017/. Apache Hive LLAP became a real production thing and brought Apache Hadoop into the world of EDW, completely open source. On the Apache Spark front, we passed version 2.0, and Livy became a production standby and became Apache Livy. The JanusGraph database appeared and is quickly becoming the standard for graphs. Apache Calcite went into so many projects that SQL queries are everywhere, including in Apache NiFi. A huge number of interesting software projects arose, including Hortonworks Data Plane, Hortonworks Schema Registry and Hortonworks Streaming Analytics Manager. This was an awesome year for software.
Presentations From Talks Available
My HCC Articles of 2017
My Articles on DZone
My RefCard
My Guide
My GitHub Source Code
I have some example Apache NiFi custom processors developed on JDK 8, including ones for TensorFlow, OpenNLP, DL4J, Apache Tika, Stanford CoreNLP and more. I also published all the Python scripts, documentation, shell scripts, SQL, Apache NiFi templates and Apache Zeppelin notebooks as Apache-licensed open source on GitHub.
Next year will be amazing: more libraries, more use cases for Deep Learning, and enhancements to all the great projects and tools out there. Another Google AIY Kit, more DataWorks Summits, Hadoop 3, HDF 4, HDP 3; so many things to look forward to.
See you at meetups, summits and online next year.