Member since: 09-18-2015
Posts: 191
Kudos Received: 81
Solutions: 40
08-17-2018
01:30 PM
1 Kudo
Episode 95 – DataWorks Summit in San Jose with Ward Bekker
https://roaringelephant.org/2018/07/03/episode-95-dataworks-summit-in-san-jose-with-ward-bekker/
Since neither Dave nor Jhon was able to attend the DataWorks Summit in San Jose a couple of weeks ago, we have a guest, Ward Bekker, who was happy to join and educate us on the subject.
Duration: 1:52:50 (77.7MB)
In this episode we discuss the daily keynotes and Ward’s selection of sessions at the Summit, ranging from what is new in YARN 3.0 to materialized views in Hive and much more.
Ward Bekker (LinkedIn), Pre-Sales Solutions Engineer II @ Hortonworks
Some of the sessions and topics discussed are:
Apache Hadoop YARN state of the union
https://dataworkssummit.com/san-jose-2018/session/apache-hadoop-yarn-state-of-the-union-2/
What is new in Apache Hive
https://dataworkssummit.com/san-jose-2018/session/what-is-new-in-apache-hive/
Running distributed TensorFlow in production
https://dataworkssummit.com/san-jose-2018/session/running-distributed-tensorflow-in-production-challenges-and-solutions-on-yarn-3-0-2/
Just the sketch: advanced streaming analytics in Apache Metron
https://dataworkssummit.com/san-jose-2018/session/just-the-sketch-advanced-streaming-analytics-in-apache-metron/
Containers and Big Data
https://dataworkssummit.com/san-jose-2018/session/containers-and-big-data/
Catch a hacker in realtime: Live visuals of bots and bad guys
https://dataworkssummit.com/san-jose-2018/session/catch-a-hacker-in-realtime-live-visuals-of-bots-and-bad-guys/
HDFS tiered storage
https://dataworkssummit.com/san-jose-2018/session/hdfs-tiered-storage/
Geospatial data platform at Uber
https://dataworkssummit.com/san-jose-2018/session/geospatial-data-platform-at-uber/
What’s the Hadoop-la about Kubernetes?
https://dataworkssummit.com/san-jose-2018/session/whats-the-hadoop-la-about-kubernetes/
08-16-2018
12:57 PM
Episode 94 – Roaring news
https://roaringelephant.org/2018/06/26/episode-94-roaring-news/
In this week’s edition of Roaring Big Data News, Dave talks about modernizing Hadoop and a billion Java errors. Jhon has an article on improving your training data sets. We finish with a discussion about the newly released HDP 2.6.5, with an emphasis on the deprecation notices and YARN containers.
Duration: 37:40 (26.1MB)
Dave
Modernizing Hadoop: Reaching the plateau of productivity
https://www.zdnet.com/article/modernizing-hadoop-reaching-the-plateau-of-productivity/
1 billion Java errors, here’s what causes 97% of them
https://blog.takipi.com/we-crunched-1-billion-java-logged-errors-heres-what-causes-97-of-them/
https://blog.takipi.com/the-top-10-exceptions-types-in-production-java-applications-based-on-1b-events/
Jhon
Why you need to improve your training data, and how to do it
https://petewarden.com/2018/05/28/why-you-need-to-improve-your-training-data-and-how-to-do-it/amp/
Announcing the General Availability of Hortonworks Data Platform (HDP) 2.6.5, Apache Ambari 2.6.2 and SmartSense 1.4.5
https://hortonworks.com/blog/announcing-general-availability-hortonworks-data-platform-hdp-2-6-5-apache-ambari-2-6-2-smartsense-1-4-5/
Component Versions
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html
Deprecation Notices
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/deprecated_items.html
YARN Containers
Trying out Containerized Applications on Apache Hadoop YARN 3.1
https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/
Containerized Apache Spark on YARN in Apache Hadoop 3.1
https://hortonworks.com/blog/containerized-apache-spark-yarn-apache-hadoop-3-1/
08-16-2018
12:08 PM
Episode 93 – Apache Kylin: Extreme OLAP Engine for Big Data
https://roaringelephant.org/2018/06/19/episode-93-apache-kylin-olap-cubes-in-hadoop/
In this episode, Apache Kylin PMC member Dong Li joins us to explain how Apache Kylin can deploy analytical OLAP cubes in your Big Data environment.
http://kylin.apache.org/
Duration: 46:14 (32.0MB)
Dong Li (LinkedIn), Technical Partner & Senior Architect of Kyligence, PMC Member of Apache Kylin
http://en.kyligence.io/
Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
07-16-2018
02:12 PM
Roaring Elephant Podcast – Episode 92 – Roaring news
https://roaringelephant.org/2018/06/12/episode-92-roaring-news/
Another week, another edition of Roaring Big Data News. This time, Dave talks about driver monitoring for teens and Jhon takes a detailed look at an Eventbrite data pipeline article.
Duration: 46:08 (31.9MB)
Dave
Driver monitoring isn’t just for teens; adults can benefit, too
https://arstechnica.com/cars/2018/05/buicks-smart-driver-explains-why-my-gas-mileage-sucks-and-my-editors-doesnt/
Jhon
Looking under the hood of the Eventbrite data pipeline!
https://www.eventbrite.com/engineering/looking-under-the-hood-of-the-eventbrite-data-pipeline/
06-13-2018
11:55 PM
Roaring Elephant Podcast - Episode 91 – ODPi is back and better than ever!
https://roaringelephant.org/2018/06/05/episode-91-odpi-is-back-and-better-than-ever/
In this episode, we welcome back John Mertic, Director of Program Management for ODPi, R Consortium, and the Open Mainframe Project. It’s been almost two years since we checked in with John and the ODPi initiative and, as John mentions in the interview, a lot has changed in Hadoop…
Duration: 1:08:00 (46.9MB)
John Mertic, Director of Program Management for ODPi, R Consortium, and Open Mainframe Project
https://www.linkedin.com/in/jmertic/
ODPi website links:
https://www.odpi.org/
https://www.odpi.org/blog/2018/04/04/the-state-of-open-source-and-big-data-three-years-later
https://www.odpi.org/projects/data-governance-pmc
https://www.odpi.org/events
06-13-2018
11:28 PM
Roaring Elephant Podcast - Episode 90 – Roaring news
https://roaringelephant.org/2018/05/29/episode-90-roaring-news/
In this week’s Roaring News episode, Dave brings up the resilience of Apache community open source projects and plays some Doom. Jhon has some practical Apache NiFi guides and covers the emergence of multi-model NoSQL databases.
Duration: 38:09 (26.4MB)
DataWorks Summit Berlin video recordings are up:
https://www.youtube.com/user/HadoopSummit/playlists
Find Dave on his Australian road-trip:
http://bit.ly/aus-nz-ibm-hwx-tour
Dave
DataTorrent, Stream Processing Startup, Folds (Apache Apex)
https://www.datanami.com/2018/05/08/datatorrent-stream-processing-startup-folds/
DOOM!
https://arxiv.org/abs/1804.09154
https://www.technologyreview.com/s/611072/ai-generates-new-doom-levels-for-humans-to-play/
https://www.youtube.com/watch?v=K32FZ-tjQP4
Bonus Doom news:
https://www.rockpapershotgun.com/2018/03/28/dodge-fireballs-forever-in-a-neural-nets-doom-nightmare/
https://worldmodels.github.io/
Jhon
Accessing Feeds from EtherDelta on Trades, Funds, Buys and Sells (Apache NiFi)
https://community.hortonworks.com/articles/191146/accessing-feeds-from-etherdelta-on-trades-funds-bu.html?es_p=6741162
NiFi Processing and Flow with Couchbase Server
https://blog.couchbase.com/nifi-processing-flow-couchbase-server/
The new era of the Multi-Model Database
https://www.zdnet.com/article/the-new-era-of-the-multi-model-database/
Seven Databases in Seven Weeks, Second Edition – A Guide to Modern Databases and the NoSQL Movement
https://pragprog.com/book/pwrdata/seven-databases-in-seven-weeks-second-edition
05-25-2018
11:28 AM
https://roaringelephant.org/2018/05/22/episode-89/
With the San Jose edition of the DataWorks Summit only a month away, we go over the sessions that are available in the agenda today and offer our top picks. If you’re going, or if you will be watching the replays online, we hope to guide you in your selection of sessions.
DataWorks Summit San Jose 2018
Duration: 1:12:20 (49.9MB)
And here is the dashboard we created with statistics on the San Jose sessions, for your enjoyment:
https://aka.ms/DWS2018SJ
The agenda is still in flux, so we will be updating the dashboard regularly.
05-18-2018
04:57 PM
https://roaringelephant.org/2018/05/15/episode-88-roaring-news/
Returning to our more regular schedule, we have a Roaring News episode today. Dave has articles on multi-cloud readiness, Big Data being a pariah, and Google Duplex. Jhon came up with synthetic data, data engineers versus data scientists, and a neural network sharing cake recipes.
Duration: 35:07 (24.4MB)
Dave
Less than 10% ready for multi cloud
http://www.cloudpro.co.uk/cloud-essentials/hybrid-cloud/7451/idc-less-than-10-of-organisations-are-ready-for-multi-cloud
Tech companies distancing themselves from Big Data
https://qz.com/1262102/tech-companies-are-distancing-themselves-from-big-data/
Google Duplex
https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
Jhon
The Rise of Synthetic Data to Help Developers Create and Train AI Algorithms Quickly and Affordably
https://insidebigdata.com/2018/05/08/rise-synthetic-data-help-developers-create-train-ai-algorithms-quickly-affordably/
Data engineers vs. data scientists
https://www.oreilly.com/ideas/data-engineers-vs-data-scientists?utm_medium=social&utm_source=twitter.com&utm_campaign=awareness&utm_content=radar+content+datascience
We asked a neural network to bake us a cake. The results were…interesting.
https://www.popsci.com/neural-network-bakes-a-cake
05-18-2018
04:54 PM
https://roaringelephant.org/2018/05/08/episode-87-druid-high-performance-column-oriented-distributed-data-store-part-2/
This is the second part of an interview with Fangjin Yang, co-founder and CEO at Imply and committer/PMC member for the Druid project. Druid is a high-performance, column-oriented, distributed data store which has entered the Hadoop environment with the recent integration with Apache. Since Druid has been around for a while, we are grateful to FJ for spending some time with our listeners.
Duration: 31:53 (22.1MB)
Fangjin Yang (LinkedIn), Cofounder and CEO at Imply
05-18-2018
04:51 PM
https://roaringelephant.org/2018/05/01/episode-86-druid-a-high-performance-column-oriented-distributed-data-store-part-1/
This is the first part of an interview with Fangjin Yang, co-founder and CEO at Imply and committer/PMC member for the Druid project. Druid is a high-performance, column-oriented, distributed data store which has entered the Hadoop environment with the recent integration with Apache. Since Druid has been around for a while, we are grateful to FJ for spending some time with our listeners.
Duration: 31:57 (22.2MB)
Fangjin Yang (LinkedIn), Cofounder and CEO at Imply