Druid vs OpenTSDB for tick data

New Contributor

Looking for pros and cons of working with real-time tick data using Druid vs OpenTSDB. Tick data must be ingested in real time at the finest resolution possible, and the store must support interactive analytical queries such as slicing and dicing, roll-ups, and aggregations.
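For readers less familiar with the terminology, here is a minimal sketch of what a roll-up over tick data could look like: raw ticks are bucketed by minute and symbol and reduced to open/high/low/close, volume, and count. It is plain Python and is not tied to Druid or OpenTSDB; the record layout, the sample values, and the names `minute_bucket` and `roll_up` are all made up for illustration.

```python
# Hypothetical tick records: (ISO-8601 UTC timestamp, symbol, price, size).
ticks = [
    ("2017-06-01T09:30:00.120Z", "ABC", 10.00, 100),
    ("2017-06-01T09:30:00.450Z", "ABC", 10.02, 200),
    ("2017-06-01T09:30:59.900Z", "ABC", 9.99, 50),
    ("2017-06-01T09:31:05.000Z", "ABC", 10.05, 300),
    ("2017-06-01T09:30:10.000Z", "XYZ", 55.10, 10),
]


def minute_bucket(ts: str) -> str:
    """Truncate a timestamp like 2017-06-01T09:30:00.120Z to its minute."""
    return ts[:16] + ":00Z"


def roll_up(records):
    """Aggregate ticks into one bar per (minute, symbol)."""
    bars = {}
    # Timestamps share one fixed-width format, so sorting the tuples
    # orders the ticks chronologically.
    for ts, symbol, price, size in sorted(records):
        key = (minute_bucket(ts), symbol)
        bar = bars.get(key)
        if bar is None:
            bars[key] = {
                "open": price, "high": price, "low": price,
                "close": price, "volume": size, "count": 1,
            }
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen in the bucket
            bar["volume"] += size
            bar["count"] += 1
    return bars


bars = roll_up(ticks)
for key in sorted(bars):
    print(key, bars[key])
```

Slicing and dicing then amounts to filtering or regrouping these bars by symbol or time range; the question is which store does this interactively while ingestion is still running.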

3 REPLIES

Re: Druid vs OpenTSDB for tick data

Contributor

Why do I have this feeling that someone will just come and say: 'go for Druid'?

Re: Druid vs OpenTSDB for tick data

Hi @dbaev, while I won't just say "go for Druid", I would suggest that you look at the recent slides and video from the DataWorks Summit session on Druid and use them to inform your decision.

Slides: https://www.slideshare.net/HadoopSummit/interactive-analytics-at-scale-in-apache-hive-using-druid

Video of presentation: https://www.youtube.com/watch?v=OpuTAOCxq1k

One additional element that may be relevant: while Druid is a Tech Preview right now in HDP 2.6, it will be supported in the future by Hortonworks. If having a fully supported solution from a single vendor is important to you, that might also play a part.

Hope that helps!

Re: Druid vs OpenTSDB for tick data

Expert Contributor

Yes, go for Druid! I want to start with a disclaimer: I am a Druid committer. First, I want to point out that, as an engineer, I don't believe any single query engine can always be better than all the other solutions; it all depends on the use case you want to solve. The key use case here is real-time streaming applications, so let's get to why Druid and not OpenTSDB for that.

The reasons are simple:

  • Druid has native ingestion and indexing support for almost all the rising real-time stream processing technologies (e.g. Kafka, RabbitMQ, Spark, Storm, Flink, Apex, and the list goes on).
  • This integration is production-tested at very large scale (e.g. Yahoo-Flurry or Metamarkets), where we have more than 1 million events per second through real-time ingestion.
  • Druid supports the lambda architecture out of the box.
  • Druid can ingest data directly from Kafka with exactly-once delivery semantics.

In my opinion, those are the key elements to look for when building a real-time streaming application. To my limited knowledge, I am not aware of any integration or production use cases combining real-time streams and OpenTSDB.
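To make the Kafka ingestion point more concrete, here is a rough sketch of a supervisor spec for Druid's Kafka indexing service, built as a Python dict. Treat it as an assumption-laden illustration rather than a tested configuration: the layout follows the spec format of recent Apache Druid releases as I understand it (older releases, including the one shipped as Tech Preview in HDP 2.6, use a different shape), and the datasource, topic, broker address, column names, and the helper `kafka_supervisor_spec` are all hypothetical.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers):
    """Build a hypothetical Kafka supervisor spec for a tick datasource."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumed event layout: ISO timestamp, symbol, exchange, price.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {
                    "dimensions": [
                        "symbol",
                        "exchange",
                        {"type": "double", "name": "price"},
                    ]
                },
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    # Keep every tick at its original timestamp resolution.
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


spec = kafka_supervisor_spec("ticks", "ticks-topic", "localhost:9092")
print(json.dumps(spec, indent=2))
```

In a real deployment the resulting JSON would be submitted to the Overlord's supervisor endpoint; check the documentation for your Druid version before relying on any of these field names.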
