Created on 05-28-2018 04:21 PM - edited 09-16-2022 01:43 AM
How do you know if your customers (and potential customers) are talking about you on social media?
The key to making the most of social media is listening to what your audience has to say about you, your competitors, and the market in general. Once you have the data, you can analyze it and finally reach social business intelligence: using all these insights to know your customers better and improve your marketing strategy.
This is the third part of the series of articles on how to ingest social media data as a stream using the integration of HDP and HDF tools.
To implement this article, you first need to implement the previous two.
First, you should create your table in Druid, adding the Sentiment and Network fields as described in the previous articles. I called my Druid table “SocialStalker”.
- Once you have done that, just press play to see all the data in Druid.
2. Setup Hive-Druid
Our main goal is to be able to index data from Hive into Druid, and to be able to query Druid datasources from Hive. Completing this work will bring benefits to the Druid and Hive systems alike:
- Efficient execution of OLAP queries in Hive. Druid is a system especially well tailored to the execution of OLAP queries on event data. Hive will be able to take advantage of its efficiency for the execution of this type of query.
- Introducing a SQL interface on top of Druid. Druid queries are expressed in JSON, and Druid is queried through a REST API over HTTP. Once a user has declared a Hive table that is stored in Druid, we will be able to transparently generate Druid JSON queries from the input Hive SQL queries.
- Being able to execute complex operations on Druid data. There are multiple operations that Druid does not support natively yet, e.g. joins. Putting Hive on top of Druid will enable the execution of more complex queries on Druid data.
- Indexing complex query results in Druid using Hive. Currently, indexing in Druid is usually done through MapReduce jobs. We will enable Hive to index the results of a given query directly into Druid, e.g., as a new table or a materialized view (HIVE-10459), and start querying and using that dataset immediately.
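To make the "Druid queries are expressed in JSON" point concrete, here is a minimal Python sketch of the kind of native Druid timeseries query that a daily row count in SQL roughly corresponds to. The helper name `build_timeseries_query` and the date range are assumptions made for illustration; only the datasource name "SocialStalker" comes from this article. The sketch only builds the JSON payload; sending it to a Druid broker over HTTP is left out.

```python
import json


def build_timeseries_query(datasource, start, end, granularity="day"):
    """Build a Druid native timeseries query that counts Druid rows per time bucket.

    Hypothetical helper for illustration; it is not part of Hive or Druid.
    """
    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": granularity,
        # ISO-8601 interval in the form start/end; Druid uses it to limit the scan.
        "intervals": [f"{start}/{end}"],
        "aggregations": [{"type": "count", "name": "cnt"}],
    }


# Assumed example range: one month of data from the SocialStalker datasource.
query = build_timeseries_query("SocialStalker", "2018-05-01", "2018-06-01")
payload = json.dumps(query, indent=2)
print(payload)
```

With the Hive integration you would not normally write this JSON by hand; the idea is that Hive generates an equivalent query from the SQL you submit.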
In short, this integration brings benefits to both Apache Druid and Apache Hive, such as:
- Indexing complex query results in Druid using Hive
- Introducing a SQL interface on top of Druid
- Being able to execute complex operations on Druid data
- Efficient execution of OLAP queries in Hive
Although there is an overlap between both if you are using Hive LLAP, it's important to look at each advantage separately:
- The power of Druid comes from precise IO optimization, not brute compute force.
- Queries drill down on selected dimensions for a given timestamp predicate for better performance.
- Queries should use timestamp predicates so that Druid knows how many segments to scan. This will yield better results.
- Any UDFs or SQL functions to be executed on Druid tables will be performed by Hive. The performance of these queries depends solely on Hive. At this point they do not function as Druid queries.
- If aggregations over aggregated data are needed, the queries will run as Hive LLAP queries, not as Druid queries.
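The point about timestamp predicates can be illustrated with a small Python model of time-based segment pruning. This is a simplified sketch, not Druid's actual implementation: the daily segments and the function names `overlaps` and `segments_to_scan` are hypothetical and exist only to show why a time filter reduces the number of segments that need to be scanned.

```python
from datetime import datetime


def overlaps(seg_start, seg_end, q_start, q_end):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return seg_start < q_end and q_start < seg_end


def segments_to_scan(segments, q_start, q_end):
    """Return the names of the segments whose time interval overlaps the query interval."""
    return [name for name, s, e in segments if overlaps(s, e, q_start, q_end)]


# Hypothetical daily segments, each covering [start, end).
segments = [
    ("2018-05-27", datetime(2018, 5, 27), datetime(2018, 5, 28)),
    ("2018-05-28", datetime(2018, 5, 28), datetime(2018, 5, 29)),
    ("2018-05-29", datetime(2018, 5, 29), datetime(2018, 5, 30)),
]

# With a timestamp predicate covering a single day, only one segment qualifies.
selected = segments_to_scan(segments, datetime(2018, 5, 28), datetime(2018, 5, 29))
print(selected)  # ['2018-05-28']

# Without a time filter (modelled here as an unbounded interval), every segment is scanned.
unfiltered = segments_to_scan(segments, datetime.min, datetime.max)
print(len(unfiltered))  # 3
```

In this toy model the filtered query touches one segment out of three; on a real datasource with many segments the difference is correspondingly larger.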
Let's get hands-on with Hive-Druid!
To do this, you first need to be running Hive Interactive (with LLAP), so that you can use Hive Interactive Query.
If you do not have hive-druid-handler in your HDP version, just download it:
Now you can use your Beeline terminal to run any query against your Druid table.
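As a rough sketch of what such a Beeline session might contain, the Python snippet below assembles HiveQL statements that map an existing Druid datasource to a Hive external table and then query it with a timestamp predicate. The Hive table name `social_stalker`, the broker address, and the dates are assumptions for illustration; the `DruidStorageHandler` class, the `druid.datasource` table property, the `hive.druid.broker.address.default` setting, and the `__time` column follow the Apache Hive Druid integration documentation. Exact syntax, especially for timestamp literals, may vary with your Hive version.

```python
def druid_backed_table_ddl(hive_table, druid_datasource):
    """Return HiveQL that maps an existing Druid datasource to a Hive external table."""
    return (
        f"CREATE EXTERNAL TABLE {hive_table}\n"
        "STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'\n"
        f"TBLPROPERTIES (\"druid.datasource\" = \"{druid_datasource}\");"
    )


def count_between(hive_table, start, end):
    """Return HiveQL that counts rows in a time range using the Druid time column."""
    return (
        f"SELECT COUNT(*) FROM {hive_table}\n"
        f"WHERE `__time` >= '{start}' AND `__time` < '{end}';"
    )


statements = [
    # Assumed broker host and port; replace with your own Druid broker address.
    "SET hive.druid.broker.address.default=druid-broker.example.com:8082;",
    # "social_stalker" is a hypothetical Hive table name; "SocialStalker" is the Druid table from this series.
    druid_backed_table_ddl("social_stalker", "SocialStalker"),
    count_between("social_stalker", "2018-05-28 00:00:00", "2018-05-29 00:00:00"),
]

# Print the statements so they can be pasted into a Beeline session.
print("\n\n".join(statements))
```

The query includes a predicate on `__time` so that the time filter can be pushed down to Druid rather than evaluated only in Hive.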
You can insert data into your table and make some changes to your Superset slices (as in the previous articles) to complete step 3 and see your Superset dashboard, like this one:
You can use HDP and HDF to build an end-to-end platform that allows you to achieve the success of your social media marketing campaign as well as the ultimate success of your business. If you don’t pay attention to how your business is doing, you are really only doing half of the job. It is the difference between walking around in the dark and having an illuminated path that gives you an understanding and an awareness of how your business is doing and how you can continually make improvements that will bring you more and more exposure.
Companies are using social media monitoring to strengthen their businesses. Those business people are savvy enough to realize the importance of social media, how it positively influences their businesses, and how critical the monitoring piece of the strategy is to their ultimate success.