
Is it possible to pivot (flatten) timestamp-key-value data from a Druid datasource in Hive to expose it to SQL frontends?



Hi all,

I have data from up to about 3,000 sensors published to a Kafka topic in the format {POSIX timestamp, sensor ID, float value}. The stream is append-only (no updates), but the timestamps are not necessarily strictly increasing: some sensors may send data with a delay, typically no more than a couple of minutes. I want to ingest this into a Druid datasource and then expose it, as fast as possible, via Hive, e.g. to a BI frontend like Tableau.

As an end user, I would strongly prefer the data to be in one large, flat table with the timestamp and one column per sensor ID, accepting NULL values where the corresponding key-value data point has not arrived yet.

I am not sure whether this is possible in Druid, other than by routing the data for each sensor ID to its own datasource and creating a union over all of these datasources, and I feel there are too many different sensor IDs for that. Does anybody know whether it is possible to create a Hive view on such a Druid datasource that has the sensor IDs as columns and can access the incoming data in near real time?
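To make the desired shape concrete, here is a minimal Python sketch of the pivot itself, independent of Druid or Hive. It only illustrates the target layout (one row per timestamp, one column per sensor ID, missing readings as None/NULL). The function name `pivot`, the sensor IDs `s1`/`s2`, and the sample values are made up for illustration and are not part of the actual data.

```python
from collections import defaultdict


def pivot(records, sensor_ids):
    """Turn (timestamp, sensor_id, value) triples into one wide row per timestamp."""
    by_ts = defaultdict(dict)
    for ts, sensor_id, value in records:
        # Arrival order does not matter, so late records land in the right row.
        by_ts[ts][sensor_id] = value
    # One row per timestamp, sorted by time; readings not seen yet become None.
    return [
        {"timestamp": ts, **{s: by_ts[ts].get(s) for s in sensor_ids}}
        for ts in sorted(by_ts)
    ]


# Hypothetical sample stream: the (1000, "s2") reading arrives late,
# and no "s2" reading has arrived yet for timestamp 1120.
records = [
    (1000, "s1", 1.5),
    (1060, "s1", 3.0),
    (1060, "s2", 7.0),
    (1000, "s2", 2.5),
    (1120, "s1", 4.0),
]

rows = pivot(records, ["s1", "s2"])
for row in rows:
    print(row)
```

The last row has `None` in the `s2` column, which corresponds to the NULL I would accept in the flat table until that data point arrives.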

Cheers,

Jonas