how to store data in druid ?

Expert Contributor

Hi,

I want to store data in Druid. How can I do this? I am new to Druid.

How can I see the data stored in Druid?

Thank you

3 REPLIES

Re: how to store data in druid ?

@heta desai

You can load data into Druid in a few ways:

  • Native batch (Druid loads data directly from S3, HTTP, NFS, or other networked storage.)
  • Hadoop (Druid launches Hadoop MapReduce jobs to load data files.)
  • Kafka indexing service (Druid reads directly from Kafka.)
  • Tranquility (You use Tranquility, a client-side library, to push individual records into Druid.)

More information here:

http://druid.io/docs/latest/ingestion/index.html

Example for Hadoop batch ingestion:

Suppose you have an example JSON file (pageviews.json) like this:

{"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
{"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
{"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}

Then you have to build a JSON file (my-index-task.json) that describes how to load the data (see the attachments). Then put the example JSON file into HDFS.
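Since the attachment is not reproduced in this thread, here is a minimal sketch of how such a task file could be generated with Python. The field names follow Druid's Hadoop batch ingestion spec as I understand it. The HDFS path, the choice of dimensions and metrics, and the helper name build_index_task are assumptions for illustration, so adjust them to your data. For CSV input the parse spec additionally needs the list of column names.

```python
import json


def build_index_task(data_source, hdfs_path, interval, dimensions, metrics,
                     timestamp_column="time", input_format="json", columns=None):
    """Return a Hadoop batch index task as a dict (hypothetical helper)."""
    parse_spec = {
        "format": input_format,
        "timestampSpec": {"column": timestamp_column, "format": "auto"},
        "dimensionsSpec": {"dimensions": dimensions},
    }
    if input_format == "csv":
        # CSV rows carry no field names, so the column order must be listed.
        if not columns:
            raise ValueError("csv input needs an explicit 'columns' list")
        parse_spec["columns"] = columns

    return {
        "type": "index_hadoop",
        "spec": {
            "ioConfig": {
                "type": "hadoop",
                # The location of the input files goes here, not on the curl line.
                "inputSpec": {"type": "static", "paths": hdfs_path},
            },
            "dataSchema": {
                "dataSource": data_source,
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "intervals": [interval],
                },
                "parser": {"type": "hadoopyString", "parseSpec": parse_spec},
                "metricsSpec": metrics,
            },
            "tuningConfig": {"type": "hadoop"},
        },
    }


if __name__ == "__main__":
    task = build_index_task(
        # Same name as the dataSource used in the timeseries query in this reply.
        data_source="pageviews.json",
        hdfs_path="/user/druid/pageviews.json",  # assumed HDFS location
        interval="2015-09-01/2015-09-02",
        dimensions=["url", "user"],
        metrics=[
            {"name": "views", "type": "count"},
            {"name": "latencyMs", "type": "doubleSum", "fieldName": "latencyMs"},
        ],
    )
    print(json.dumps(task, indent=2))
```

Redirecting the script's output to my-index-task.json gives you the file that is posted in the next step. The data source name you choose here is the name you query later.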

Finally, send an HTTP request to your Druid Overlord server to submit the task:

curl -X 'POST' -H 'Content-Type:application/json' -d @my-index-task.json druidoverlordserver:8090/druid/indexer/v1/task
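If you would rather submit the task from a script and then check on it, something like the following sketch might help. It assumes the Overlord answers the POST with a small JSON object holding the task id under "task", and that the status can be read from /druid/indexer/v1/task/<taskId>/status. The function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def submit_request(overlord, task_spec):
    """Build the POST request that submits an index task to the Overlord."""
    return urllib.request.Request(
        f"http://{overlord}/druid/indexer/v1/task",
        data=json.dumps(task_spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def status_request(overlord, task_id):
    """Build the GET request that asks the Overlord for a task's status."""
    quoted = urllib.parse.quote(task_id, safe="")
    return urllib.request.Request(
        f"http://{overlord}/druid/indexer/v1/task/{quoted}/status"
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

With the task file loaded into a dict named spec, send(submit_request("druidoverlordserver:8090", spec)) submits it, and passing the returned task id to status_request lets you poll until the task has finished.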

If you want to query the stored data, you can also do this with HTTP requests. Choose the right query type for your operation.

More information here:

http://druid.io/docs/latest/querying/querying

Here is an example for the timeseries query (query-file.json):

{
	"queryType": "timeseries",
	"dataSource": "pageviews.json",
	"granularity": {"type": "duration", "duration": 3600000},
	"descending": true,
	"aggregations": [
		{
			"type": "count",
			"name": "totalCount"
		}
	],
	"intervals": [ "2015-09-01T00:00:00.000Z/2015-09-02T00:00:00.000Z" ]
}

Then send the HTTP request to your Druid Broker server:

curl -X POST 'druidbrokerserver:8082/druid/v2/?pretty' -H 'Content-Type:application/json' -d @query-file.json
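The same query can be sent from Python if you want to work with the result in a script. This is only a sketch: it assumes the Broker returns a JSON list with one entry per time bucket, each holding a "timestamp" and a "result" object keyed by aggregator name. The function names are hypothetical.

```python
import json
import urllib.request


def build_timeseries_query(data_source, interval, agg_name="totalCount"):
    """Build an hourly timeseries count query like the one in query-file.json."""
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": {"type": "duration", "duration": 3600000},
        "descending": True,
        "aggregations": [{"type": "count", "name": agg_name}],
        "intervals": [interval],
    }


def run_query(broker, query):
    """POST the query to the Broker and return the decoded JSON response."""
    req = urllib.request.Request(
        f"http://{broker}/druid/v2/",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def counts_by_bucket(rows, agg_name="totalCount"):
    """Turn the assumed response shape into (timestamp, count) pairs."""
    return [(row["timestamp"], row["result"][agg_name]) for row in rows]
```

For example, counts_by_bucket(run_query("druidbrokerserver:8082", query)) would give one pair per hour in the queried interval.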

It is also possible to use the Hive SQL layer to query data stored in Druid via SQL statements. To do that, you have to enable HiveServer2 Interactive. Then you can create an external table with the DruidStorageHandler and point it at the existing Druid data source:

CREATE EXTERNAL TABLE druid_wikiticker
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "wikiticker");

I hope this helps. Please go to http://druid.io/ for more information.

Regards,

Michael

Re: how to store data in druid ?

Expert Contributor

@Michael Graml I have CSV files in HDFS and I want to ingest them into Druid.
In this command:
curl -X 'POST' -H 'Content-Type:application/json' -d @my-index-task.json druidoverlordserver:8090/druid/indexer/v1/task

do I give the HDFS path of my files in the -d parameter?

Re: how to store data in druid ?

Expert Contributor

@Michael Graml

How do I create the index.json file?