Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Hive adds timezone to Druid interval question

Hive adds timezone to Druid interval question

Contributor

I’m struggling with getting Hive to work with Druid. So far, I’ve got a working connection between the two and I can create datasources in Druid from Hive, and I can also (with some workaround) query the data from Hive. But not using the “avg” aggregation on one of the measures. If I do an explain on the query in hive, I get the following Json (after making it a bit easier to read)

{ "queryType":"select", "dataSource":"druid_sqoop_import_02", "descending":false, "intervals":[ "1900-01-01T00:00:00.000/3000-01-01T00:00:00.000" ], "filter":{ "type":"selector", "dimension":"target_database", "value":"ABC" }, "dimensions":[ "import_type", "target_database", "target_tblname", "source_database", "source_tblschema", "source_tblname" ], "metrics":[ "sqoop_duration", "merge_duration", "concatenate_duration", "sqoop_mappers", "sqoop_rows", "sqoop_size" ], "granularity":"all", "pagingSpec":{ "threshold":16384, "fromNext":true }, "context":{ "druid.query.fetch":false } }

If I run a normal Json REST call to druid, this query returns the values correctly. But from Hive, I get the following error.

java.io.IOException: org.apache.hive.druid.com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')

If I look at the actual error message returned from the Druid Broker, it says the following.

<html> <head> <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/> <title>Error 500 </title> </head> <body> <h2>HTTP ERROR: 500</h2> <p>Problem accessing /druid/v2/datasources/druid_sqoop_import_02/candidates. Reason: <pre> java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 01:00" is malformed at " 01:00"</pre></p> <hr /><i><small>Powered by Jetty://</small></i> </body> </html>

So for some reason, it’s trying to add what I guess is the timezone as I’m living in UTC+1 to the end of the timestamp. I have also tried to do a where on __time in Hive, but it still adds the “01:00” at the end. Have anybody else had these problems, or can someone give me some pointers on how to solve this one?

3 REPLIES 3

Re: Hive adds timezone to Druid interval question

Contributor

An update

What Hive is doing is calling the /druid/v2/datasources/<DATASOURCE_NAME>/candidates?intervals=1900-01-01T00:00:00.000+01:00/3000-01-01T00:00:00.000+01:00 interface to get a list of candidates. According to the broker documentation (http://druid.io/docs/latest/design/broker.html), the intervals should support ISO8601 standard for timestamps and that includes the “+00:01” at the end for timezone.

When I call the Broker directly with a simple curl command, I get the following response back from the broker.

<html>

<head>

<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>

<title>Error 500 </title>

</head>

<body>

<h2>HTTP ERROR: 500</h2>

<p>Problem accessing /druid/v2/datasources/<DATASOURCE_NAME>/candidates. Reason:

<pre> java.lang.IllegalArgumentException: Invalid format: "1900-01-01T00:00:00.000 01:00" is malformed at " 01:00"</pre></p>

<hr /><i><small>Powered by Jetty://</small></i>

</body>

</html>

If I remove the timezone in the request (/druid/v2/datasources/<DATASOURCE_NAME>/candidates?intervals=1900-01-01T00:00:00.000/3000-01-01T00:00:00.000), the candidates returns correctly.

Re: Hive adds timezone to Druid interval question

Contributor
Highlighted

Re: Hive adds timezone to Druid interval question

New Contributor
Don't have an account?
Coming from Hortonworks? Activate your account here