Support Questions

Hive & Druid Integration Deserialize Error

Expert Contributor

Hi,

Following the Hortonworks blog and its instructions, I successfully created my Hive table cube index in Druid and ran some simple queries. But I get a deserialization error when running query Q4.2. Has anyone seen this before? I am using HDP 2.6.1.

INFO  : We are setting the hadoop caller context from HIVE_SSN_ID:152d59b7-0b88-46b2-b419-a6e457e9a06d to hive_20170731043035_22fc1b1b-6779-430d-a9c4-d31c2dee6eec
INFO  : Semantic Analysis Completed
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:d_year, type:string, comment:null), FieldSchema(name:s_nation, type:string, comment:null), FieldSchema(name:p_category, type:string, comment:null), FieldSchema(name:profit, type:float, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=hive_20170731043035_22fc1b1b-6779-430d-a9c4-d31c2dee6eec); Time taken: 0.34 seconds
INFO  : We are resetting the hadoop caller context to HIVE_SSN_ID:152d59b7-0b88-46b2-b419-a6e457e9a06d
INFO  : Setting caller context to query id hive_20170731043035_22fc1b1b-6779-430d-a9c4-d31c2dee6eec
INFO  : Executing command(queryId=hive_20170731043035_22fc1b1b-6779-430d-a9c4-d31c2dee6eec): select
  d_year, s_nation, p_category,
  sum(net_revenue) as profit
from
  ssb_druid
where
  c_region = 'AMERICA' and
  s_region = 'AMERICA' and
  (d_year = '1997' or d_year = '1998') and
  (p_mfgr = 'MFGR#1' or p_mfgr = 'MFGR#2')
group by
  d_year, s_nation, p_category
INFO  : Resetting the caller context to HIVE_SSN_ID:152d59b7-0b88-46b2-b419-a6e457e9a06d
INFO  : Completed executing command(queryId=hive_20170731043035_22fc1b1b-6779-430d-a9c4-d31c2dee6eec); Time taken: 0.009 seconds
INFO  : OK
Error: java.io.IOException: org.apache.hive.druid.com.fasterxml.jackson.databind.JsonMappingException: Can not deserialize instance of java.util.ArrayList out of START_OBJECT token
 at [Source: org.apache.hive.druid.com.metamx.http.client.io.AppendableByteArrayInputStream@2bf8ee93; line: -1, column: 4] (state=,code=0)

5 REPLIES

Expert Contributor

@yjiang I have seen this exception happen when the Druid broker/historical nodes return exceptions instead of query results. To debug this, I would send a test query to Druid via curl. You can run an EXPLAIN on the query you posted; that will show you the actual Druid query. Copy it and send it via curl:

curl -X POST '<queryable_host>:<port>/druid/v2/?pretty' -H 'Content-Type:application/json' -d @<query_json_file>

Please let me know what the outcome is.

Expert Contributor

@Slim

Here is my query JSON from Hive EXPLAIN:

{
  "queryType": "groupBy",
  "dataSource": "ssb_druid",
  "granularity": "all",
  "dimensions": [
    {"type": "default", "dimension": "d_year"},
    {"type": "default", "dimension": "p_category"},
    {"type": "default", "dimension": "s_nation"}
  ],
  "limitSpec": {"type": "default"},
  "filter": {
    "type": "and",
    "fields": [
      {"type": "or", "fields": [
        {"type": "selector", "dimension": "d_year", "value": "1997"},
        {"type": "selector", "dimension": "d_year", "value": "1998"}
      ]},
      {"type": "or", "fields": [
        {"type": "selector", "dimension": "p_mfgr", "value": "MFGR#1"},
        {"type": "selector", "dimension": "p_mfgr", "value": "MFGR#2"}
      ]},
      {"type": "selector", "dimension": "c_region", "value": "AMERICA"},
      {"type": "selector", "dimension": "s_region", "value": "AMERICA"}
    ]
  },
  "aggregations": [
    {"type": "doubleSum", "name": "$f3", "fieldName": "net_revenue"}
  ],
  "intervals": ["1900-01-01T00:00:00.000/3000-01-01T00:00:00.000"]
}

And here is my curl output:

curl -X POST 'http://host:8082/druid/v2/?pretty' -H 'Content-Type:application/json' -d @./q4.2-druid.json
{
  "error" : "Unknown exception",
  "errorMessage" : "Failure getting results from[http://druid_host:8083/druid/v2/] because of [Invalid type marker byte 0x3c for expected value token\n at [Source: java.io.SequenceInputStream@1f0ff6f5; line: -1, column: 1]]",
  "errorClass" : "com.metamx.common.RE",
  "host" : null
}
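For what it's worth, the "Invalid type marker byte 0x3c" in that error is a hint: the broker expects binary (Smile-encoded) results from the historical, and 0x3c is the ASCII code for '<'. A response starting with '<' suggests the historical sent back an HTML/plain-text error page instead of query results (my reading of the error message, not confirmed against the Druid source):

```python
# 0x3c is not a valid Smile/JSON type marker; it is the ASCII '<' character.
# A response body starting with '<' is almost certainly an HTML error page,
# which would fit the historical throwing an exception server-side.
first_byte = 0x3c
print(chr(first_byte))  # → <
```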

Expert Contributor

I also got an exception in my Druid historical log:

2017-08-02T03:08:47,572 INFO [groupBy_ssb_druid_[1997-09-01T00:00:00.000Z/1997-10-01T00:00:00.000Z]] io.druid.offheap.OffheapBufferGenerator - Allocating new intermediate processing buffer[77] of size[1,073,741,824]
2017-08-02T03:08:47,573 ERROR [processing-6] io.druid.query.GroupByMergedQueryRunner - Exception with one of the sequences!
java.lang.NullPointerException
	at io.druid.query.aggregation.DoubleSumAggregatorFactory.factorize(DoubleSumAggregatorFactory.java:59) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.segment.incremental.OnheapIncrementalIndex.factorizeAggs(OnheapIncrementalIndex.java:222) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.segment.incremental.OnheapIncrementalIndex.addToFacts(OnheapIncrementalIndex.java:186) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.segment.incremental.IncrementalIndex.add(IncrementalIndex.java:492) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.groupby.GroupByQueryHelper$3.accumulate(GroupByQueryHelper.java:127) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.groupby.GroupByQueryHelper$3.accumulate(GroupByQueryHelper.java:119) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at com.metamx.common.guava.BaseSequence.accumulate(BaseSequence.java:67) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ConcatSequence$1.accumulate(ConcatSequence.java:46) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ConcatSequence$1.accumulate(ConcatSequence.java:42) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.MappingAccumulator.accumulate(MappingAccumulator.java:39) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.FilteringAccumulator.accumulate(FilteringAccumulator.java:40) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.MappingAccumulator.accumulate(MappingAccumulator.java:39) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.BaseSequence.accumulate(BaseSequence.java:67) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.MappedSequence.accumulate(MappedSequence.java:40) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ResourceClosingSequence.accumulate(ResourceClosingSequence.java:38) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.FilteredSequence.accumulate(FilteredSequence.java:42) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.MappedSequence.accumulate(MappedSequence.java:40) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ResourceClosingSequence.accumulate(ResourceClosingSequence.java:38) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ConcatSequence.accumulate(ConcatSequence.java:40) ~[java-util-0.27.10.jar:?]
	at com.metamx.common.guava.ResourceClosingSequence.accumulate(ResourceClosingSequence.java:38) ~[java-util-0.27.10.jar:?]
	at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:104) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.MetricsEmittingQueryRunner$1.accumulate(MetricsEmittingQueryRunner.java:104) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.spec.SpecificSegmentQueryRunner$2$1.call(SpecificSegmentQueryRunner.java:87) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:171) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.spec.SpecificSegmentQueryRunner.access$400(SpecificSegmentQueryRunner.java:41) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.spec.SpecificSegmentQueryRunner$2.doItNamed(SpecificSegmentQueryRunner.java:162) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.spec.SpecificSegmentQueryRunner$2.accumulate(SpecificSegmentQueryRunner.java:80) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.CPUTimeMetricQueryRunner$1.accumulate(CPUTimeMetricQueryRunner.java:81) ~[druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at com.metamx.common.guava.Sequences$1.accumulate(Sequences.java:90) ~[java-util-0.27.10.jar:?]
	at io.druid.query.GroupByMergedQueryRunner$1$1.call(GroupByMergedQueryRunner.java:120) [druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at io.druid.query.GroupByMergedQueryRunner$1$1.call(GroupByMergedQueryRunner.java:111) [druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
	at io.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:271) [druid-processing-0.9.2.2.6.1.0-129.jar:0.9.2.2.6.1.0-129]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

Expert Contributor

@yjiang I suspect this NPE is a side effect of running out of memory. Can you share the config of the historical node? I also see that the number of allocated byte buffers is at least 77 × 1 GB; does this machine have that much free RAM? How many physical cores does it have? We usually recommend setting the number of intermediate processing buffers to at most (number of physical cores - 1).

View solution in original post
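As a rough sanity check, the off-heap memory a historical tries to allocate grows with the buffer size times the number of processing threads. This is only a sketch under the Druid 0.9.x groupBy sizing rule of thumb (one intermediate buffer per processing thread, plus one for merging); the `direct_memory_needed` helper and the 8-core machine are hypothetical:

```python
# Back-of-envelope estimate of off-heap memory a Druid historical needs
# (assumption: one intermediate buffer per processing thread, plus one
# for merging, each of druid.processing.buffer.sizeBytes).
def direct_memory_needed(buffer_size_bytes, num_threads):
    """Approximate direct memory the historical will try to allocate."""
    return buffer_size_bytes * (num_threads + 1)

GB = 1024 ** 3

# The log above shows 1 GiB buffers and at least 77 allocations -- far
# more than most machines can back.  With the recommended
# numThreads = cores - 1 and a smaller buffer, e.g. on an 8-core box:
cores = 8  # hypothetical machine
need = direct_memory_needed(512 * 1024 * 1024, cores - 1)
print(need / GB)  # → 4.0
```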

Expert Contributor

@Slim Thanks for your advice. You are right: the NPE is a side effect of running out of memory. It works perfectly (responses under 1 s with cached data) after reducing the Druid historical buffer size.