“Invalid type marker byte 0x3c” when accessing from Hive

Contributor

Hello

I have Hortonworks HDP 2.6 installed together with the Druid version that ships with it. Druid is installed on a Kerberos-secured cluster, and I'm having problems accessing it from Hive. I can create a Druid datasource from Hive using a normal CREATE TABLE statement, but I can't do a SELECT on it. Once the datasource is created in Druid, I can run a normal JSON REST query directly against Druid and get the expected result back, so the Druid side seems to be working as it should.
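
For illustration, the create statement looks roughly like this (table and column names are made up here; the Hive-Druid CTAS needs a `__time` timestamp column as its first column):

    CREATE TABLE druid_test
    STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
    TBLPROPERTIES ("druid.segment.granularity" = "DAY")
    AS
    SELECT
      CAST(event_ts AS timestamp) AS `__time`, -- required timestamp column
      page AS dim_page,                        -- string columns become Druid dimensions
      added AS m_added                         -- numeric columns become Druid metrics
    FROM source_table;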

When I query the data from Hive, I get two different error outputs that say the same thing:

Error: java.io.IOException: org.apache.hive.druid.com.fasterxml.jackson.core.JsonParseException: Invalid type marker byte 0x3c for expected value token at [Source: org.apache.hive.druid.com.metamx.http.client.io.AppendableByteArrayInputStream@245e6e5b; line: -1, column: 1] (state=,code=0)

or

Error: java.io.IOException: java.io.IOException: org.apache.hive.druid.com.fasterxml.jackson.core.JsonParseException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: org.apache.hive.druid.com.metamx.http.client.io.AppendableByteArrayInputStream@6dd4a729; line: 1, column: 2]
    at org.apache.hive.druid.com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1576)
    at org.apache.hive.druid.com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:533)
    at org.apache.hive.druid.com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:462)
    at org.apache.hive.druid.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2610)
    at org.apache.hive.druid.com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:841)
    at org.apache.hive.druid.com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:737)
    at org.apache.hive.druid.com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3090)
    at org.apache.hive.druid.com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3036)
    at org.apache.hive.druid.com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2199)
    at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.distributeSelectQuery(DruidQueryBasedInputFormat.java:227)
    at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getInputSplits(DruidQueryBasedInputFormat.java:160)
    at org.apache.hadoop.hive.druid.io.DruidQueryBasedInputFormat.getSplits(DruidQueryBasedInputFormat.java:104)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextSplits(FetchOperator.java:372)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:304)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:459)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:428)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:146)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1932)
    at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:482)
    at org.apache.hive.service.cli.operation.OperationManager.getOperationNextRowSet(OperationManager.java:311)
    at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:856)
    at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:552)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:715)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1717)
    at org.apache.hive.service.rpc.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1702)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.thrift.server.TServlet.doPost(TServlet.java:83)
    at org.apache.hive.service.cli.thrift.ThriftHttpServlet.doPost(ThriftHttpServlet.java:206)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:755)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:565)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:479)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:225)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1031)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:965)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:349)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:449)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:925)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:76)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:609)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:45)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745) (state=,code=0)

Does anybody have any idea about what's wrong here?

4 REPLIES

Contributor

An update!

I did a network packet capture to see what Druid is returning to Hive, and I actually get an authentication error (see the output below). Most likely this is due to Kerberos being configured in the cluster. It's strange that I can create the datasource from Hive without getting this error. Now I'm hoping it's just a matter of setting the correct Hive configuration to connect to a Kerberos-enabled Druid installation. If anybody has a hint, please let me know.

<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
<title>Error 401 </title>
</head>
<body>
<h2>HTTP ERROR: 401</h2>
<p>Problem accessing /druid/v2/. Reason:
<pre> Authentication required</pre></p>
<hr /><i><small>Powered by Jetty://</small></i>
</body>
</html>

Expert Contributor

Hi, it seems like your Druid instance is running on a secured cluster. Unfortunately, the Hive-Druid integration does not support Kerberos yet; it is coming soon.

Expert Contributor

To make sure I am getting this: how do you query the data via the REST call? Are you including your Kerberos credentials?

Contributor

Yes. When I make the REST call with curl, I include the Kerberos credentials, and it works fine.
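
Roughly what that looks like (broker host, port, and query file are placeholders here; curl handles SPNEGO when you pass --negotiate and have a valid ticket from kinit):

    # Without credentials, the broker answers with the HTML 401 page shown above:
    curl -X POST -H 'Content-Type: application/json' \
         -d @query.json http://broker-host:8082/druid/v2/

    # With a Kerberos ticket, the same query returns the expected JSON result:
    kinit myuser@EXAMPLE.COM
    curl --negotiate -u : -X POST -H 'Content-Type: application/json' \
         -d @query.json http://broker-host:8082/druid/v2/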

To be able to move forward with the rest of the tests, I made a really, REALLY ugly workaround (not for production, just the test environment): I added /druid to druid.hadoop.security.spnego.excludedPaths, so the query works from Hive without hitting the unsupported-Kerberos problem. Not something I can recommend, but it lets us test the rest of the solution until the Kerberos problem is fixed.
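
If anyone wants to reproduce the workaround, the setting looks roughly like this (the exact value format here is from memory; set it through the Druid config in Ambari, and be aware it turns off SPNEGO authentication for the whole query endpoint):

    # Druid common runtime properties -- test environments only!
    druid.hadoop.security.spnego.excludedPaths=["/druid"]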
