Hive fails with IOException: Error writing to server on really long query

Expert Contributor

Using HDP 2.3.2 sandbox. This is the second error I got trying to get this working. The first error and "solution" can be found here.

https://community.hortonworks.com/questions/18007/hive-fails-with-hive-internal-error-message-full-h...

I have an external table defined over a folder that contains XML documents. The table has a single column, which holds each document's data as a string.

I am trying to create a view on top of the XML data with xpaths. So for example,

CREATE VIEW myview (column1,...Column N) AS SELECT xpath_string(rawxml, '/my/xpath/to/value'), xpath_string(rawxml, '/another/xpath') FROM myxmltable;

The XML document has 400+ xpaths that I want to grab and put into the view. I can do about 60 columns worth of xpaths before I get this error.

FAILED: Hive Internal Error: com.sun.jersey.api.client.ClientHandlerException(java.io.IOException: java.io.IOException: Error writing to server)
com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: java.io.IOException: Error writing to server
  at com.sun.jersey.client.urlconnection.URLConnectionClientHandler.handle(URLConnectionClientHandler.java:149)
  at com.sun.jersey.api.client.Client.handle(Client.java:648)

My cursory research seems to indicate that the query string is too long and is breaking something. I am writing these queries in the Hive CLI, so I'm not sure how else I can fix this. I also tried using beeline and got the same error.


Master Mentor

@kevin vasko Can you write it as a script and execute the script instead of running it as one query? You can pass the script to beeline or the hive shell, or use the source command within the CLI.
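
One way to follow this suggestion is to generate the long statement into a script file instead of typing it. Below is a minimal sketch in Python, assuming the column-to-xpath mapping is kept in a dict. The helper name build_view_sql and the column names are hypothetical; myview, myxmltable, and rawxml are taken from the question.

```python
def build_view_sql(view, table, xml_column, xpaths):
    """Build a CREATE VIEW statement with one xpath_string() call per column.

    `xpaths` maps an output column name to its XPath expression.
    Assumption: the XPath expressions contain no single quotes, so no
    escaping is done here.
    """
    names = ", ".join(xpaths)
    selects = ",\n  ".join(
        f"xpath_string({xml_column}, '{path}')" for path in xpaths.values()
    )
    return (
        f"CREATE VIEW {view} ({names}) AS\n"
        f"SELECT\n  {selects}\n"
        f"FROM {table};\n"
    )


# Hypothetical column names; the real list would have 400+ entries.
example_columns = {
    "column1": "/my/xpath/to/value",
    "column2": "/another/xpath",
}
example_sql = build_view_sql("myview", "myxmltable", "rawxml", example_columns)
```

Writing example_sql to a file (say create_view.hql, a made-up name) lets you run it with hive -f create_view.hql, with beeline -u <jdbc-url> -f create_view.hql, or with source create_view.hql; from inside the CLI. This only changes how the statement is submitted, not how long it is.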

Expert Contributor

I have not tried. I'll try it and see.

Master Mentor

Expert Contributor

Thanks, but I get the same issue. How can I increase the maximum length of the query string that Hive will accept?

I created a simple UDF that takes the XML string as input, does all the xpath parsing on that document, and returns a map type. I was hoping that getting rid of all the xpath calls would eliminate the issue, but it didn't work. I can now do SELECT m["key"] FROM (SELECT myfunc(xmldata) AS m FROM xmlSource) t. But when I do SELECT m["key1"], ..., m["key400"] FROM (...) t, I'm back at the "full HEAD" issue for some reason.

Master Mentor

Expert Contributor

Since there isn't really any hard limit, and 400 columns shouldn't be enough to cause out-of-memory issues, I'm not quite sure what else to do. To me this looks purely like a configuration problem or a bug in Hive or its dependencies. I posted this issue on the user mailing list but I haven't heard anything. Any suggestions?

Master Mentor

@Kevin Vasko I need more information on this.

Please share more of the log output around this error:

com.sun.jersey.api.client.ClientHandlerException: java.io.IOException: java.io.IOException: Error writing to server

Expert Contributor

@Neeraj Sabharwal There are two errors I've been fighting while trying to access all of these columns in the same query. I *thought* I had a workaround for the second one, disabling security (unchecking the security box for Hive in Ambari), but it keeps coming back. The defect I think I'm hitting for the full HEAD issue is HIVE-11720, linked below.

UPDATE: I'm about 99.99% sure I figured out the problem! I started looking further into the ERROR logs. The line "at org.apache.atlas.security.SecureClientUtils$1$1.run(SecureClientUtils.java:103)" tipped me off that Atlas was somehow being called. I disabled Atlas by stopping the Atlas service and removing this setting:

hive.exec.failure.hooks=org.apache.atlas.hive.hook.HiveHook

I ran my entire query and it worked without issue! I would venture to say that this is an issue with Atlas not being able to handle really long queries.
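
To check whether an Atlas hook is still configured, something like the following Python sketch could scan the contents of hive-site.xml. It assumes the standard Hadoop configuration layout (property elements with name and value children) and that hook settings have names ending in ".hooks". The function name find_atlas_hooks is hypothetical, and on an Ambari-managed cluster the setting would normally be changed through Ambari rather than by editing the file by hand.

```python
import xml.etree.ElementTree as ET


def find_atlas_hooks(hive_site_xml):
    """Return {property name: value} for hook properties that mention Atlas.

    `hive_site_xml` is the text (or bytes) of a hive-site.xml file.
    Assumption: hook properties have names ending in ".hooks", as in
    hive.exec.failure.hooks from the post above.
    """
    root = ET.fromstring(hive_site_xml)
    hits = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        # Match case-insensitively on the class name in the value.
        if name.endswith(".hooks") and "atlas" in value.lower():
            hits[name] = value
    return hits
```

An empty result would suggest that no hook property in that file still references an Atlas class.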

https://issues.apache.org/jira/browse/HIVE-11720

1. Error writing to server

https://gist.github.com/kur1j/513e5a1499eef6c727a1

2. FULL head

https://gist.github.com/kur1j/217eae2065c7953d9cf7