Created 04-15-2015 09:24 AM
Using the Beeswax editor in Hue, when a Hive query result contains a value longer than 64K characters in a string field, that value is truncated. Only the last n characters of the string are included in the result column, where n = resultlength mod 65536. Interestingly, the column still includes enough blank space for the missing characters, so it is necessary to scroll far to the right to see the portion of the string that is included in the result.
This does not occur when running the same query from the Hive console. When the query is run in Hue, any string functions applied to that field are evaluated against the full length of the string, so the only issue seems to be with the displayed result. It looks as though a 64K buffer is repeatedly filled and purged until the end of the string is reached, and then whatever is left in the buffer is written to the Hue result table.
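To make the guess above concrete, here is a minimal Python sketch of the buffer behavior I am hypothesizing. It is only a model of the symptom, not Hue's actual code; the function name `displayed_tail` and the constant `BUFFER_SIZE` are made up for illustration.

```python
BUFFER_SIZE = 65536  # 64K, the buffer size assumed in the hypothesis


def displayed_tail(value: str, buffer_size: int = BUFFER_SIZE) -> str:
    """Model a buffer that is purged every time it fills up.

    Returns whatever is left in the buffer once the whole string has
    been consumed, i.e. the last len(value) % buffer_size characters.
    """
    buffer = []
    for ch in value:
        buffer.append(ch)
        if len(buffer) == buffer_size:
            # Hypothetical "purge": everything buffered so far is lost.
            buffer.clear()
    return "".join(buffer)


if __name__ == "__main__":
    # A synthetic value slightly longer than one buffer.
    sample = "a" * BUFFER_SIZE + "tail"
    print(displayed_tail(sample))
```

If this model is right, any value whose length is an exact multiple of 65536 would come back empty, which would be an easy case to test.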
Is there a configuration setting that can be changed to resolve this?
Running Hue 3.6.0 with hive-common-0.12.0-cdh5.1.3.jar
Created 04-20-2015 12:49 AM
Created 04-21-2015 01:06 PM
We don't use beeline, but we don't see this issue when using the 'hive' command, either. The only place we see it is in Hue. It seems that the issue is with Hue, not with Hive.
Created 04-22-2015 06:51 AM
Created 04-23-2015 08:37 AM
Romain,
Took a little work to figure out how to get beeline working in our environment, but once I did, I was able to determine that this issue does not occur with beeline.
Here's some more detail:
I was able to run a query against a logfile that returned a text field containing 303,635 characters. The resulting .csv file contained all 303,635 characters. Then I ran the same query using hive and got the same result (the only difference being that the beeline result enclosed the text in single quotes where hive did not, which I believe is expected since we are running 0.12.0 and thus do not have the csv2 outputformat available). Finally, I ran the same query in the Hue beeswax editor and the resulting value contained only the last 41,491 characters of the text field. In the Hue UI, there was blank space displayed equal to the space the first 262,144 characters would have taken up, i.e. I had to horizontally scroll past lots of blank space before I could see the last 41,491 characters. The scroll button was ~80% of the way across the scroll bar before any non-blank characters appeared, which is roughly the percentage of the text field that was not displayed.
Note: 303,635 mod 65,536 = 41,491.
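The arithmetic can be double-checked with a few lines of Python. This is just a worked check of the numbers quoted above; the variable names are mine.

```python
TOTAL_CHARS = 303_635   # length of the text field returned by beeline and hive
BUFFER_SIZE = 65_536    # 64K

# How many full 64K blocks fit, and how many characters are left over.
full_blocks, remainder = divmod(TOTAL_CHARS, BUFFER_SIZE)

hidden_chars = full_blocks * BUFFER_SIZE      # characters shown as blank space
hidden_fraction = hidden_chars / TOTAL_CHARS  # share of the field not displayed

print(full_blocks, remainder, hidden_chars, round(hidden_fraction, 3))
```

So exactly four full 64K blocks (262,144 characters) are missing and 41,491 characters remain, with roughly 86% of the field not displayed, which is in line with my rough reading of the scroll bar position.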
Here are the commands I used:
beeline:
beeline -u <url> -n <user> -p <password> -d org.apache.hive.jdbc.HiveDriver -e "select text from logs where dt='20150413' and eventguid='I0ace7c730000014cb239bdd3e9f81e7a'" --outputformat=csv > beelinetest
hive:
hive -e "select text from logs where dt='20150413' and eventguid='I0ace7c730000014cb239bdd3e9f81e7a'" > hivetest
Hue beeswax editor:
select text from logs where dt='20150413' and eventguid='I0ace7c730000014cb239bdd3e9f81e7a'
Created 04-24-2015 12:15 AM