Member since: 10-13-2016
Posts: 9
Kudos Received: 2
Solutions: 0
04-14-2020
12:10 AM
You can use .repartition(1) on the DataFrame, e.g. df.repartition(1) ...
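A minimal runnable sketch of that suggestion, assuming the goal is to produce a single output file; the input/output paths and formats are hypothetical placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical input path; any DataFrame source works the same way.
df = spark.read.parquet("/data/input")

# repartition(1) shuffles everything into one partition, so the write produces a single file.
df.repartition(1).write.mode("overwrite").csv("/data/output_single_file", header=True)

Keep in mind that repartition(1) funnels all rows through a single task, so it only makes sense for outputs small enough for one executor to handle.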
09-15-2017
06:46 AM
1 Kudo
@Gundrathi babu you can try this with groupBy and filter in PySpark, which you mentioned in your question. Sample: grp = df.groupBy("id").count() (count() takes no arguments), then fil = grp.filter(grp["id"] == "") (DataFrame.filter expects a column expression rather than a lambda). fil will then hold the grouped result with the counts. Hope it helps!! I don't have a running Spark cluster handy to verify the code, but this flow should help you solve the issue.
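A fuller sketch of that flow, still unverified against a live cluster; the "id" column comes from the snippet above, the sample data is made up, and the empty-id filter mirrors the original lambda, so adjust the condition to whatever the question actually needs:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data standing in for the questioner's DataFrame.
df = spark.createDataFrame(
    [("a", 1), ("a", 2), ("", 3), ("b", 4)],
    ["id", "value"],
)

# Count rows per id.
grp = df.groupBy("id").count()

# Keep only the groups of interest; here, groups whose id is the empty string.
fil = grp.filter(F.col("id") == "")

fil.show()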
09-12-2017
10:03 AM
Hi @Gundrathi babu By using coalesce/repartition you are re-distributing the data across partitions. Whereas if the data is stored in Hive as separate partitions, there is metadata available in HCatalog/the Hive metastore that lets Hive get the count much faster than Spark. If you want the row count of each partition, and assuming table stats are enabled/collected, it will again perform better than Spark. Spark, on the other hand, does not maintain separate metadata of its own, and that is the reason for the performance difference. Hope it helps!!
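As an illustration of the difference, here is a sketch with hypothetical table and partition column names; it collects per-partition statistics into the metastore and then asks for per-partition counts. Whether the counts are actually answered from metadata depends on the engine and its settings (e.g. Hive's stats-based query answering), whereas a plain Spark count() always scans the underlying files:

from pyspark.sql import SparkSession

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Collect per-partition statistics into the Hive metastore
# (my_db.my_table and the dt partition column are hypothetical).
spark.sql("ANALYZE TABLE my_db.my_table PARTITION (dt) COMPUTE STATISTICS")

# Per-partition row counts; with stats collected these can be served largely from metadata.
spark.sql("SELECT dt, COUNT(*) AS row_count FROM my_db.my_table GROUP BY dt").show()

# The equivalent pure-Spark count has no such metadata and scans the files on every run.
print(spark.table("my_db.my_table").count())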
02-08-2017
03:36 PM
1 Kudo
As for >2 GB blobs, Hive STRING or even BINARY won't handle them, AFAIK. But that is just from googling, so Hive experts please add your thoughts. Please note that the "InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit." part of your stack trace tells you that you hit the limits of Protocol Buffers, not a Hive field type limitation. That could explain the 500 MB limit you observed in your investigation. In the Hive code, in the ORC input stream implementation, I could see that there is a 1 GB protobuf limit set, but that is for the whole message, and the blob is only a part of it.