New Contributor
Posts: 2
Registered: ‎04-26-2018

How to store PDF/images in Hive table?

Hi,

In my use case, I need to store one or more documents (PDFs or images) in a Hive table, each associated with an ID.

I have tried to find a way but could not find a precise solution.

Can somebody please let me know what the DDL and DML statements would be for this purpose?

Many thanks in advance.

Champion
Posts: 735
Registered: ‎05-16-2016

Re: How to store PDF/images in Hive table?

If you just want to load the image into the Hive table and not process it further, use the BINARY data type. To view or process it, though, you will need something like Java code to convert the file into binary data before you load it into the Hive table.

Something like:

CREATE TABLE imageTable (id INT, picture BINARY);
-- Load statement (LOAD DATA INPATH ... INTO TABLE imageTable): point it to the image location ..
-- The table can be either managed or external.
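
The conversion step mentioned above could look roughly like the sketch below. It is written in Python rather than Java, and it is only one possible approach: each file is base64-encoded into a tab-delimited staging file, which is then loaded into a STRING column and decoded with Hive's built-in unbase64() function. The staging table image_staging, its column picture_b64, and the path /tmp/images.tsv are hypothetical names made up for illustration and do not come from this thread.

```python
import base64
import tempfile
from pathlib import Path


def write_staging_file(documents, out_path):
    """Write one 'id<TAB>base64(file bytes)' line per (id, path) pair."""
    with open(out_path, "w", encoding="ascii", newline="\n") as out:
        for doc_id, doc_path in documents:
            payload = Path(doc_path).read_bytes()
            # Base64 output contains no tabs or newlines, so it is safe
            # to store in a tab-delimited text file.
            encoded = base64.b64encode(payload).decode("ascii")
            out.write(f"{doc_id}\t{encoded}\n")
    return out_path


# Assumed follow-up HiveQL (hypothetical table, column and path names):
# load the text file into a staging table, then decode into the BINARY column.
HIVE_STATEMENTS = """
CREATE TABLE image_staging (id INT, picture_b64 STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t';
LOAD DATA LOCAL INPATH '/tmp/images.tsv' INTO TABLE image_staging;
INSERT INTO TABLE imageTable
  SELECT id, unbase64(picture_b64) FROM image_staging;
"""

if __name__ == "__main__":
    # Small self-contained demo using a throwaway file instead of a real PDF.
    workdir = Path(tempfile.mkdtemp())
    sample = workdir / "sample.pdf"
    sample.write_bytes(b"%PDF-1.4 demo bytes")
    staging = write_staging_file([(1, sample)], workdir / "images.tsv")
    print(Path(staging).read_text())
    print(HIVE_STATEMENTS)
```

Going through a STRING staging column keeps the raw bytes out of the delimited text file, where embedded newlines or delimiter characters would otherwise break the rows. Very large documents may be a poor fit for this approach, so treat it as a starting point only.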
New Contributor
Posts: 1
Registered: ‎07-04-2018

Re: How to store PDF/images in Hive table?

The solution above has worked for me for about a year now, so I would recommend doing it that way. One issue I ran into from time to time is that some of the files I stored became corrupted for some reason, and I could only repair them manually with this editor: https://edit-pdf.pdffiller.com/ Any paid editor should work for this purpose, but the others are considerably more expensive. Back to storing: I don't know whether that issue has been fixed yet, so check all the files after the export.
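
One way to do that check, as a rough sketch only: compare a SHA-256 checksum of each original file with the checksum of the copy exported from Hive. This assumes the originals and the exported copies sit in two local directories under the same file names; the function names sha256_of and find_corrupted are made up for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(original_dir, exported_dir):
    """Return names of files whose exported copy is missing or differs."""
    bad = []
    for original in sorted(Path(original_dir).iterdir()):
        if not original.is_file():
            continue
        exported = Path(exported_dir) / original.name
        if not exported.is_file() or sha256_of(original) != sha256_of(exported):
            bad.append(original.name)
    return bad


if __name__ == "__main__":
    # Demo with throwaway directories: one intact copy, one altered copy.
    originals = Path(tempfile.mkdtemp())
    exported = Path(tempfile.mkdtemp())
    (originals / "ok.pdf").write_bytes(b"same bytes")
    (exported / "ok.pdf").write_bytes(b"same bytes")
    (originals / "broken.pdf").write_bytes(b"original bytes")
    (exported / "broken.pdf").write_bytes(b"truncated")
    print(find_corrupted(originals, exported))
```

Any file reported by find_corrupted was changed somewhere between loading and exporting, which at least tells you which documents to re-load from the source.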
