How to store PDF/images in Hive table?

New Contributor

Hi,

In my use case, I need to store one or more documents (PDFs or images) in a Hive table, each associated with an id.

I have tried to find a way but could not find a precise solution.

 

Can somebody please let me know what the DDL and DML statements would be for this purpose?

 

Many thanks in advance.

2 REPLIES

Re: How to store PDF/images in Hive table?

Champion

If you just want to load the image into the Hive table and not process it further, use the binary data type. To view or process it, though, you will need something like Java code to convert the file into binary data before you load it into the Hive table.

 

Something like:

CREATE TABLE imageTable (id INT, picture BINARY);
Load statement: point it at the image location.
The table can be either managed or external.
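
The "convert before you load" step mentioned above could be sketched roughly as follows. This is only one possible approach, not something confirmed in this thread: it assumes the files are small enough to fit in a single row, and it writes each file as one tab-separated line of id and Base64 text so that raw bytes (which may contain newlines or delimiters) cannot break a text-format load. The function names and the staging-file layout are hypothetical. Python is used here for brevity; the same idea applies in Java.

```python
import base64
from pathlib import Path


def write_staging_file(documents, out_path):
    """Write one 'id<TAB>base64' line per (doc_id, file_path) pair.

    Base64 output contains no tabs or newlines, so each document
    occupies exactly one line of the staging file.
    """
    with open(out_path, "w", encoding="ascii", newline="\n") as out:
        for doc_id, file_path in documents:
            raw = Path(file_path).read_bytes()
            payload = base64.b64encode(raw).decode("ascii")
            out.write(f"{doc_id}\t{payload}\n")
    return out_path


def read_staging_file(path):
    """Decode the staging file back into (id, bytes) pairs to verify the round trip."""
    rows = []
    with open(path, encoding="ascii") as handle:
        for line in handle:
            doc_id, payload = line.rstrip("\n").split("\t", 1)
            rows.append((int(doc_id), base64.b64decode(payload)))
    return rows


# Assumed follow-up on the Hive side (table and column names are hypothetical):
#   load the file into a tab-delimited staging table (id INT, payload STRING),
#   then copy it over with Hive's built-in unbase64() function:
#   INSERT INTO TABLE imageTable SELECT id, unbase64(payload) FROM image_staging;
```

Reading the staging file back with `read_staging_file` before loading is a cheap way to confirm that the encoded bytes match the source files.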

Re: How to store PDF/images in Hive table?

New Contributor

The solution above has worked for me for about a year now, so I'd recommend doing it that way. One issue I ran into from time to time is that the files I stored became corrupted for some reason, and I could only repair them manually with this editor: https://edit-pdf.pdffiller.com/ Any paid editor should work for this, but the others are more expensive. Back to storing: I don't know whether that issue has been fixed, so check all the files after the export.
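
One way to check all the files after an export is to compare checksums of the originals against the exported copies. The sketch below assumes both sets of files sit in local directories under the same file names; the function names and directory layout are hypothetical, not part of any Hive tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_corrupted(original_dir, exported_dir):
    """Return names of files whose exported copy is missing or differs."""
    bad = []
    for original in sorted(Path(original_dir).iterdir()):
        if not original.is_file():
            continue
        exported = Path(exported_dir) / original.name
        if not exported.is_file() or sha256_of(original) != sha256_of(exported):
            bad.append(original.name)
    return bad
```

An empty list from `find_corrupted` means every original has a byte-identical exported copy.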