
Does hive support Photo or images datatypes?

1 ACCEPTED SOLUTION

Re: Does hive support Photo or images datatypes?

Mentor

@Christian Lunesa

Yes. You can store images in a BINARY column, as shown below, but retrieving them is a separate process altogether.

Create the table image:

beeline> !connect jdbc:hive2://texas.us.com:10000/default
Enter username for jdbc:hive2://texas.us.com:10000/default: hive
Enter password for jdbc:hive2://texas.us.com:10000/default: ****
Connected to: Apache Hive (version 1.2.1000.2.6.2.0-205)
Driver: Hive JDBC (version 1.2.1000.2.6.2.0-205)
Transaction isolation: TRANSACTION_REPEATABLE_READ
1: jdbc:hive2://texas.us.com:10000/default> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| geolocation    |
+----------------+--+
4 rows selected (2.397 seconds)
use geolocation;
Create table image(picture binary);
show tables;

Loading an image into the table is as simple as a load data statement:

hive> show databases;
OK
default
geolocation
Time taken: 1.955 seconds, Fetched: 4 row(s)
hive> use geolocation;
hive> load data local inpath '/tmp/photo.jpg' into table image; 

Now check that the image was loaded:

hive> select count(*) from image;
Query ID = geolocation_20180420094947_79e8e1fb-dfb3-40c6-949e-3fb61e8bc7d1
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1524208851011_0003)
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 02/02  [==========================>>] 100%  ELAPSED TIME: 5.87 s
--------------------------------------------------------------------------------
OK
19038
Time taken: 10.114 seconds, Fetched: 1 row(s) 

A select will return garbled output, but the image is loaded.
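A note on the count above: 19038 rather than 1 suggests the table uses a newline-delimited text format, so every newline byte inside the JPEG starts a new row. That is my reading of the output, not something confirmed for this cluster. One possible way around it, sketched here in Python, is to Base64-encode the file into a single line before loading it. The helper names count_text_rows and encode_single_row are made up for illustration.

```python
import base64


def count_text_rows(raw: bytes) -> int:
    """Count the rows a newline-delimited text table would see in raw bytes."""
    rows = raw.split(b"\n")
    # A trailing newline terminates the last row; it does not start a new one.
    if rows and rows[-1] == b"":
        rows.pop()
    return len(rows)


def encode_single_row(raw: bytes) -> bytes:
    """Return the bytes as one Base64 line terminated by a single newline."""
    # b64encode never emits newline characters, so the result is one row.
    return base64.b64encode(raw) + b"\n"


if __name__ == "__main__":
    # Stand-in for JPEG bytes; real files contain 0x0A bytes at arbitrary offsets.
    sample = b"\xff\xd8\xff\xe0" + b"ab\ncd\nef" + b"\xff\xd9"
    print(count_text_rows(sample))                     # rows seen in the raw bytes
    print(count_text_rows(encode_single_row(sample)))  # rows after encoding
```

The encoded file can then be loaded with the same load data statement. Whether Hive decodes the Base64 text for a binary column on read depends on the SerDe and version, so check that on your cluster before relying on it.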

To store images or videos directly in HDFS:

hdfs dfs -put /src_image_file /dst_image_file 

If your intent is more than just storing the files, you might find HIPI useful. HIPI is a library for Hadoop's MapReduce framework that provides an API for performing image processing tasks in a distributed computing environment: http://hipi.cs.virginia.edu/

http://www.tothenew.com/blog/how-to-manage-and-analyze-video-data-using-hadoop/

https://content.pivotal.io/blog/using-hadoop-mapreduce-for-distributed-video-transcoding

Hope that helps

2 REPLIES

Re: Does hive support Photo or images datatypes?

Contributor

You can use the BINARY data type in Hive. Store the photo or image as binary in the Hive table, then retrieve it from the query results and display it in your front-end application.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-MiscTypes
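For the retrieval side, here is a rough Python sketch of what the front-end step might look like. It assumes the client receives the column as Base64 text, for example by selecting base64(picture) in the query; how your driver actually returns BINARY values may differ. The function names decode_image, looks_like_jpeg and save_image are hypothetical.

```python
import base64
from pathlib import Path


def decode_image(encoded: str) -> bytes:
    """Decode Base64 text from a query result back into raw image bytes."""
    # strip() drops the surrounding whitespace some clients add;
    # validate=True rejects anything that is not valid Base64.
    return base64.b64decode(encoded.strip(), validate=True)


def looks_like_jpeg(raw: bytes) -> bool:
    """Cheap sanity check: JPEG data starts with FF D8 and ends with FF D9."""
    return raw[:2] == b"\xff\xd8" and raw[-2:] == b"\xff\xd9"


def save_image(encoded: str, target: Path) -> int:
    """Write the decoded image to target and return the number of bytes written."""
    raw = decode_image(encoded)
    if not looks_like_jpeg(raw):
        raise ValueError("decoded data does not look like a JPEG")
    return target.write_bytes(raw)
```

The marker check is only a guard against writing a truncated or mis-decoded value to disk; it does not validate the image itself.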
