How to import data from MongoDB to Hive or Hbase ?

New Contributor

Hi All,

I would like to know how I can import data (documents) from MongoDB into Hive or HBase.

Best Regards

Accepted Solution

Re: How to import data from MongoDB to Hive or Hbase ?

@Hamza FRIOUA

The best option is to use the MongoDB Hadoop connector (mongo-hadoop) with Hive external tables, but you need to build the jar yourself or use a prebuilt one.

https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage

CREATE TABLE individuals
( 
  id INT,
  name STRING,
  age INT,
  work STRUCT<title:STRING, hours:INT>
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'='{"id":"_id","work.title":"job.position"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/test.persons');
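
As a rough sketch of how the table above might be used once it exists: the connector jars have to be visible to the Hive session, and the column mapping then lets Hive column id read the MongoDB field _id and work.title read job.position. The jar paths below are placeholders I made up, not real locations, and the filter is only illustrative.

```sql
-- Hypothetical jar locations; point these at wherever the connector jars actually live.
ADD JAR /tmp/jars/mongo-hadoop-core.jar;
ADD JAR /tmp/jars/mongo-hadoop-hive.jar;
ADD JAR /tmp/jars/mongo-java-driver.jar;

-- Per the mapping above, `id` is read from `_id`
-- and `work.title` is read from `job.position` in each MongoDB document.
SELECT id, name, work.title
FROM individuals
WHERE age >= 30
LIMIT 10;
```

Note that the DDL above uses CREATE TABLE rather than CREATE EXTERNAL TABLE; a later reply in this thread warns that dropping such a table also removes the MongoDB collection, so EXTERNAL is probably the safer choice.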

Re: How to import data from MongoDB to Hive or Hbase ?

New Contributor

I tried the external table method, but I ran out of memory. My Mongo collection (table2) has 10 million records (0.755 GB), and reading from it works. After the insert task fails, a count on the native table (table1) returns 0 rows.

My query looks like this: "INSERT INTO table1 SELECT * FROM table2". If I add "LIMIT 1000" it works, but I need to migrate the entire collection. I attached the output from beeline.

Re: How to import data from MongoDB to Hive or Hbase ?

@Hamza FRIOUA I wrote this a while back for a customer. The versions may have changed, but it should still be relevant. Essentially, it creates a test MongoDB instance, loads data, installs the storage handler, and creates a Hive table.

1. Install MongoDB: sudo yum install mongodb-org

You may need to set up the following mongodb.repo file in /etc/yum.repos.d:

[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1

2. Start mongodb: sudo service mongod start

3. Enter the mongo CLI by typing mongo

4. Type the following to add test data to db.testData (see http://docs.mongodb.org/manual/tutorial/generate-test-data/). MongoDB will implicitly create the database if it doesn't already exist. The loop inserts 25 records; raise the limit if you need more:

for (var i = 1; i <= 25; i++) { db.testData.insert( { x : i } ) }

5. To display the data type: db.testData.find()

6. For background on the Hadoop connector, see http://docs.mongodb.org/ecosystem/tutorial/getting-started-with-hadoop/

7. From /root, download the mongo-hadoop git repo: git clone https://github.com/mongodb/mongo-hadoop.git

8. Navigate to /root/mongo-hadoop and type ./gradlew jar

9. Place the following .jar files in /usr/lib/hadoop/lib and /usr/lib/hive/lib:

mongo-hadoop-core-1.4.0-SNAPSHOT.jar
mongo-hadoop-hive-1.4.0-SNAPSHOT.jar
mongo-hadoop-pig-1.4.0-SNAPSHOT.jar

10. Type hive on the command line to start the Hive shell

****Create Hive Table*****

CREATE EXTERNAL TABLE testdb ( id STRING, x INT )
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping' = '{"id":"_id", "x":"x"}')
TBLPROPERTIES('mongo.uri'='mongodb://127.0.0.1:27017/db.testData');

*** WARNING: If you leave out the EXTERNAL keyword, Hive will use the MongoDB collection as the primary source, and dropping the Hive table will remove the collection from MongoDB. ***

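11. You should now be able to see your MongoDB data by typing "SELECT * FROM testdb;". If the query returns no rows, check that the database name in mongo.uri matches the database you inserted into; the mongo shell uses the test database by default, in which case the URI would end in /test.testData.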
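
If the goal is to actually import the data into Hive rather than query it in place, one possible follow-up is to copy it into a native table. A table defined with the storage handler reads from MongoDB at query time, so no data files are written under the Hive warehouse directory for testdb itself. The sketch below assumes the testdb table from the steps above; the table name testdb_copy and the ORC format are my own choices, not part of the original walkthrough.

```sql
-- testdb_copy is a hypothetical name for a native, HDFS-backed Hive table.
-- ORC is just one reasonable storage format; any Hive file format would do.
CREATE TABLE testdb_copy STORED AS ORC AS
SELECT id, x
FROM testdb;

-- The Location field in the output shows where the copied data lives in HDFS.
DESCRIBE FORMATTED testdb_copy;
```

After this runs, the copied rows should appear under the warehouse path of testdb_copy, not testdb.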

Hope it helps!

Re: How to import data from MongoDB to Hive or Hbase ?

New Contributor

@Scott Shaw

I tried your example, but I can't find the table in HDFS:

http://localhost:50070/explorer.html#/user/hive/warehouse/testdb

It is still not there even after I removed EXTERNAL.

Re: How to import data from MongoDB to Hive or Hbase ?

Mentor

@HENI MAHER please open this as a new question and describe your problem in full.

Re: How to import data from MongoDB to Hive or Hbase ?

New Contributor

My question is related to Scott's answer.
