Sqoop Import - Now what

Hi,

I wondered if someone could point me in the right direction.

I've imported some data (4 rows) into HDFS via Sqoop using the command below:

sqoop import --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" --table policy --target-dir "/user/maria_dev/data/SQLImport"

This worked correctly and gave me 5 files:

part-m-00000
part-m-00001
part-m-00002
part-m-00003
_SUCCESS

Given there are 4 rows in my table and 4 part files, I assume each part file holds one row, plus the _SUCCESS file.

Where I need some help is with:

1. Understanding what these files are. Are they Avro files?

2. How can I create a Hive table over the top of these, like I would by using the upload button in Ambari, so that they are accessible to Hive queries?

Any help or pointers would be massively appreciated.

Thanks,

Nic

1 ACCEPTED SOLUTION

@Nic Hopper You can import the table directly into Hive with --hive-import:

sqoop import --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" --table policy --warehouse-dir "/user/maria_dev/data/SQLImport" --hive-import --hive-overwrite

This creates the Hive table and loads the data into it. The result is generally a managed table, so the imported files are finally moved into the Hive warehouse directory (hive.metastore.warehouse.dir).
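If the files from the first import are already sitting in HDFS, another route is to leave them where they are and define an external Hive table over that directory. By default Sqoop writes comma-delimited text files rather than Avro, and it produces one part-m file per map task (four by default) rather than one per row. The sketch below assumes that default text format. The table name policy_ext and the two columns are hypothetical placeholders, since the real schema of policy is not shown in this thread.

```shell
#!/bin/sh
# Directory the original sqoop import wrote to (taken from the question).
IMPORT_DIR="/user/maria_dev/data/SQLImport"

# Hypothetical table name and columns: replace them with the real
# column list of the policy table, in the same order as in SQL Server.
DDL=$(cat <<EOF
CREATE EXTERNAL TABLE IF NOT EXISTS policy_ext (
  policy_id INT,
  holder_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '${IMPORT_DIR}';
EOF
)

# Show the statement so it can be reviewed or pasted into a Hive client.
echo "$DDL"

# Only touch the cluster when the client tools are actually installed.
if command -v hdfs >/dev/null 2>&1; then
  # Peek at one part file; it should be plain comma-separated text.
  hdfs dfs -cat "${IMPORT_DIR}/part-m-00000" | head -n 5
fi
if command -v hive >/dev/null 2>&1; then
  hive -e "$DDL"
fi
```

Hive skips files whose names start with an underscore, so the _SUCCESS marker in the directory does not show up as data. Because the table is external, dropping it later leaves the imported files in place.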

@Nic Hopper

As @icocio points out, you can simply use Sqoop to fetch the data and write it to a Hive table directly.

Hi,

Thank you all for the responses. It works as expected. I do have another question though, more a request for advice than anything.

I can now import data from SQL Server into Hive, but if I want to apply business logic to my data, how am I best doing this? Only something simple for now, I think.

Should I do it:

1. In the import, by querying the data rather than importing a whole table, and putting the logic in that query.

2. After importing it into Hive as I have done, by applying the logic there.

3. Do something else.

Any pointers would be appreciated.

Thanks,

Nic.
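For reference, option 1 could look roughly like the sketch below, which uses Sqoop's free-form --query import in place of --table. The filter on status, the column names, the target directory SQLImportFiltered, and the Hive table name policy_active are all hypothetical, chosen only to illustrate the shape of the command. Sqoop requires the literal token $CONDITIONS in the WHERE clause of a free-form query, plus either --split-by or a single mapper.

```shell
#!/bin/sh
# Connection string as used earlier in the thread (placeholders unchanged).
CONNECT="jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword"

# Hypothetical business rule and column names; adjust to the real policy table.
# \$CONDITIONS must reach sqoop literally: sqoop replaces it with each
# map task's split predicate.
QUERY="SELECT policy_id, premium FROM policy WHERE status = 'ACTIVE' AND \$CONDITIONS"

# Show the query that would be sent to SQL Server.
echo "$QUERY"

# Run the import only where the sqoop client is installed.
if command -v sqoop >/dev/null 2>&1; then
  sqoop import \
    --connect "$CONNECT" \
    --query "$QUERY" \
    --split-by policy_id \
    --target-dir "/user/maria_dev/data/SQLImportFiltered" \
    --hive-import --hive-table policy_active --hive-overwrite
fi
```

Option 2 keeps the import as a plain copy of the source table and puts the logic in Hive instead, for example as a view or a CREATE TABLE ... AS SELECT over the imported table. That tends to be easier to change later, since the rule can be rewritten without re-running the import.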