Sqoop Import - Now what

New Member

Hi,

I wondered if someone could point me in the right direction.

I've imported some data (4 rows) into HDFS via Sqoop using the command below:

sqoop import --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" --table policy --target-dir "/user/maria_dev/data/SQLImport"

This worked correctly and gave me 5 files:

part-m-00000

part-m-00001

part-m-00002

part-m-00003

and _SUCCESS

Given there are 4 rows in my table and 4 part files, I assume each part file holds one row, plus the success file.

Where I need some help is with:

1. Understanding what these are. Are they Avro files?

2. How can I create a Hive 'table' over the top of these, like I would by using the upload button in Ambari, so that the data is accessible to Hive queries?

Any help or pointers would be massively appreciated.

Thanks,

Nic

1 ACCEPTED SOLUTION

Expert Contributor

@Nic Hopper You can import the table directly into Hive with --hive-import:

sqoop import --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" --table policy --warehouse-dir "/user/maria_dev/data/SQLImport" --hive-import --hive-overwrite

This creates the Hive table and writes the data into it (it is generally a managed table, so the data is finally moved to the directory set by hive.metastore.warehouse.dir).
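As a minimal sketch of the same import with the Hive table named explicitly, something like the following could be used. The `run` helper and the `SQOOP_RUN` switch are hypothetical conveniences for this example (they print the command for review instead of executing it), and the `--hive-table policy` and `-m 1` choices are assumptions rather than part of the original answer. Sqoop runs four map tasks by default, which is why the earlier import produced four part files; with only 4 rows, a single mapper is likely enough. Unless a format such as `--as-avrodatafile` is requested, the part files are delimited text, not Avro.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless SQOOP_RUN=1 is set.
# SQOOP_RUN is a hypothetical switch for this sketch, not a Sqoop setting.
run() {
  if [ "${SQOOP_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

# Import the policy table straight into a Hive table of the same name.
# The connection string placeholders are kept exactly as in the thread.
import_policy_to_hive() {
  run sqoop import \
    --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" \
    --table policy \
    --hive-import \
    --hive-overwrite \
    --hive-table policy \
    -m 1
}

import_policy_to_hive
```

Run as-is, this only prints the assembled command (without shell quoting); setting `SQOOP_RUN=1` on a machine with Sqoop installed would execute it.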

4 REPLIES

Rising Star

@Nic Hopper

As @icocio points out, you can simply use Sqoop to fetch the data and write it to a Hive table directly.

New Member

Hi,

Thank you all for the responses. It works as expected. I do have another question though, more a request for advice than anything.

I can now import data from SQL Server into Hive, but if I want to apply business logic to my data, how would you recommend doing this? Only something simple for now.

Should I:

1. Do it in the import, by importing from a query rather than a table and putting the logic in that query.

2. Import it into Hive as I have done and then apply the logic there.

3. Do something else.

Any pointers would be appreciated.

Thanks,

Nic.
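For option 1 in the post above, a rough sketch of a free-form query import might look like this. The column names (`policy_id`, `premium`), the filter, the target directory, and the Hive table name `policy_filtered` are all hypothetical, and the `run` helper and `SQOOP_RUN` switch are conveniences for this example that print the command instead of executing it. With `--query`, Sqoop expects the literal token `$CONDITIONS` in the WHERE clause and an explicit `--target-dir`; a single mapper (`-m 1`) avoids the need for `--split-by`.

```shell
#!/bin/sh
# Dry-run helper: prints the command unless SQOOP_RUN=1 is set.
# SQOOP_RUN is a hypothetical switch for this sketch, not a Sqoop setting.
run() {
  if [ "${SQOOP_RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s ' "$@"
    printf '\n'
  fi
}

# Import the result of a query instead of a whole table.
# Single quotes keep $CONDITIONS literal so Sqoop can substitute it.
# Column names and the premium filter are hypothetical.
import_filtered_policy() {
  run sqoop import \
    --connect "jdbc:sqlserver://ipaddress:port;database=dbname;user=username;password=userpassword" \
    --query 'SELECT policy_id, premium FROM policy WHERE premium > 0 AND $CONDITIONS' \
    --target-dir "/user/maria_dev/data/SQLImportFiltered" \
    --hive-import \
    --hive-table policy_filtered \
    -m 1
}

import_filtered_policy
```

Keeping the query simple and doing heavier transformations in Hive afterwards (option 2) is the other common split; which one fits better depends on how much of the logic the source database can express cheaply.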