Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

Sqoop Doubt

Solved Go to solution

Sqoop Doubt

Explorer

Hi ,

 

 

Suppose if We have a data in MySQL db in STUDENT table ,  we have imported the data in HDFS. there is no predefined Hive table. Data is tab delimited text files.How the data can be make available to Hive Table.(Assume that not delimited ).

 

Please help me ...Iam confused with sqoop commands, or is it possible to do with creat external table using hive -e" create external table....", or do we have to use sqoop hive import commnad??

 

 

Waiting for the reply..

Please help me

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Sqoop Doubt

Explorer
Okkk thanks Harish... :)
10 REPLIES 10

Re: Sqoop Doubt

Master Guru
You have a couple of options at least:

0. Import from Sqoop directly into the required file format and/or table,
instead of just delimited text. Sqoop supports Text, Sequence, Avro and
Parquet formats.

2. Import into a new (temporary) table thats created with the delimited
format specifiers. Load the data into this table via LOAD DATA LOCAL INPATH
statement, then use INSERT INTO statement to move the data into the
original table that uses a different file format.

Re: Sqoop Doubt

Explorer
Hi Harish,

Thanks for the reply & your valuable time spent for my question..

Actually what I got confused is ..if the data is already in HDFS, then what is the need of sqoop?? We can create external table & make that data available .right?? Please correct me if i am wrong.

Re: Sqoop Doubt

Master Guru
Sqoop should be used if you have to import or export data from/to RDBMSs
like MySQL/Oracle/etc.; It will allow you to grab data from there and load
it into HDFS.

If you already have your data in HDFS, then Sqoop may not be what you are
looking for (unless you want to write data back to RDBMSs).

Re: Sqoop Doubt

Explorer
Hi Harish,

So right answer will be the creating external table right?


Thanks
Priyap

Re: Sqoop Doubt

Master Guru
That sounds about right!

Re: Sqoop Doubt

Explorer
Ok Thanks Harish for your valuable help.

I have few more doubts.

I have been searching for the difference between 2 sqoop commnads, Please anyone tell me when we will use each commands, what is the significance of them , Under what situation we will use?


1) sqoop-import --connect jdbc:mysql://localhost:3306/db1 -username root -password password --table tableName --hive-table tableName --create-hive-table --hive-import --hive-home path/to/hive_home

2) sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password> --hive-import


What is the significance of using --create-hive-table in 1st command. Even if its not there it will import the data in to hive using"--hive-import"coamnd?. Then what is the use of that? In second command we are doing the same, but no --create-hive-table. why?.

Pelase help me..


Thanks a lot..

Re: Sqoop Doubt

Master Guru
Checkout the Sqoop User Guide documentation at
http://archive.cloudera.com/cdh5/cdh/5/sqoop/SqoopUserGuide.html:

| --create-hive-table | If set, then the job will fail if the target hive
table exits. By default this property is false. |

Re: Sqoop Doubt

Explorer
Thanks Harish.

I have seen the same in the site..

Its telling it is used to create hive table, if there is existing table job will fail.Right?

So my doubt is if we don't use --create-hove-table in the 1st command, it should create a hive table right using "--hive-import"? So what is the significance of using --create-hive-table command??

Please correct if I am wrong.


Thanks for your help Harish..

Re: Sqoop Doubt

Master Guru
The option is useful to ensure/assert you don't already have an existing
table and the data should not be appended into an existing table.

This is required in some workflows that load staging tables, for example