
Sqoop is used to import data from multiple sources into HDFS. One of the most common use cases is importing directly into Hive:

sqoop import \
  --connect jdbc:mysql://localhost:3306/sqoop \
  --username root \
  -P \
  --split-by id \
  --columns id,name \
  --table customer \
  --target-dir /user/cloudera/ingest/raw/customers \
  --fields-terminated-by "," \
  --hive-import \
  --create-hive-table \
  --hive-table sqoop_workspace.customers

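To run the Sqoop job in a specific queue, add -Dmapred.job.queuename=queueName immediately after the import keyword. It is a generic Hadoop argument, and Sqoop requires generic arguments to come after the tool name but before any tool-specific arguments such as --connect.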

sqoop import -Dmapred.job.queuename=queueName \
  --connect jdbc:mysql://localhost:3306/sqoop \
  --username root \
  -P \
  --split-by id \
  --columns id,name \
  --table customer \
  --target-dir /user/cloudera/ingest/raw/customers \
  --fields-terminated-by "," \
  --hive-import \
  --create-hive-table \
  --hive-table sqoop_workspace.customers

This launches the Sqoop job in the specified queue, but the Hive job is still launched in the default queue.
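
As a minimal sketch of the placement rule, the wrapper below always puts the queue option directly after the import keyword and passes every other argument through unchanged. The function name sqoop_import_in_queue and the DRY_RUN variable are hypothetical helpers introduced here for illustration, not part of Sqoop. With DRY_RUN left at its default of 1, the command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): runs "sqoop import" with the queue
# option placed directly after the "import" keyword.
sqoop_import_in_queue() {
  # First argument: queue for the Sqoop (MapReduce) job.
  queue_opt="-Dmapred.job.queuename=$1"
  shift
  # Generic -D option first, then the caller's tool-specific arguments.
  set -- sqoop import "$queue_opt" "$@"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (default): show the command instead of executing it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Prints the resulting command line; set DRY_RUN=0 to actually run it.
sqoop_import_in_queue queueName \
  --connect jdbc:mysql://localhost:3306/sqoop \
  --username root -P \
  --table customer \
  --hive-import \
  --hive-table sqoop_workspace.customers
```

Building the argument list with set -- keeps each argument intact when the command is executed, so values containing spaces or commas are passed to Sqoop as given.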



To launch the Hive job in a specific queue:

Make a copy of tez-site.xml and, in the copy, set the tez.queue.name property to the queue in which the Hive job should run.

Property to set in the copied tez-site.xml:

<property>
  <name>tez.queue.name</name>
  <value>customQueueName</value>
</property>

export HIVE_CONF_DIR=<path of the directory where the custom tez-site.xml is placed>

Run the Sqoop job in the same shell session, after the export statement has been executed.
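
The copy-and-edit step above could be scripted roughly as follows. This is a sketch under stated assumptions: the function name make_hive_queue_conf and all paths are hypothetical, each <name> and <value> element is assumed to sit on a single line, and the queue name is assumed to contain no "&" or "\" characters. It copies only tez-site.xml, as the article describes; whether the target directory also needs other Hive client configuration files depends on the installation and is not covered here.

```shell
#!/bin/sh
# Sketch only: the function name and all paths are illustrative assumptions.

# Copy an existing tez-site.xml into dest_dir and set tez.queue.name to queue.
make_hive_queue_conf() {
  src_tez_site="$1"   # existing tez-site.xml to copy
  dest_dir="$2"       # directory that HIVE_CONF_DIR will point to
  queue="$3"          # queue the Hive job should run in

  mkdir -p "$dest_dir" || return 1
  awk -v queue="$queue" '
    # Note when the tez.queue.name property has been reached.
    /<name>[ \t]*tez\.queue\.name[ \t]*<\/name>/ { in_prop = 1 }
    # Rewrite the value that belongs to that property.
    in_prop && /<value>[^<]*<\/value>/ {
      sub(/<value>[^<]*<\/value>/, "<value>" queue "</value>")
      in_prop = 0
      found = 1
    }
    # If the property was never seen, add it before the closing tag.
    /<\/configuration>/ && !found {
      print "  <property>"
      print "    <name>tez.queue.name</name>"
      print "    <value>" queue "</value>"
      print "  </property>"
    }
    { print }
  ' "$src_tez_site" > "$dest_dir/tez-site.xml"
}

# Example usage (placeholder paths, adjust to the cluster):
#   make_hive_queue_conf /path/to/existing/tez-site.xml "$HOME/hive-queue-conf" queueName
#   export HIVE_CONF_DIR="$HOME/hive-queue-conf"
#   then run the sqoop import command in the same shell session
```

Editing a copy rather than the cluster-wide file keeps the queue override limited to shell sessions that export HIVE_CONF_DIR.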


Remember to use both settings: -Dmapred.job.queuename=queueName (immediately after import) sets the queue for the Sqoop job, and the custom tez-site.xml sets the queue for the Hive job.
