Sqoop is used for importing data from multiple sources into HDFS. One of the most common use cases is a Hive import:
sqoop import --connect jdbc:mysql://localhost:3306/sqoop --username root -P --split-by id --columns id,name --table customer --target-dir /user/cloudera/ingest/raw/customers --fields-terminated-by "," --hive-import --create-hive-table --hive-table sqoop_workspace.customers
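To confirm the import worked, you can query the new table from the Hive CLI (this assumes the hive client is available on the node where you run it):
hive -e "SELECT * FROM sqoop_workspace.customers LIMIT 10;"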
If you want the Sqoop job to run in a specific queue, -Dmapred.job.queuename=queueName needs to be added immediately after the import keyword:
sqoop import -Dmapred.job.queuename=queueName --connect jdbc:mysql://localhost:3306/sqoop --username root -P --split-by id --columns id,name --table customer --target-dir /user/cloudera/ingest/raw/customers --fields-terminated-by "," --hive-import --create-hive-table --hive-table sqoop_workspace.customers
This launches the Sqoop job in the specified queue, but the Hive job is still launched in the default queue. To launch the Hive job in a specific queue, make a copy of tez-site.xml and set the queue name property to the queue you want the Hive job to run in (a shell sketch of this setup follows the property snippet below).
Property to set in tez-site.xml:
<property>
  <name>tez.queue.name</name>
  <value>customQueueName</value>
</property>
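A minimal sketch of preparing the custom configuration, assuming the cluster's tez-site.xml lives under /etc/tez/conf and using a hypothetical directory /home/user/custom_hive_conf:
mkdir -p /home/user/custom_hive_conf
cp /etc/tez/conf/tez-site.xml /home/user/custom_hive_conf/
# edit /home/user/custom_hive_conf/tez-site.xml and set tez.queue.name to the desired queue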
Then export HIVE_CONF_DIR so Hive picks up the custom configuration:
export HIVE_CONF_DIR=<directory where the custom tez-site.xml is placed>
Run the Sqoop job in the same session where the export statement was executed. Do remember to add -Dmapred.job.queuename=queueName (immediately after import) to set the Sqoop queue name; the custom tez-site.xml controls the Hive queue name.
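Putting it all together, a complete run might look like the following (a sketch, assuming the custom tez-site.xml was copied to the hypothetical directory /home/user/custom_hive_conf and that your target queue is named queueName):
export HIVE_CONF_DIR=/home/user/custom_hive_conf
sqoop import -Dmapred.job.queuename=queueName --connect jdbc:mysql://localhost:3306/sqoop --username root -P --split-by id --columns id,name --table customer --target-dir /user/cloudera/ingest/raw/customers --fields-terminated-by "," --hive-import --create-hive-table --hive-table sqoop_workspace.customers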