Created 06-11-2018 01:18 PM
Hi,
The Sqoop arguments hive-import and create-hive-table both seem to create the table and import the data. So when should I use hive-import and when create-hive-table? Is there a specific purpose for each of these arguments?
Created 06-11-2018 04:32 PM
Hey @heta desai !
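There are two things called create-hive-table. As a standalone tool (sqoop create-hive-table), it only creates the table definition in Hive and, as the name says, it doesn't import any data. As an option on an import (--create-hive-table), it makes the job fail if the target Hive table already exists.
hive-import does both: it generates the DDL to create the table and imports the data.
By default, the --create-hive-table option is set to false.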
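To make the difference concrete, here is a rough sketch of the three invocations as I understand them from the user guide. The JDBC URL, user name, and table names are made-up placeholders, and the script only prints the command lines (a dry run) instead of calling Sqoop, so treat it as an illustration rather than something to paste into a real job.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints Sqoop command lines instead of executing them.
# CONNECT, DB_USER, SRC_TABLE and HIVE_TABLE are hypothetical placeholders.
CONNECT="jdbc:mysql://db.example.com/sales"
DB_USER="etl_user"
SRC_TABLE="orders"
HIVE_TABLE="orders_hive"

# Options shared by all three invocations (-P prompts for the password).
COMMON="--connect $CONNECT --username $DB_USER -P --table $SRC_TABLE --hive-table $HIVE_TABLE"

# 1) Standalone tool: creates only the Hive table definition, no data.
definition_only() { echo "sqoop create-hive-table $COMMON"; }

# 2) Import with --hive-import: creates the table and loads the data.
import_to_hive() { echo "sqoop import $COMMON --hive-import"; }

# 3) Same import, but the job fails if the Hive table already exists.
import_strict() { echo "sqoop import $COMMON --hive-import --create-hive-table"; }

definition_only
import_to_hive
import_strict
```

Variant 2 is the usual choice for a recurring job; variant 3 is a guard for a first load where an existing table would be a surprise.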
If we look at the Apache Sqoop source on GitHub:
https://github.com/apache/sqoop/blob/3233db8e1c481e38c538f4caaf55bcbc0c11f208/src/java/org/apache/sq...
/**
 * @param explicitHiveImport true if the user has an explicit --hive-import
 * available, or false if this is implied by the tool.
 * @return options governing interaction with Hive
 */
protected RelatedOptions getHiveOptions(boolean explicitHiveImport) {
  RelatedOptions hiveOpts = new RelatedOptions("Hive arguments");
  if (explicitHiveImport) {
    hiveOpts.addOption(OptionBuilder
        .withDescription("Import tables into Hive "
            + "(Uses Hive's default delimiters if none are set.)")
        .withLongOpt(HIVE_IMPORT_ARG)
        .create());
  }

  hiveOpts.addOption(OptionBuilder.withArgName("dir")
      .hasArg().withDescription("Override $HIVE_HOME")
      .withLongOpt(HIVE_HOME_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .withDescription("Overwrite existing data in the Hive table")
      .withLongOpt(HIVE_OVERWRITE_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .withDescription("Fail if the target hive table exists")
      .withLongOpt(CREATE_HIVE_TABLE_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder.withArgName("table-name")
      .hasArg()
      .withDescription("Sets the table name to use when importing to hive")
      .withLongOpt(HIVE_TABLE_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder.withArgName("database-name")
      .hasArg()
      .withDescription("Sets the database name to use when importing to hive")
      .withLongOpt(HIVE_DATABASE_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .withDescription("Drop Hive record \\0x01 and row delimiters "
          + "(\\n\\r) from imported string fields")
      .withLongOpt(HIVE_DROP_DELIMS_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .hasArg()
      .withDescription("Replace Hive record \\0x01 and row delimiters "
          + "(\\n\\r) from imported string fields with user-defined string")
      .withLongOpt(HIVE_DELIMS_REPLACEMENT_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder.withArgName("partition-key")
      .hasArg()
      .withDescription("Sets the partition key to use when importing to hive")
      .withLongOpt(HIVE_PARTITION_KEY_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder.withArgName("partition-value")
      .hasArg()
      .withDescription("Sets the partition value to use when importing "
          + "to hive")
      .withLongOpt(HIVE_PARTITION_VALUE_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder.withArgName("hdfs path")
      .hasArg()
      .withDescription("Sets where the external table is in HDFS")
      .withLongOpt(HIVE_EXTERNAL_TABLE_LOCATION_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .hasArg()
      .withDescription("Override mapping for specific column to hive"
          + " types.")
      .withLongOpt(MAP_COLUMN_HIVE)
      .create());
  hiveOpts.addOption(OptionBuilder
      .hasArg()
      .withDescription("The URL to the HiveServer2.")
      .withLongOpt(HS2_URL_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .hasArg()
      .withDescription("The user/principal for HiveServer2.")
      .withLongOpt(HS2_USER_ARG)
      .create());
  hiveOpts.addOption(OptionBuilder
      .hasArg()
      .withDescription("The location of the keytab of the HiveServer2 user.")
      .withLongOpt(HS2_KEYTAB_ARG)
      .create());

  return hiveOpts;
}
Or in the Sqoop User Guide:
https://sqoop.apache.org/docs/1.4.2/SqoopUserGuide.html#_importing_data_into_hive
PS: I mostly use the hive-import parameter in my Sqoop jobs 🙂
Hope this helps!
Created 06-12-2018 06:02 AM
So we can use create-hive-table and hive-import as alternatives, right?
Created 06-12-2018 02:23 PM
For creating tables, yes, but to import data you need hive-import.
Hope this helps! 🙂