Community Articles

Find and share helpful community-sourced technical articles.
Labels (2)
avatar
New Contributor

To duplicate table and merge partitions from TWO PARTITIONS (date VAR /string VAR) to ONE PARTITION (STRING var): 

  1. 1 STEP: create new table with just one partition variable:
    -------CREATE TABLE `caz_bb96_cdnbank.jg4_CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727`
    CREATE TABLE ${hivevar:CON_DB}.CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT2
    (
    `acct_num` string,
    `app_num` int,
    `host_nm` string,
    ---.....
    `payment_extension` string,
    `final_ltv` string,
    `cbb_used_vehcl_val_rrough_amt` double, `eff_dt` date)
    PARTITIONED BY (`acct_tp` string)
  2. Populate having conditions for PARTITION var FIRST and then for the other PARTITION variable to avoid shifted variables

    E.g.:
    set hive.support.quoted.identifiers=none;
    set hive.exec.dynamic.partition.mode=nonstrict;
    INSERT OVERWRITE table `caz_bb96_cdnbank.jg4_CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220730` partition (acct_tp)
    select
    acct_num
    ,app_num
    --.....
    FROM edaf.CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT cll
    WHERE eff_dt IN
    (SELECT MAX(eff_dt) FROM edaf.CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT where eff_dt<=date_add(current_date(),120));

    E.g. 2: (not better for repeating overwrite):
    set hive.support.quoted.identifiers=none;
    set hive.exec.dynamic.partition.mode=nonstrict;
    INSERT OVERWRITE TABLE caz_bb96_cdnbank.jg4_CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 PARTITION (acct_tp='PRIME LOAN')
    SELECT `(acct_tp)?+.+` FROM caz_bb96_cdnbank.jg_2CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 WHERE acct_tp='PRIME LOAN' AND
    eff_dt=cast(last_day(add_months(current_date, -1)) as date);

    INSERT OVERWRITE TABLE caz_bb96_cdnbank.jg4_CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 PARTITION (acct_tp='CDLSI LEASE')
    SELECT `(acct_tp)?+.+` FROM caz_bb96_cdnbank.jg_2CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 WHERE acct_tp='CDLSI LEASE' AND
    eff_dt=cast(last_day(add_months(current_date, -1)) as date);

    INSERT OVERWRITE TABLE caz_bb96_cdnbank.jg4_CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 PARTITION (acct_tp='SDA LOAN')
    SELECT `(acct_tp)?+.+` FROM caz_bb96_cdnbank.jg_2CONSOL_LOAN_LSE_ACCT_MTH_SNAPSHOT_20220727 WHERE acct_tp='SDA LOAN' AND
    eff_dt=cast(last_day(add_months(current_date, -1)) as date);

--IF YOU NOT DID THE INSTRUCTIONS:

--WOULD GET THESE Commons errors:
--Error: Error while compiling statement: FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different 'acct_tp': Table insclause-0 has 371 columns, but query has 372 columns. (state=42000,code=10044)

Disclaimer: This article is contributed by an external user. The steps may not be verified by Cloudera and may not be applicable for all use cases and may be very specific to a particular distribution. Please follow with caution and at your own risk. If needed, raise a support case to get confirmation.

1,015 Views
0 Kudos