Created 06-08-2016 07:27 PM
Hi:
Can I use Sqoop to import into Hive in append mode? I am receiving the error below. Also, importing 2.8 GB took 2 hours and 35 minutes; is that normal?
16/06/08 21:23:40 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/06/08 21:23:40 DEBUG sqoop.Sqoop: Append mode for hive imports is not yet supported. Please remove the parameter --append-mode
Append mode for hive imports is not yet supported. Please remove the parameter --append-mode
    at org.apache.sqoop.tool.BaseSqoopTool.validateHiveOptions(BaseSqoopTool.java:1410)
    at org.apache.sqoop.tool.ImportTool.validateOptions(ImportTool.java:1130)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:138)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
Append mode for hive imports is not yet supported. Please remove the parameter --append-mode
And my script looks like this:
sqoop import -D oraoop.disabled=true --verbose \
  --connect jdbc:oracle:thin:@hostname:2521/CIP_BATCH \
  --username=U029550 \
  --password=Mayo2016 \
  --hive-import \
  --hive-table desercion_clientes_2 \
  --hive-overwrite \
  --query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS AND COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >= TO_DATE('2010-01-01', 'YYYY-MM-DD')" \
  --boundary-query "select min(ID_INTERNO_PE), max(ID_INTERNO_PE) from RDWC01.MI_CLTE_ECO_GEN" \
  --incremental append \
  --check-column MI_FECHA_FIN_MES \
  --last-value $LAST_ROW \
  --num-mappers 5 \
  --split-by ID_INTERNO_PE \
  --direct \
  --fetch-size 10000 \
  --target-dir /RSI/staging/tmp/desercion_clientes_2
Created 06-08-2016 08:03 PM
As the error message indicates, you can't append to a Hive table yet. The "--incremental append" option can be used for HDFS-based output tied to a temporary Hive table; you can then insert into your main Hive table from that temporary table. Here is a blog article describing the general approach: http://hortonworks.com/blog/four-step-strategy-incremental-updates-hive/
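A minimal sketch of that workaround, in case it helps: run the incremental import to HDFS only (dropping --hive-import, which is what conflicts with --incremental append), then merge the new rows into the main table from Hive. The temp table name, target directory, and the assumption that desercion_clientes_incr_tmp is an external table pointing at that directory are all hypothetical placeholders, not part of the original job:

```shell
# Step 1: incremental append to HDFS only. Note there is no --hive-import
# here, so --incremental append is accepted by Sqoop.
sqoop import \
  --connect jdbc:oracle:thin:@hostname:2521/CIP_BATCH \
  --username U029550 -P \
  --query "select * from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS" \
  --split-by ID_INTERNO_PE \
  --incremental append \
  --check-column MI_FECHA_FIN_MES \
  --last-value "$LAST_ROW" \
  --target-dir /RSI/staging/tmp/desercion_clientes_incr

# Step 2: merge the newly landed rows into the main Hive table, assuming
# desercion_clientes_incr_tmp is an external Hive table defined over the
# target directory above (a hypothetical name for illustration).
hive -e "INSERT INTO TABLE desercion_clientes_2
         SELECT * FROM desercion_clientes_incr_tmp;"
```

The linked blog post adds a reconciliation step on top of this when source rows can be updated rather than only appended.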
As for the performance, I would suggest that you look at Oracle while your query is running to determine if there are any bottlenecks there. I've been able to use Sqoop to pull hundreds of millions of records from Oracle in under a couple of hours easily. I've always found the bottleneck is on the Oracle side. You can try tweaking the --num-mappers value and --fetch-size to see if that improves your speeds.
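As an illustration of the tuning suggestion above, here is the shape of a re-run with the parallelism knobs changed. The values shown are examples to experiment with, not recommendations, and the query is elided for brevity:

```shell
# Example only: vary the parallelism knobs and compare wall-clock times.
# --num-mappers sets how many parallel JDBC connections hit Oracle;
# --fetch-size sets rows pulled per round trip; --split-by should name a
# column with evenly distributed values, or a few mappers do all the work.
sqoop import \
  --connect jdbc:oracle:thin:@hostname:2521/CIP_BATCH \
  --username U029550 -P \
  --query "..." \
  --split-by ID_INTERNO_PE \
  --num-mappers 10 \
  --fetch-size 50000 \
  --target-dir /RSI/staging/tmp/desercion_clientes_2
```

While it runs, watch the Oracle session activity to see whether the mappers are actually running in parallel or queuing behind the database.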
Created 06-08-2016 07:53 PM
Here is the link for more info on your issue.
Created 06-08-2016 08:07 PM
This approach definitely works and is much easier to follow.