Created 06-07-2016 11:15 AM
Hi:
after run this script, the map reduce doesnt finish, here the script and the log:
sqoop import -D oraoop.disabled=true --connect jdbc:oracle:thin:@HOSTNAME:2521/CIP_BATCH --username=XXXXXX --password=XXXXXX --table RDWC01.MI_CLTE_ECO_GEN \ --columns "MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H" \ --where "COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \ --target-dir=/RSI/datalake/desercion/2016/1 --verbose --split-by COD_NRBE_EN
16/06/07 13:11:02 WARN db.TextSplitter: Generating splits for a textual index column. 16/06/07 13:11:02 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records. 16/06/07 13:11:02 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column. 16/06/07 13:11:02 WARN db.BigDecimalSplitter: Set BigDecimal splitSize to MIN_INCREMENT 16/06/07 13:11:02 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound 'COD_NRBE_EN >= '3159'' and upper bound 'COD_NRBE_EN <= '3159'' 16/06/07 13:11:02 INFO mapreduce.JobSubmitter: number of splits:1 16/06/07 13:11:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1464163049638_0629 16/06/07 13:11:03 INFO impl.YarnClientImpl: Submitted application application_1464163049638_0629 16/06/07 13:11:03 INFO mapreduce.Job: The url to track the job: http://lnxbig05.cajarural.gcr:8088/proxy/application_1464163049638_0629/ 16/06/07 13:11:03 INFO mapreduce.Job: Running job: job_1464163049638_0629 16/06/07 13:11:12 INFO mapreduce.Job: Job job_1464163049638_0629 running in uber mode : false 16/06/07 13:11:12 INFO mapreduce.Job: map 0% reduce 0%
any suggestions???
Thanks
Created 06-08-2016 08:00 AM
Hi:
its ok now, its work with this;
--query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS AND COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \
NoteIf you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"
Many thanks all of you.
Created 06-07-2016 06:03 PM
yes you can. Also, to investigate why it is using only one mapper, can you check your mapper size and size of that you are trying to import? How did you calculate your split size?
Created 06-07-2016 06:10 PM
hi:
I am trying with default values fisrt, i am spliting by id_person, that is integer but look the message:
16/06/07 20:12:03 WARN manager.OracleManager: The table RDWC01.MI_CLTE_ECO_GEN contains a multi-column primary key. Sqoop will default to the column COD_NRBE_EN only for this job. 16/06/07 20:12:03 INFO manager.OracleManager: Time zone has been set to GMT 16/06/07 20:12:03 WARN manager.OracleManager: The table RDWC01.MI_CLTE_ECO_GEN contains a multi-column primary key. Sqoop will default to the column COD_NRBE_EN only for this job.
Created 06-07-2016 06:40 PM
Is id_person part of primary key? If it is part of primary_key, it won't work.
Created 06-07-2016 06:45 PM
yes, id_person is part of the primary key....so i need to look another colum that is not a primary key and also integer¿
thanks
Created 06-07-2016 07:08 PM
yes, you will have to choose some other column.
Created 06-07-2016 07:08 PM
yes, you will have to choose some other column.
Created 06-07-2016 08:02 PM
Thanks,still doesnt split, I dont know why, is necesary to use $CONDITIONS???
Created 06-08-2016 07:53 AM
Hi:
why my $CONDITIONS doesnt work??
16/06/08 09:52:53 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Query [select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD') ] must contain '$CONDITIONS' in WHERE clause. at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:300) at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845) at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645) at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107) at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478) at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605) at org.apache.sqoop.Sqoop.run(Sqoop.java:148) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226) at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235) at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
sqoop import -D oraoop.disabled=true --verbose \ --connect jdbc:oracle:thin:@hostname:2521/CIP_BATCH \ --username=U029550 \ --password=Mayo2016 \ --query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD') $CONDITIONS" \ --boundary-query "select min(MI_NUM_TOT_AC_ACT), max(MI_NUM_TOT_AC_ACT) from RDWC01.MI_CLTE_ECO_GEN" \ --split-by MI_NUM_TOT_AC_ACT \ --direct \ --target-dir=/RSI/datalake/desercion/2016/1 \
Created 06-08-2016 08:00 AM
Hi:
its ok now, its work with this;
--query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS AND COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \
NoteIf you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"
Many thanks all of you.