Support Questions

Find answers, ask questions, and share your expertise

sqoop import

avatar
Master Collaborator

Hi:

after run this script, the map reduce doesnt finish, here the script and the log:

sqoop import -D oraoop.disabled=true  --connect jdbc:oracle:thin:@HOSTNAME:2521/CIP_BATCH --username=XXXXXX --password=XXXXXX --table RDWC01.MI_CLTE_ECO_GEN \
--columns "MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H" \
--where "COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \
--target-dir=/RSI/datalake/desercion/2016/1 --verbose --split-by COD_NRBE_EN
16/06/07 13:11:02 WARN db.TextSplitter: Generating splits for a textual index column.
16/06/07 13:11:02 WARN db.TextSplitter: If your database sorts in a case-insensitive order, this may result in a partial import or duplicate records.
16/06/07 13:11:02 WARN db.TextSplitter: You are strongly encouraged to choose an integral split column.
16/06/07 13:11:02 WARN db.BigDecimalSplitter: Set BigDecimal splitSize to MIN_INCREMENT
16/06/07 13:11:02 DEBUG db.DataDrivenDBInputFormat: Creating input split with lower bound 'COD_NRBE_EN >= '3159'' and upper bound 'COD_NRBE_EN <= '3159''
16/06/07 13:11:02 INFO mapreduce.JobSubmitter: number of splits:1
16/06/07 13:11:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1464163049638_0629
16/06/07 13:11:03 INFO impl.YarnClientImpl: Submitted application application_1464163049638_0629
16/06/07 13:11:03 INFO mapreduce.Job: The url to track the job: http://lnxbig05.cajarural.gcr:8088/proxy/application_1464163049638_0629/
16/06/07 13:11:03 INFO mapreduce.Job: Running job: job_1464163049638_0629
16/06/07 13:11:12 INFO mapreduce.Job: Job job_1464163049638_0629 running in uber mode : false
16/06/07 13:11:12 INFO mapreduce.Job:  map 0% reduce 0%

any suggestions???

Thanks

1 ACCEPTED SOLUTION

avatar
Master Collaborator

Hi:

its ok now, its work with this;

--query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS AND COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \


NoteIf you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"

Many thanks all of you.

View solution in original post

18 REPLIES 18

avatar
Expert Contributor

yes you can. Also, to investigate why it is using only one mapper, can you check your mapper size and size of that you are trying to import? How did you calculate your split size?

avatar
Master Collaborator

hi:

I am trying with default values fisrt, i am spliting by id_person, that is integer but look the message:

16/06/07 20:12:03 WARN manager.OracleManager: The table RDWC01.MI_CLTE_ECO_GEN contains a multi-column primary key. Sqoop will default to the column COD_NRBE_EN only for this job.
16/06/07 20:12:03 INFO manager.OracleManager: Time zone has been set to GMT
16/06/07 20:12:03 WARN manager.OracleManager: The table RDWC01.MI_CLTE_ECO_GEN contains a multi-column primary key. Sqoop will default to the column COD_NRBE_EN only for this job.

avatar
Expert Contributor

Is id_person part of primary key? If it is part of primary_key, it won't work.

avatar
Master Collaborator

yes, id_person is part of the primary key....so i need to look another colum that is not a primary key and also integer¿

thanks

avatar
Expert Contributor

yes, you will have to choose some other column.

avatar
Expert Contributor

yes, you will have to choose some other column.

avatar
Master Collaborator

Thanks,still doesnt split, I dont know why, is necesary to use $CONDITIONS???

avatar
Master Collaborator

Hi:

why my $CONDITIONS doesnt work??

16/06/08 09:52:53 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Query [select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD') ] must contain '$CONDITIONS' in WHERE clause.
        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:300)
        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1845)
        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1645)
        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:107)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:478)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:244)


sqoop import -D oraoop.disabled=true --verbose \
--connect jdbc:oracle:thin:@hostname:2521/CIP_BATCH  \
--username=U029550 \
--password=Mayo2016 \
--query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD') $CONDITIONS" \
--boundary-query "select min(MI_NUM_TOT_AC_ACT), max(MI_NUM_TOT_AC_ACT) from RDWC01.MI_CLTE_ECO_GEN" \
--split-by MI_NUM_TOT_AC_ACT \
--direct \
--target-dir=/RSI/datalake/desercion/2016/1 \


avatar
Master Collaborator

Hi:

its ok now, its work with this;

--query "select ID_INTERNO_PE,MI_FECHA_FIN_MES,COD_NRBE_EN,COD_LINEA,ID_GRP_PD,MI_NUM_TOT_AC_ACT,MI_NUM_AC_SUS,MI_SDO_AC_P,MI_NUM_AC_P,MI_DIA_AC_P,MI_INT_DEV_ACR_D,MI_INT_DEV_DEU_D,MI_COMIS_APL_D,MI_TOT_MOV_D,MI_TOT_MOV_H,MI_TOT_IMP_MOV_D,MI_TOT_IMP_MOV_H from RDWC01.MI_CLTE_ECO_GEN where \$CONDITIONS AND COD_RL_PERS_AC = 01 AND COD_LINEA in ('01','03','04','05') AND COD_NRBE_EN = '3159' AND TRUNC(MI_FECHA_FIN_MES) >=TO_DATE('2010-01-01', 'YYYY-MM-DD')" \


NoteIf you are issuing the query wrapped with double quotes ("), you will have to use \$CONDITIONS instead of just $CONDITIONS to disallow your shell from treating it as a shell variable. For example, a double quoted query may look like: "SELECT * FROM x WHERE a='foo' AND \$CONDITIONS"

Many thanks all of you.