
changing temp directory in sqoop import

Explorer

A ".temp" directory is created in the target directory while importing data using Sqoop. I want to change the location where the .temp/job_1569164528775_11264/mr/* directories are created.

 

command:

 

sqoop import -D yarn.app.mapreduce.am.staging-dir=/user/test/ --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" --connect "jdbc:sqlserver://ip:port;database=database;" --connection-manager "org.apache.sqoop.manager.SQLServerManager" --username <username> -password <password> --table 'tablename' --as-parquetfile --delete-target-dir --target-dir /user/database/test/ --m 1

 

result now:

 

hadoop fs -ls -R /user/database/test/
/user/database/test/.temp
/user/database/test/.temp/job_1569164528775_11264/mr/.....*

 

Expected result:

 

hadoop fs -ls -R /user/test/
/user/test/.temp/job_1569164528775_11264/mr/.....*

 

Tried the below properties as well (a sketch of how they can be passed follows the list):

mapred.system.dir
mapreduce.task.output.dir
mapreduce.task.tmp.dir
mapred.work.output.dir
mapreduce.cluster.temp.dir
mapreduce.cluster.local.dir
mapred.temp.dir
mapreduce.jobtracker.system.dir
hadoop.tmp.dir
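
A sketch of how such properties would be passed, assuming each is supplied as a generic -D option immediately after the tool name (the values here are illustrative; the rest of the command is as above):

sqoop import -D hadoop.tmp.dir=/user/test/tmp -D mapreduce.cluster.temp.dir=/user/test/tmp --connect "jdbc:sqlserver://ip:port;database=database;" ... --target-dir /user/database/test/ --m 1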

 

 

8 REPLIES

Re: changing temp directory in sqoop import

Mentor

@sow 

Have you tried changing your --target-dir from /user/database/test/ to /user/test/, like this?

$ sqoop import -D yarn.app.mapreduce.am.staging-dir=/user/test/ --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" --connect "jdbc:sqlserver://ip:port;database=database;" --connection-manager "org.apache.sqoop.manager.SQLServerManager" --username <username> -password <password> --table 'tablename' --as-parquetfile --delete-target-dir --target-dir /user/test/ --m 1

 

While running a Hive import, the --target-dir argument value controls where the data is stored temporarily before being loaded into the Hive table, but --target-dir doesn't create the Hive table in that location.

If you want to import to a specific directory, then use --target-dir without the --hive-import argument and create a Hive table on top of that HDFS directory, as sketched below.
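
A minimal sketch of that last step, run through the Hive CLI; the table name and columns here are placeholders, not taken from this thread:

hive -e "CREATE EXTERNAL TABLE tablename (col1 STRING, col2 INT) STORED AS PARQUET LOCATION '/user/database/test/';"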


HTH

Re: changing temp directory in sqoop import

Explorer

Hi @Shelton 

 

I am not using hive-import or trying to create hive table.

 

The issue here is the .temp directory that gets created at the --target-dir I specify.

The .temp directory will be deleted once the execution is complete, but my use case is to change the directory where it is created at run time.

 

 

Re: changing temp directory in sqoop import

Mentor

@sow 

 

You say you are not using hive-import or trying to create a Hive table, but in your command below I can see --target-dir:

(screenshot attachment: targetdir.PNG)

 Thanks

Re: changing temp directory in sqoop import

Explorer

@Shelton 

 

The --target-dir is specified so that the generated files are placed in that directory.

The problem I am facing is with the temporary directory (.temp) that gets created at run time (i.e. when the MapReduce job is initiated) at the --target-dir.

The solution I am looking for is a way to change that temporary directory (.temp) location.

Re: changing temp directory in sqoop import

Expert Contributor

Hi,

 

You can use the "--temporary-rootdir" argument to make sure that the temporary data goes into that folder.

 

Example:

sqoop import --target-dir /<hdfs path>/<import_data_dir> --temporary-rootdir /<dfs path>/<tmp_dir> ...
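
Applied to the command from the original post, that would look like this (all other arguments unchanged; only --temporary-rootdir is added):

sqoop import -D yarn.app.mapreduce.am.staging-dir=/user/test/ --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" --connect "jdbc:sqlserver://ip:port;database=database;" --connection-manager "org.apache.sqoop.manager.SQLServerManager" --username <username> -password <password> --table 'tablename' --as-parquetfile --delete-target-dir --target-dir /user/database/test/ --temporary-rootdir /user/test/ --m 1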

 

Regards

Nitish

Re: changing temp directory in sqoop import

Explorer

@Khanna 

 

I tried it, but the .temp is still getting created at the target directory that I specify.

 

 

Re: changing temp directory in sqoop import

Expert Contributor

Hi,

 

Can you please share the sqoop command that you are running?

 

Regards

Nitish

Re: changing temp directory in sqoop import

Explorer

@Khanna 

 

 

command used:
sqoop import --driver "com.microsoft.sqlserver.jdbc.SQLServerDriver" --connect "jdbc:sqlserver://server:port;database=db_name;" --connection-manager "org.apache.sqoop.manager.SQLServerManager" --username <> -password <> --table 'table_name' --as-parquetfile --delete-target-dir --target-dir /user/test/axe/ --temporary-rootdir /user/test2/ --m 4 --split-by user_id

 

before mapreduce starts:
hadoop fs -ls -R /user/test/
/user/test/axe
/user/test/axe/.metadata
/user/test/axe/.metadata/descriptor.properties
/user/test/axe/.metadata/schema.avsc
/user/test/axe/.metadata/schemas
/user/test/axe/.metadata/schemas/1.avsc

 

when the mapreduce job starts:
hadoop fs -ls -R /user/test/
/user/test/.temp
/user/test/.temp/job_1571067970221_0156
/user/test/.temp/job_1571067970221_0156/mr
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156/.metadata
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156/.metadata/descriptor.properties
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156/.metadata/schema.avsc
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156/.metadata/schemas
/user/test/.temp/job_1571067970221_0156/mr/job_1571067970221_0156/.metadata/schemas/1.avsc
/user/test/axe
/user/test/axe/.metadata
/user/test/axe/.metadata/descriptor.properties
/user/test/axe/.metadata/schema.avsc
/user/test/axe/.metadata/schemas
/user/test/axe/.metadata/schemas/1.avsc

 

once the import is complete:
hadoop fs -ls -R /user/test/
/user/test/axe
/user/test/axe/.metadata
/user/test/axe/.metadata/descriptor.properties
/user/test/axe/.metadata/schema.avsc
/user/test/axe/.metadata/schemas
/user/test/axe/.metadata/schemas/1.avsc
/user/test/axe/.signals
/user/test/axe/.signals/unbounded
/user/test/axe/679dadfc-3657-410b-a805-6c98b8d1720b.parquet
/user/test/axe/923bbf06-35d6-4156-8df3-a6e53ebf00f7.parquet
/user/test/axe/aef24eac-15dd-4ebc-a61a-68265d53a320.parquet
/user/test/axe/f903a079-87b3-48a0-bea6-aa02d92c5aac.parquet

 
