Hive table migration from one cluster to another using distcp: FAILED: SemanticException [Error 10027]: Invalid path on import statement in target cluster

Expert Contributor

Hi, I want to migrate some Hive tables from the Prod cluster to the Dev cluster, so I am doing the following:

# Export the Hive table to a temp directory

# distcp the temp directory to a temp directory in the target cluster

# Import the temp directory into the Hive database

#01 hdfs@HADOOProot> hadoop fs -mkdir /apps/hive/warehouse/sankar5_dir

#02 export table db_c720_dcm.network_matchtables_act_creative to 'apps/hive/warehouse/sankar5_dir';

#03 hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar5_dir hdfs://xx.xx.xx.xx//apps/hive/warehouse/sankar5_dir

The third step (the import) fails with: FAILED: SemanticException [Error 10027]: Invalid path

I can import on the source cluster, but after distcp I cannot import on the destination cluster.

1 ACCEPTED SOLUTION

@rama

Can you check which directory the export table command actually wrote its result to?

export table db_c720_dcm.network_matchtables_act_creative to 'apps/hive/warehouse/sankar5_dir';

The path in the above command has no leading slash, so it is relative. When you execute it, the final data is written to the /user/<user_name>/apps/hive/warehouse/sankar5_dir directory in HDFS (of course, it needs to be writable by the current user).

So please make sure the path exists in the expected directory before executing the distcp command.
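A minimal sketch of the path resolution described above, assuming the HDFS home directory is the default /user/<user_name>. The helper name resolve_hdfs_path is hypothetical and only mimics the rule; it does not call Hadoop.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or Hive): mimics how a path
# without a leading slash is resolved against the user's HDFS home
# directory, assumed here to be /user/<user_name>.
resolve_hdfs_path() {
  user="$1"
  path="$2"
  case "$path" in
    /*) printf '%s\n' "$path" ;;                 # absolute: used as-is
    *)  printf '/user/%s/%s\n' "$user" "$path" ;; # relative: under home dir
  esac
}

# Path as written in the export statement (relative).
resolve_hdfs_path hdfs 'apps/hive/warehouse/sankar5_dir'
# Same path with a leading slash (absolute).
resolve_hdfs_path hdfs '/apps/hive/warehouse/sankar5_dir'
```

The first call prints /user/hdfs/apps/hive/warehouse/sankar5_dir and the second prints /apps/hive/warehouse/sankar5_dir, which is why the directory created in step #01 stays empty.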

Expert Contributor

Thank you so much, @Ayub Pathan

I have the following in the user directory:

hdfs@HADOOP:/root> hadoop fs -ls /user/hdfs/apps/hive/warehouse/sankar5_dir

Found 2 items

-rw-r--r-- 3 hdfs hdfs 1882 2016-10-12 17:34 /user/hdfs/apps/hive/warehouse/sankar5_dir/_metadata

drwxr-xr-x - hdfs hdfs 0 2016-10-12 17:34 /user/hdfs/apps/hive/warehouse/sankar5_dir/data

I am able to import on the source cluster, but not on the destination cluster after distcp.

avatar

After distcp, do you see the same directory structure in target cluster? If yes, you should be able to import on target cluster as well.

Expert Contributor

@Ayub Pathan

No, I can't see this directory on the target cluster. May I know the reason for this? Please help me resolve this issue.

avatar

@rama Are the source and target clusters running the same HDFS version? If not, use the command below.

hadoop distcp webhdfs://namenode1:<port>/source/dir webhdfs://namenode2:<port>/destination/dir

If you are using webhdfs, provide the NameNode URI and the NameNode HTTP port in both the source and the destination.

Also make sure to provide absolute paths when using distcp (https://hadoop.apache.org/docs/r1.2.1/distcp.html).

In the original question, I also noticed that you are not using a port number in the target cluster URL:

hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar5_dir hdfs://xx.xx.xx.xx:<port>//apps/hive/warehouse/sankar5_dir
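As a rough illustration of the advice above, the sketch below assembles fully qualified HDFS URIs with an explicit port and refuses a relative path. The helper hdfs_uri and the host names source-nn and target-nn are hypothetical placeholders; the script only prints the distcp command and does not run it.

```shell
#!/bin/sh
# Hypothetical helper: assemble a fully qualified HDFS URI from a NameNode
# host, an RPC port, and an absolute path. Rejects relative paths.
hdfs_uri() {
  host="$1"
  port="$2"
  path="$3"
  case "$path" in
    /*) ;;  # absolute path: fine
    *)  echo "path must be absolute: $path" >&2; return 1 ;;
  esac
  printf 'hdfs://%s:%s%s\n' "$host" "$port" "$path"
}

# source-nn and target-nn are placeholder host names (assumptions).
SRC=$(hdfs_uri source-nn 8020 /user/hdfs/apps/hive/warehouse/sankar5_dir)
DST=$(hdfs_uri target-nn 8020 /user/hdfs/apps/hive/warehouse/sankar5_dir)

# Print the command rather than executing it.
echo "hadoop distcp $SRC $DST"
```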

Expert Contributor

@Ayub Pathan

Both clusters are running the same version, HDP 2.1.2.

I forgot to mention the port, but it is 8020 on both clusters.

export table db_c720_dcm.network_matchtables_act_ad to 'apps/hive/warehouse/sankar7_dir';

After this I can see sankar7_dir at /user/hdfs/apps/hive/warehouse/sankar7_dir in the source cluster.

hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir hdfs://yy.yy.yy.yy:8020/apps/hive/warehouse/sankar7_dir

16/10/13 01:01:05 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir], targetPath=hdfs://yy.yy.yy.yy4:8020/apps/hive/warehouse/sankar7_dir}

16/10/13 01:01:05 INFO client.RMProxy: Connecting to ResourceManager at stlts8711/39.0.8.13:8050

16/10/13 01:01:06 ERROR tools.DistCp: Invalid input:

org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir doesn't exist
at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:80)
at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:327)
at org.apache.hadoop.tools.DistCp.execute(DistCp.java:151)
at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.tools.DistCp.main(DistCp.java:375)

As you can see, distcp fails with "hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir doesn't exist" when I run it without creating sankar7_dir first. But I did export the table to this directory:

export table db_c720_dcm.network_matchtables_act_ad to 'apps/hive/warehouse/sankar7_dir';
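One way to read the error above: the distcp source is /apps/hive/warehouse/sankar7_dir, while the relative export path landed under /user/hdfs. The dry-run sketch below prints a possible command sequence that uses the resolved path end to end. The host names source-nn and target-nn are placeholders, and the sketch assumes the export was run as user hdfs, as in the listing earlier in the thread.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# source-nn and target-nn are placeholder NameNode hosts (assumptions).
EXPORT_DIR=/user/hdfs/apps/hive/warehouse/sankar7_dir
SRC_NN=hdfs://source-nn:8020
DST_NN=hdfs://target-nn:8020

plan() {
  # 1. Confirm the export exists where the relative path resolved to.
  echo "hadoop fs -ls ${SRC_NN}${EXPORT_DIR}"
  # 2. Copy it to the same absolute path on the target cluster.
  echo "hadoop distcp ${SRC_NN}${EXPORT_DIR} ${DST_NN}${EXPORT_DIR}"
  # 3. On the target cluster, import from that absolute path.
  echo "hive -e \"import from '${EXPORT_DIR}';\""
}

plan
```

Alternatively, exporting to an absolute path such as '/apps/hive/warehouse/sankar7_dir' (with the leading slash) would make the original distcp source path valid.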