Created 10-13-2016 03:16 AM
Hi. I want to migrate some Hive tables from the Prod cluster to the Dev cluster, and I am doing it like this:
# export the Hive table to a temp directory
# distcp the temp directory to a temp directory on the target cluster
# import the temp directory into the Hive database
#01 hdfs@HADOOP:/root> hadoop fs -mkdir /apps/hive/warehouse/sankar5_dir
#02 export table db_c720_dcm.network_matchtables_act_creative to 'apps/hive/warehouse/sankar5_dir';
#03 hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar5_dir hdfs://xx.xx.xx.xx//apps/hive/warehouse/sankar5_dir
FAILED: SemanticException [Error 10027]: Invalid path (at step 3)
I can import on the source cluster, but after distcp I can't import on the destination cluster.
Created 10-13-2016 04:03 AM
Can you check in which directory the result of the export table command is stored?
export table db_c720_dcm.network_matchtables_act_creative to 'apps/hive/warehouse/sankar5_dir';
Actually, when you execute the above command with a relative path (no leading slash), the final data will be written to the /user/<user_name>/apps/hive/warehouse/sankar5_dir directory in HDFS (of course, it will need to be writable by the current user).
So, please make sure the path exists in the expected directory before executing the distcp command.
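The path resolution described above can be sketched in shell. This is only an illustration, assuming the default HDFS home directory of /user/<user_name>; the function name resolve_hdfs_path is hypothetical and is not part of Hadoop or Hive.

```shell
#!/bin/sh
# Hypothetical helper (not a Hadoop/Hive command): shows how a relative
# HDFS path is resolved against the user's home directory, which is
# assumed here to be /user/<user_name>.
resolve_hdfs_path() {
    user="$1"
    path="$2"
    case "$path" in
        /*) printf '%s\n' "$path" ;;                  # absolute: used as-is
        *)  printf '/user/%s/%s\n' "$user" "$path" ;; # relative: under home dir
    esac
}

# The relative path from the export command, run as user hdfs:
resolve_hdfs_path hdfs 'apps/hive/warehouse/sankar5_dir'
# prints /user/hdfs/apps/hive/warehouse/sankar5_dir
```

Using a leading slash in the export command ('/apps/hive/warehouse/sankar5_dir') would avoid the surprise, since an absolute path is used as-is.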
Created 10-13-2016 04:15 AM
Thank you so much, @Ayub Pathan.
I have the following in the user directory:
hdfs@HADOOP:/root> hadoop fs -ls /user/hdfs/apps/hive/warehouse/sankar5_dir
Found 2 items
-rw-r--r-- 3 hdfs hdfs 1882 2016-10-12 17:34 /user/hdfs/apps/hive/warehouse/sankar5_dir/_metadata
drwxr-xr-x - hdfs hdfs 0 2016-10-12 17:34 /user/hdfs/apps/hive/warehouse/sankar5_dir/data
I was able to import on the source cluster, but I could not import on the destination cluster after distcp.
Created 10-13-2016 04:27 AM
After distcp, do you see the same directory structure in target cluster? If yes, you should be able to import on target cluster as well.
Created 10-13-2016 05:42 AM
No, I can't see this directory. May I know the reason for this? Please help me resolve this issue.
Created 10-13-2016 06:04 AM
@rama Are the source and target clusters running the same HDFS version? If not, use the command below.
hadoop distcp webhdfs://namenode1:<port>/source/dir webhdfs://namenode2:<port>/destination/dir
If you are using webhdfs, the NameNode URI and the NameNode HTTP port must be provided for both the source and the destination.
Also make sure to provide absolute paths when using distcp (https://hadoop.apache.org/docs/r1.2.1/distcp.html).
In the original question, I also noticed that you are not using a port number in the target cluster URL:
hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar5_dir hdfs://xx.xx.xx.xx:<port>/apps/hive/warehouse/sankar5_dir
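Putting the two points together (explicit port on both sides, absolute path), the command could be assembled roughly like this. It is a sketch only: build_distcp_cmd is a hypothetical helper that prints the command without running it, SRC_NN and DST_NN are placeholder NameNode hosts, port 8020 on the target is an assumption, and the path is the export location reported earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a distcp command that uses an
# explicit NameNode port on both sides and an absolute HDFS path.
build_distcp_cmd() {
    src_host="$1"; dst_host="$2"; port="$3"; path="$4"
    case "$path" in
        /*) ;;  # absolute path: OK
        *)  echo "error: path must be absolute: $path" >&2; return 1 ;;
    esac
    printf 'hadoop distcp hdfs://%s:%s%s hdfs://%s:%s%s\n' \
        "$src_host" "$port" "$path" "$dst_host" "$port" "$path"
}

# SRC_NN / DST_NN are placeholders; 8020 on the target is assumed.
build_distcp_cmd SRC_NN DST_NN 8020 /user/hdfs/apps/hive/warehouse/sankar5_dir
```

Review the printed command before running it; the helper deliberately refuses relative paths so that the source location is never left to home-directory resolution.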
Created 10-13-2016 06:20 AM
Both are running the same version, HDP 2.1.2.
I forgot to mention the port, but it is 8020 on both clusters.
export table db_c720_dcm.network_matchtables_act_ad to 'apps/hive/warehouse/sankar7_dir';
and I can see sankar7_dir at /user/hdfs/apps/hive/warehouse/sankar7_dir on the source cluster.
hadoop distcp hdfs://xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir hdfs://yy.yy.yy.yy:8020/apps/hive/warehouse/sankar7_dir
16/10/13 01:01:05 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir], targetPath=hdfs://yy.yy.yy.yy4:8020/apps/hive/warehouse/sankar7_dir}
16/10/13 01:01:05 INFO client.RMProxy: Connecting to ResourceManager at stlts8711/39.0.8.13:8050
16/10/13 01:01:06 ERROR tools.DistCp: Invalid input:
org.apache.hadoop.tools.CopyListing$InvalidInputException: hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir doesn't exist
    at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:84)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:80)
    at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:327)
    at org.apache.hadoop.tools.DistCp.execute(DistCp.java:151)
    at org.apache.hadoop.tools.DistCp.run(DistCp.java:118)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.tools.DistCp.main(DistCp.java:375)
hdfs:///xx.xx.xx.xx:8020/apps/hive/warehouse/sankar7_dir doesn't exist
As you can see, I get this error when running distcp without creating sankar7_dir first. But I did export the table to that directory:
export table db_c720_dcm.network_matchtables_act_ad to 'apps/hive/warehouse/sankar7_dir';