Member since
04-05-2019
2
Posts
0
Kudos Received
0
Solutions
04-12-2019
08:08 PM
Hi, On top of what Gomath has suggested, I also have another idea: 1. assuming that your current data on HDFS is /user/hive/wareshouse/dbname/tablename/columnA=1 /user/hive/wareshouse/dbname/tablename/columnA=2 /user/hive/wareshouse/dbname/tablename/columnA=3 /user/hive/wareshouse/dbname/tablename/columnA=4 .... and you want it to become: /user/hive/wareshouse/dbname/tablename/columnA=1/columnB=1 /user/hive/wareshouse/dbname/tablename/columnA=2/columnB=1 /user/hive/wareshouse/dbname/tablename/columnA=3//columnB=1 /user/hive/wareshouse/dbname/tablename/columnA=4/columnB=1 ... 2. create a new table that has the same columns, but with extra partition columnB, assuming that you will create under HDFS in /user/hive/wareshouse/dbname/new_tablename 3. you can create a new directory with below structure: /user/hive/wareshouse/dbname/new_tablename/columnA=1 /user/hive/wareshouse/dbname/new_tablename/columnA=2 /user/hive/wareshouse/dbname/new_tablename/columnA=3 /user/hive/wareshouse/dbname/new_tablename/columnA=4 ... 4. create the top level partition directories for the new table hadoop fs -mkdir /user/hive/wareshouse/dbname/new_tablename/columnA=1 hadoop fs -mkdir /user/hive/wareshouse/dbname/new_tablename/columnA=2 hadoop fs -mkdir /user/hive/wareshouse/dbname/new_tablename/columnA=3 hadoop fs -mkdir /user/hive/wareshouse/dbname/new_tablename/columnA=4 5. and then MOVE data from old table into new table's partition: hadoop fs -mv /user/hive/wareshouse/dbname/tablename/columnA=1 /user/hive/wareshouse/dbname/new_tablename/columnA=1/columnB=1 hadoop fs -mv /user/hive/wareshouse/dbname/tablename/columnA=2 /user/hive/wareshouse/dbname/new_tablename/columnA=2/columnB=1 hadoop fs -mv /user/hive/wareshouse/dbname/tablename/columnA=3 /user/hive/wareshouse/dbname/new_tablename/columnA=3/columnB=1 hadoop fs -mv /user/hive/wareshouse/dbname/tablename/columnA=4 /user/hive/wareshouse/dbname/new_tablename/columnA=4/columnB=1 .... What's the value for columnB you will decide, since it is a new partition. 6. get back to hive and then run below command: MSCK REPAIR TABLE dbname.new_tablename; This will help to recover all the missing partitions 7. Run: SELECT * FROM dbname.new_tablename to confirm if data is correct. This way, you do not need to duplicate any data on HDFS. Hope above helps.
... View more