Created 09-22-2021 12:52 AM
Hi Everyone,
I have Hive partition folders at an HDFS location, but all the partition folder names are in upper case,
i.e. YEAR=2021/MONTH=07/DAY=31/HOUR=00.
When I create the table in Hive, it takes the partition columns in lowercase,
i.e. /year=2021/month=07/day=31/hour=00.
Since HDFS is case sensitive and Hive is case insensitive, Hive expects the partition columns in lowercase at the HDFS location, so I am not able to see any partitions in my Hive table.
Is there any way to handle this case, either by keeping the Hive columns in upper case or by recursively changing all the HDFS partition folder names to lowercase? I have 8000+ partitions (24 hours x 30 days x 12 months = 8640 for 1 year), so I cannot rename every folder manually.
Could someone kindly suggest a solution?
Created 09-22-2021 06:17 AM
A simple one-liner could help here:
for i in $(hdfs dfs -ls -R /tmp/ | awk '{print $8}' | grep '[A-Z]'); do hdfs dfs -mv "$i" "$(echo "$i" | tr 'A-Z' 'a-z')"; done
In this example I have directories with upper case names under /tmp.
/tmp/MONTH=07/DAY=31/HOUR=00
As I am using a simple `mv`, once it renames a parent directory, the child paths from the original listing no longer exist, so the renames for those children fail.
So you might see 'No such file or directory' errors. Just run the same command again until no upper case paths remain, roughly once per level of your partition directories. Run it as the hdfs superuser.
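A possible variation that avoids the repeated runs, as a rough sketch only: rename the deepest directories first and lowercase just the last path component, so a parent path never changes before its children have been processed. The function names `lower_last_component` and `rename_partitions` are made up for illustration, the code assumes paths without whitespace, and I have not run it against a real cluster.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; paths are assumed to contain no whitespace.

# Print the given path with only its last component lowercased.
# Note: this lowercases partition values too (e.g. COUNTRY=US -> country=us).
lower_last_component() {
  local path="$1"
  local parent="${path%/*}"
  local name="${path##*/}"
  printf '%s/%s\n' "$parent" "$(printf '%s' "$name" | tr 'A-Z' 'a-z')"
}

# Rename directories under the given root whose last component contains
# upper case letters, deepest directories first.
rename_partitions() {
  local root="$1" src dst
  hdfs dfs -ls -R "$root" \
    | awk '$1 ~ /^d/ {print $8}' \
    | awk -F/ '{print NF "\t" $0}' | sort -rn | cut -f2- \
    | while read -r src; do
        dst="$(lower_last_component "$src")"
        if [ "$src" != "$dst" ]; then
          # </dev/null keeps the command from consuming the loop's input
          hdfs dfs -mv "$src" "$dst" </dev/null
        fi
      done
}
```

It would be called as `rename_partitions /tmp` for the example layout above. After the directories are renamed, the partitions still have to be registered in the Hive metastore, for example with `MSCK REPAIR TABLE`.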
Created 09-22-2021 06:19 AM
@Rohan44 Please test this command before you run it on the actual data. You could also take a backup of the HDFS data, to be safe.
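One way to test it, as a minimal sketch: print the planned renames without touching HDFS. The `plan_renames` name is made up for illustration; it reads `hdfs dfs -ls -R` output on stdin and prints the `hdfs dfs -mv` commands that the one-liner would issue, assuming paths without whitespace.

```shell
# Sketch only: plan_renames is a hypothetical helper, not part of any HDFS tooling.
# Reads `hdfs dfs -ls -R` output on stdin and prints the mv commands without running them.
plan_renames() {
  awk '{print $8}' | grep '[A-Z]' | while read -r src; do
    printf 'hdfs dfs -mv %s %s\n' "$src" "$(printf '%s' "$src" | tr 'A-Z' 'a-z')"
  done
}
```

For example, `hdfs dfs -ls -R /tmp/ | plan_renames` would list the renames for review. Trying the real command first on a small copy of the data (made with `hdfs dfs -cp` into a scratch directory of your choice) is another option.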