HANDLING CASE SENSITIVE PARTITIONS FOLDER IN HIVE

Explorer

Hi everyone,

I have Hive partition folders at an HDFS location, but all the partition folder names are in upper case, for example:

YEAR=2021/MONTH=07/DAY=31/HOUR=00

When I create the table in Hive, it takes the partition columns in lower case:

/year=2021/month=07/day=31/hour=00

Since HDFS is case sensitive and Hive is case insensitive, Hive expects the partition columns in lower case at the HDFS location, so I cannot see any partitions in my Hive table.

Is there any way to handle this, either by keeping the Hive columns in upper case or by recursively renaming all the HDFS partition folders to lower case? I have 8000+ partitions for one year (24 hours * 30 days * 12 months = 8640), so I cannot rename every folder manually.

Could someone kindly suggest a solution?

2 REPLIES

Master Collaborator

@Rohan44 

 

A simple one-liner could help here:

for i in $(hdfs dfs -ls -R /tmp/ | awk '{print $8}' | grep '[A-Z]'); do hdfs dfs -mv "$i" "$(echo "$i" | tr 'A-Z' 'a-z')"; done

 

In this example, the directories with upper-case names are under /tmp:

/tmp/MONTH=07/DAY=31/HOUR=00

Because this uses a simple `mv`, once a parent directory has been renamed, the renames of its child directories fail, since the paths that were listed for them no longer exist. So you might see 'No such file or directory' errors. Run the same command again, about once per level of your partition directories, until no upper-case paths remain. Run it as the hdfs superuser.
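
The repeated runs could probably be avoided by renaming the deepest directories first and changing only the last path component, so that a parent is never renamed before its children. A minimal sketch, assuming partition directories are named KEY=value and contain no whitespace; the function names are hypothetical, and this has not been run against a real cluster:

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any Hadoop tool.

# Lower-case only the column name (the text before the first '=') of the
# last path component; the parent path and the value are left untouched.
lower_partition_key() (
  path=$1
  parent=${path%/*}
  leaf=${path##*/}
  key=${leaf%%=*}
  value=${leaf#*=}
  printf '%s/%s=%s\n' "$parent" \
    "$(printf '%s' "$key" | tr '[:upper:]' '[:lower:]')" "$value"
)

# Reorder paths read from stdin so that the deepest ones come first.
deepest_first() {
  awk -F/ '{ print NF "\t" $0 }' | sort -rn | cut -f2-
}

# Rename every directory under $1 whose last component has an
# upper-case letter in its column name, children before parents.
rename_partitions() {
  hdfs dfs -ls -R "$1" |
    awk '$1 ~ /^d/ { print $8 }' |
    grep -E '/[^/=]*[[:upper:]][^/=]*=[^/]*$' |
    deepest_first |
    while read -r dir; do
      hdfs dfs -mv "$dir" "$(lower_partition_key "$dir")" </dev/null
    done
}
```

With these functions loaded, `rename_partitions /tmp` would rename HOUR=00 first, then DAY=31, then MONTH=07. Unlike the one-liner, this leaves data file names and partition values in their original case.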

 

Master Collaborator

@Rohan44 Please test this command on sample data before you run it on the actual data. You could also take a backup of the HDFS data, to be safe.
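
One way to test it safely is a dry run that only prints the renames instead of executing them. A minimal sketch, assuming paths contain no whitespace; `plan_renames` is a hypothetical helper name:

```shell
#!/bin/sh
# Dry run: read paths from stdin and print the 'hdfs dfs -mv' command the
# one-liner would run for each path that contains upper-case letters.
# Nothing on HDFS is modified.
plan_renames() {
  while read -r src; do
    dst=$(printf '%s' "$src" | tr 'A-Z' 'a-z')
    if [ "$src" != "$dst" ]; then
      printf 'hdfs dfs -mv %s %s\n' "$src" "$dst"
    fi
  done
}
```

It could be fed the same listing as the one-liner, for example `hdfs dfs -ls -R /tmp/ | awk '{print $8}' | plan_renames`. Reviewing the output also shows that the one-liner lower-cases the whole path, including partition values and data file names, not just the column names.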