I am trying to read an xlsx file and create a dataframe from it. I have attached the xlsx file. Can anybody provide some clues? The sheet has columns grouped by year, with sub-columns under each year. The end goal is to create a dataframe from it and store it in a Hive table for further analysis. If the year-level columns were not there, this would be straightforward. Any thoughts or help would be great. Here we have the year 2004 as an example, with 5 sub-columns under it; similarly, there will be many years, each with its corresponding sub-columns.
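
To make the layout concrete, here is a minimal sketch in plain Python of the two header rows as I understand them, and one possible way to combine them into flat column names. The sub-column names (`a` to `e`), the leading `id` column, the helper `flatten_headers`, and the `y` prefix are all made up for illustration; the real names are in the attached file. It assumes the merged year cell is read as one value followed by empty cells.

```python
def flatten_headers(year_row, sub_row, sep="_", prefix="y"):
    """Combine a year header row and a sub-column header row into flat names."""
    flat = []
    current_year = None
    for year, sub in zip(year_row, sub_row):
        # A merged year cell is assumed to show up as one value followed by
        # empty cells, so carry the last seen year forward.
        if year not in (None, ""):
            current_year = year
        if current_year is None:
            # Columns before the first year keep their own name.
            flat.append(str(sub))
        else:
            # The prefix keeps the flat name from starting with a digit.
            flat.append(f"{prefix}{current_year}{sep}{sub}")
    return flat


# Hypothetical header rows: one id column, then 5 sub-columns per year.
year_row = [None, 2004, None, None, None, None, 2005, None, None, None, None]
sub_row = ["id", "a", "b", "c", "d", "e", "a", "b", "c", "d", "e"]

columns = flatten_headers(year_row, sub_row)
print(columns)
```

With flat names like these, each year/sub-column pair would become an ordinary column, which is the shape I would expect to need before writing to a Hive table.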