i have create an external table in Hive pointing to a location in HDFS. Initially to test i have defined all the columns as strings and i could see the data from files in the table. now when i examine the data most of the columns are measures , so i will change those to integers. but some measure like percentages have a % . like 90% , 2.89% in it value. so if i define that as decimal it will fail.?? i want to define these as measures..how can i do that.??
You could create a new table that contains clean data by running it through a query:
create table cleaned_table as select cast(replace(measure1, '%', '') as double) as measure1, cast(replace(measure=2, '%', '') as double) as measure2, ... from external_table;