A particular set of data is missing from the HDFS directory, but it still shows up in the Hive table

I'm curious why the data is missing from the HDFS directory. I get the error below when trying to read the partition list from the table.

Error occurred executing hive query: Error while compiling statement: FAILED: SemanticException java.io.FileNotFoundException: File hdfs://KSD01/tera/WW=0/test=JM does not exist.

However, the partition is still listed in the output of the command "show partitions <table_name>".

1 REPLY

It looks like the data was removed from HDFS, but no drop partition query was executed on the Hive side. The 'show partitions <table_name>' command reads from the Hive metastore database, which holds the table's partition metadata but not the actual data; the data itself is stored in HDFS. That is why the partition is still listed even though its directory no longer exists. As for why the data is missing from HDFS, it is worth checking hdfs-audit.log to trace the lifecycle of the directory/file.
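
To make the metastore/HDFS mismatch concrete, here is a minimal Python sketch that compares the partition specs reported by 'show partitions' against the directories that actually exist, and builds a drop statement for each stale entry. Everything here is an assumption for illustration: the helper names, the table name my_table, the table location hdfs://KSD01/tera (inferred from the path in the error), and the idea that partitions sit in the default <table_location>/<key>=<value>/... layout with directory names matching the spec exactly. The existence check is passed in as a function; on a real cluster it could wrap something like 'hdfs dfs -test -d <path>'.

```python
from typing import Callable, Iterable, List


def partition_dir(table_location: str, partition_spec: str) -> str:
    """Join the table location and a 'k1=v1/k2=v2' spec into a directory path."""
    return table_location.rstrip("/") + "/" + partition_spec.strip("/")


def find_stale_partitions(
    table_location: str,
    partition_specs: Iterable[str],
    path_exists: Callable[[str], bool],
) -> List[str]:
    """Return specs listed by the metastore whose directory does not exist."""
    return [
        spec
        for spec in partition_specs
        if not path_exists(partition_dir(table_location, spec))
    ]


def drop_partition_statement(table_name: str, partition_spec: str) -> str:
    """Build an ALTER TABLE ... DROP PARTITION statement for one spec.

    Values are quoted as strings; adjust if the partition columns need
    different literals (assumption, not taken from the original thread).
    """
    pairs = (part.split("=", 1) for part in partition_spec.strip("/").split("/"))
    clause = ", ".join(f"{key}='{value}'" for key, value in pairs)
    return f"ALTER TABLE {table_name} DROP IF EXISTS PARTITION ({clause});"


if __name__ == "__main__":
    # Hypothetical data: the set stands in for a real HDFS existence check.
    existing_dirs = {"hdfs://KSD01/tera/WW=1/test=JM"}
    specs = ["WW=0/test=JM", "WW=1/test=JM"]  # as listed by 'show partitions'
    stale = find_stale_partitions(
        "hdfs://KSD01/tera", specs, existing_dirs.__contains__
    )
    for spec in stale:
        print(drop_partition_statement("my_table", spec))
```

Review the generated statements before running them in Hive, since dropping the partition removes the metastore entry for good. If the directory is expected to come back (for example after restoring the data), leaving the partition in place may be the better choice.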