A particular set of data is missing from the HDFS directory, but still appears in the Hive table

Just curious why the data is missing from the HDFS directory. I see the error below when trying to get the partition list from the table:

Error occurred executing hive query: Error while compiling statement: FAILED: SemanticException java.io.FileNotFoundException: File hdfs://KSD01/tera/WW=0/test=JM does not exist.

However, the partition still appears in the output of the "show partitions <table_name>" command.

1 REPLY

Re: a particular set of data is missing in HDFS directory, but the data can be found in hive table

It looks like the data was removed from HDFS, but no DROP PARTITION query was executed on the Hive side. The 'show partitions <table_name>' command reads from the Hive metastore database, which holds the table's partition metadata but not the actual data; the data itself is stored in HDFS. That is why the partition is still listed even though its directory is gone. As for why the data is missing from HDFS, it might be worth checking hdfs-audit.log to trace the lifecycle of the directory/file.
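To make the two checks above concrete, here is a rough Python sketch, not a definitive tool. It assumes you have already collected the partition specs (for example from 'show partitions <table_name>'), a listing of the directories that actually exist, and some lines from hdfs-audit.log. The function names, the table name my_table, and the sample values are hypothetical. It also assumes partitions live under the default <table location>/<col>=<value>/... layout (compared case-sensitively), that audit entries carry key=value fields such as cmd= and src=, and that Hive will accept the partition values quoted as strings.

```python
import re

# Matches key=value fields; assumes the usual hdfs-audit.log layout
# (allowed=..., ugi=..., ip=..., cmd=..., src=..., dst=..., perm=...).
AUDIT_FIELD = re.compile(r"(\w+)=(\S+)")


def stale_partitions(metastore_specs, existing_dirs, table_location):
    """Return partition specs whose directory is absent from the listing.

    metastore_specs: strings such as 'WW=0/test=JM'
    existing_dirs:   directory paths that really exist under the table
    table_location:  root directory of the table, e.g. '/tera'
    """
    base = table_location.rstrip("/")
    existing = {d.rstrip("/") for d in existing_dirs}
    return [s for s in metastore_specs if f"{base}/{s}" not in existing]


def drop_statements(table, specs):
    """Build one ALTER TABLE ... DROP PARTITION statement per stale spec."""
    statements = []
    for spec in specs:
        pairs = [part.split("=", 1) for part in spec.split("/")]
        # Assumption: every value is quoted as a string literal.
        clause = ", ".join(f"{key}='{value}'" for key, value in pairs)
        statements.append(
            f"ALTER TABLE {table} DROP IF EXISTS PARTITION ({clause});"
        )
    return statements


def removal_events(audit_lines, path):
    """Return audit lines that delete or rename `path` or one of its parents."""
    target = path.rstrip("/")
    hits = []
    for line in audit_lines:
        fields = dict(AUDIT_FIELD.findall(line))
        src = fields.get("src", "").rstrip("/")
        if fields.get("cmd") not in ("delete", "rename") or not src:
            continue
        # Removing a parent directory also removes the partition directory.
        if target == src or target.startswith(src + "/"):
            hits.append(line)
    return hits
```

For example, stale_partitions(["WW=0/test=JM"], [], "/tera") reports the partition from the error message as stale, and drop_statements("my_table", ...) turns it into a statement you can review before running it in Hive. Nothing here talks to Hive or HDFS directly, so gathering the inputs is left to whatever tooling you already use.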
