Support Questions


Does new data arriving in a cold storage folder on HDFS need to be moved by the Mover?

New Contributor

Hello,

We are using HDFS 2.7.3, and we have set the storage policy of one folder to COLD (ARCHIVE). We have restarted the dedicated services on the datanodes (we were not asked to restart the namenode services).

I know that we have to launch a Mover process to move the existing blocks that violate the storage policy. However, it looks like new data is not automatically written into cold storage and still needs the Mover.

How does it work? Did I miss anything? Is there a way to make the block placement automatic to the cold storage location?

What I am sure of: the storage policy is applied (checked with getStoragePolicy and the fsck command).
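For reference, those checks can be run from the command line like this (the path `/data/archive` is a hypothetical example; `hdfs storagepolicies` and the fsck `-storagepolicies` flag are the standard 2.7.x tools):

```shell
# Show the storage policy currently assigned to the directory
hdfs storagepolicies -getStoragePolicy -path /data/archive

# Check block placement against the assigned policy; fsck reports
# how many blocks satisfy or violate the storage policy
hdfs fsck /data/archive -storagepolicies
```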

Thank you. 

1 REPLY

Super Collaborator

Hi, @Hz 

In HDFS 2.7.3, setting a storage policy on a directory does not immediately place new blocks on the target storage type (e.g., ARCHIVE). New writes still go to the default storage (usually DISK), and the Mover process is required to relocate both existing and newly written blocks so that they comply with the policy.

The storage policy only records the desired storage type; actual enforcement happens through the Mover. This is expected behavior, and you did not miss any configuration. In 2.7.3 there is no way to bypass the Mover and force blocks to land directly on cold storage at write time. Later Hadoop versions introduced improvements here, but for your version, running the Mover is required.
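As a sketch, the workflow would look like the following (the path `/data/archive` and the periodic scheduling are assumptions for illustration; `hdfs storagepolicies -setStoragePolicy` and `hdfs mover` are the standard commands):

```shell
# Assign the COLD policy to the directory (only needs to be done once)
hdfs storagepolicies -setStoragePolicy -path /data/archive -policy COLD

# Run the Mover against that path so blocks (existing and newly
# written) are relocated onto ARCHIVE storage per the policy
hdfs mover -p /data/archive
```

Since new writes keep landing on DISK first, the Mover is typically scheduled periodically (e.g., from cron) so newly arrived data is migrated to cold storage on a regular cadence.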