Member since 09-05-2024 · 2 Posts · 1 Kudos Received · 0 Solutions
06-25-2025 02:58 AM
Hello,

We are using HDFS 2.7.3, and we have set the storage policy of one folder to COLD (ARCHIVE). We have restarted the dedicated services on the DataNodes (the NameNode services were not asked to be restarted). I know that we have to launch a mover process to move the existing blocks that violate the storage policy. However, it looks like new data is not automatically written to the cold storage and still needs the mover.

How does it work? Did I miss anything? Is there a way to make block placement to the cold storage location automatic?

*What I am sure of: the storage policy is applied (checked with getStoragePolicy and the fsck command; the commands we used are sketched below).

Thank you.
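For reference, a rough sketch of the commands we used, where /data/archive is a placeholder for the actual folder:

# set and verify the storage policy on the folder
hdfs storagepolicies -setStoragePolicy -path /data/archive -policy COLD
hdfs storagepolicies -getStoragePolicy -path /data/archive

# check where the existing block replicas actually sit
hdfs fsck /data/archive -files -blocks -locations

# migrate already-written blocks that violate the policy
hdfs mover -p /data/archive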
Labels:
Hortonworks Data Platform (HDP)
09-06-2024 03:35 AM · 1 Kudo
Hello,

We are trying to use the hadoop distcp command to transfer data from an on-prem Hadoop cluster to GCS through a private endpoint. We have tested many different methods, and finally we added the following lines directly into /etc/hosts (other GCP auth info has been added into core-site.xml):

XX.XX.XX.XX storage.googleapis.com
XX.XX.XX.XX googleapis.com

*XX.XX.XX.XX is the IP of our private endpoint.

The hadoop fs -ls and -cp commands can list and copy objects to the GCS bucket correctly. But with the hadoop distcp command (roughly sketched below), its MapReduce jobs always go to the public endpoint... Does anyone know how to make distcp work with a GCS private endpoint? Or does anyone have an idea which endpoint the distcp jobs are trying to reach (so that I can add it into /etc/hosts)?

Thank you.
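For reference, a rough sketch of how we invoke distcp, where the source path, bucket name, and key file path are placeholders (the real auth settings sit in core-site.xml as noted above):

# hosts entries added via /etc/hosts (XX.XX.XX.XX = private endpoint IP)
# XX.XX.XX.XX storage.googleapis.com
# XX.XX.XX.XX googleapis.com

# distcp from HDFS to the GCS bucket; the GCS connector settings can also be passed inline with -D
hadoop distcp \
  -D fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem \
  -D google.cloud.auth.service.account.enable=true \
  -D google.cloud.auth.service.account.json.keyfile=/path/to/keyfile.json \
  hdfs:///data/source gs://my-bucket/target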
Labels:
MapReduce