Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant.

Datanode data directory permissions

Explorer
The documentation says to set the permissions on the DataNode data directories (owned by hdfs) to 700, yet YARN NodeManager containers run as the mapred user. How, then, do MapReduce programs access the blocks? As I understand it, the mapred user cannot even run ls -ltr ${dfs.datanode.data.dir}. Or is access somehow governed by the permissions set on the file within HDFS?

Thanks for the help!!

Regards,

Dev

1 ACCEPTED SOLUTION

Mentor
A few points:

1. Remote (non-short-circuit) block reads go through the DataNode's data transfer port (50010, or 1004 on secured clusters), not through the local filesystem, so HDFS clients of any form never need local filesystem permission on the block files. The DataNode reads the blocks itself and streams the data to the client. The same applies to writes.
2. Local short-circuit reads, if enabled and applicable, are done securely via Unix domain sockets: the DataNode opens the block file and passes the open file descriptor to the client, so the client never needs permission to open the block file path directly.
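The file-descriptor-passing mechanism in point 2 can be illustrated with a minimal Python sketch. This is plain SCM_RIGHTS descriptor passing over a Unix domain socket, not the actual DataNode wire protocol; the "datanode" and "client" roles here are simulated within one process via a socketpair standing in for the socket at dfs.domain.socket.path:

```python
import array
import os
import socket
import tempfile

def send_fd(sock, fd):
    # One byte of real payload, plus the descriptor as ancillary data.
    sock.sendmsg([b"x"],
                 [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                   array.array("i", [fd]))])

def recv_fd(sock):
    fds = array.array("i")
    msg, ancdata, flags, addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(cdata[:fds.itemsize])
    return fds[0]

# A socketpair stands in for the DataNode's Unix domain socket.
dn_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"block data")
    block_path = f.name

# "DataNode": open the block file and pass the open descriptor across.
fd = os.open(block_path, os.O_RDONLY)
send_fd(dn_side, fd)
os.close(fd)

# "Client": read through the received descriptor
# without ever opening the path itself.
received = recv_fd(client_side)
data = os.read(received, 1024)
print(data)  # b'block data'
os.close(received)
os.unlink(block_path)
```

Because the client only ever receives an already-open descriptor, the 700 permissions on ${dfs.datanode.data.dir} never get in its way; enabling this in HDFS is controlled by dfs.client.read.shortcircuit and dfs.domain.socket.path.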


2 REPLIES


Explorer

Thanks!