Can I place the OS for the nodes of a Hadoop cluster on a SAN while the Hadoop data and components reside on local disks?
- Labels: Apache Hadoop
Created 01-19-2016 07:25 PM
I have a Hadoop cluster in which each node sits on a 2 x 8GB fabric interconnect (48 ports in one rack), and each server has its own dedicated 10 GB NIC.
To save space on each node, I would like to put the OS on a SAN backed by this Cisco UCS interconnect. All Hadoop data would be stored locally on DAS on the DataNodes (JBOD).
All master node (and edge node) disks would be RAID and would hold the master components.
Only the OS would be on a SAN instead of locally.
Are there any issues with this?
Created 01-19-2016 07:27 PM
I see no issue with having the OS on a SAN-backed disk, as long as there is strong bandwidth between the servers and the SAN (in your case there is).
On a separate note: Cisco doc
Created 01-19-2016 11:14 PM
The tricky part is the single point of failure (SPOF) if the SAN goes down.
Created 01-19-2016 07:39 PM
I would not put the OS on the SAN. Where would the OS cache be configured? This is not usually done; what are the benefits of putting the OS on a SAN? It is an interesting thought, and if you do try it out, please share the results.
Created 02-02-2016 04:27 PM
@Ancil McBarnett accept best answer
Created 07-19-2016 06:02 AM
Application-related data and logs can be stored on SAN/NAS.
However, SAN/NAS is not recommended for I/O-sensitive or CPU-bound jobs; the point is to avoid bottlenecks when reading data from disk or over the network, or when processing it.
So, for logs/application data --> SAN/NAS
DataNode data --> DAS in a JBOD configuration, no RAID
NN/SN/JT nodes --> should be highly available [RAID 5/10, depending on the use case]
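To make the JBOD layout above concrete, here is a minimal Python sketch that builds the values for two HDFS properties, `dfs.datanode.data.dir` (one directory per physical disk) and `dfs.datanode.failed.volumes.tolerated`. The mount points `/grid/0` to `/grid/3`, the `hadoop/hdfs/data` subdirectory, and the helper names are assumptions for illustration only, not anything from the original poster's cluster.

```python
def datanode_data_dirs(mount_points, subdir="hadoop/hdfs/data"):
    """Return one HDFS data directory per JBOD disk, comma-separated."""
    return ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)


def hdfs_site_properties(mount_points, tolerated_failures=1):
    """Build hdfs-site.xml property values for a JBOD DataNode (hypothetical helper)."""
    # A DataNode cannot tolerate losing every volume it has.
    if not 0 <= tolerated_failures < len(mount_points):
        raise ValueError("tolerated_failures must be >= 0 and < number of disks")
    return {
        "dfs.datanode.data.dir": datanode_data_dirs(mount_points),
        "dfs.datanode.failed.volumes.tolerated": str(tolerated_failures),
    }


# Assumed layout: four independently mounted disks, no RAID.
mounts = [f"/grid/{i}" for i in range(4)]
props = hdfs_site_properties(mounts)
for name, value in props.items():
    print(f"{name} = {value}")
```

Because each disk is listed separately, HDFS spreads blocks across them itself and a single failed disk only takes out the blocks on that disk, which is why RAID is not needed on the DataNodes.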
Hadoop is a scale-out, shared-nothing architecture.
http://www.bluedata.com/blog/2015/12/separating-hadoop-compute-and-storage/
I also understand that the true cost of DAS is sometimes higher once Hadoop replication is taken into account, but this is how Hadoop thrives: one of its key tenets is to bring the compute to the storage instead of the storage to the compute.
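As a rough illustration of the replication cost mentioned above, the following Python sketch estimates the raw DAS capacity needed for a given amount of data. It assumes the HDFS default replication factor of 3; the 25% headroom for temporary and intermediate data and the function name are my own assumptions, not a sizing rule from this thread.

```python
def raw_capacity_tb(usable_tb, replication=3, headroom=0.25):
    """Estimate raw disk (TB) needed to hold `usable_tb` of data in HDFS.

    Each block is stored `replication` times, and a `headroom` fraction of
    the raw disk is kept free (an assumed figure, adjust for your workload).
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return usable_tb * replication / (1 - headroom)


# 100 TB of data at 3x replication with 25% headroom.
print(raw_capacity_tb(100))  # 400.0
```

So the raw disk to buy is several times the logical data size, which is the "true cost" trade-off you accept in exchange for data locality.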