Member since: 06-17-2015
Posts: 61
Kudos Received: 20
Solutions: 4
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2034 | 01-21-2017 06:18 PM |
| | 2472 | 08-19-2016 06:24 AM |
| | 1773 | 06-09-2016 03:23 AM |
| | 2984 | 05-27-2016 08:27 AM |
08-01-2016
07:09 AM
Thanks a lot, Kuldeep. I agree, and that's why I wanted suggestions from experts like you 🙂
08-01-2016
06:11 AM
1 Kudo
Hi Team, I have 3 virtual machines in an HDP cluster, and the DataNode disks have a lot of capacity (in TBs). Can I use the same disk, with different mount points, to store DataNode data, NameNode (NN) data, Secondary NameNode (SN) data, JobTracker (JT) data (master node data), as well as /usr and /var? I know that if the disk has an issue, all of that data will be affected.
Basically, since my DataNode disks have so much space, do you recommend creating different mounts on the same DataNode disks for different purposes, such as /usr and /var, and for storing NN, SN, and JT data? Also, each HDP version's data is in /usr/hdp.
Labels:
- Apache Hadoop
- Apache HBase
07-19-2016
06:02 AM
We can store application-related data and logs on SAN/NAS. However, SAN/NAS is not recommended for I/O-sensitive and CPU-bound jobs; the aim is to avoid bottlenecks when reading data from disk or over the network, or when processing data.
So:
- Logs/application data --> SAN/NAS
- DataNode data --> DAS in a JBOD configuration, no RAID
- NN/SN/JT nodes --> should be highly available [RAID 5/10, depending on the use case]
Hadoop is a scale-out, shared-nothing architecture.
http://www.bluedata.com/blog/2015/12/separating-hadoop-compute-and-storage/
https://community.emc.com/servlet/JiveServlet/previewBody/41473-102-1-132603/Virtualizing%20Hadoop%20in%20Large%20Scale%20Infrastructures.pdf
Also, I understand that the true cost of DAS can sometimes be higher once Hadoop replication is taken into account, but this is how Hadoop thrives. (One of the key tenets of Hadoop is to bring the compute to the storage instead of the storage to the compute.)
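To make the replication cost concrete, here is a rough sketch in Python. The names `STORAGE_BY_ROLE` and `raw_capacity_tb` are made up for illustration, the 10 TB figure is a hypothetical input, and the default of 3 is the stock HDFS replication factor; a real sizing exercise would also need headroom for temporary data and growth, which this sketch leaves out.

```python
# Storage guidance from the post above, encoded as a simple lookup table.
# The keys and the table name are illustrative, not part of any Hadoop API.
STORAGE_BY_ROLE = {
    "logs_and_application_data": "SAN/NAS",
    "datanode": "DAS, JBOD, no RAID",
    "master_nn_sn_jt": "RAID 5 or RAID 10, depending on use case",
}


def raw_capacity_tb(logical_tb: float, replication: int = 3) -> float:
    """Raw DAS capacity (TB) needed to hold logical_tb of HDFS data.

    Assumption: every block is stored `replication` times (HDFS default is 3)
    and nothing else is counted (no temp space, no growth headroom).
    """
    if logical_tb < 0:
        raise ValueError("logical_tb must be non-negative")
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return logical_tb * replication


if __name__ == "__main__":
    # Hypothetical example: 10 TB of logical data at the default replication.
    print(STORAGE_BY_ROLE["datanode"])
    print(raw_capacity_tb(10))
```

This is why DAS can look expensive on paper: the raw disk you buy is a multiple of the data you actually hold, even though each individual disk is cheap.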
07-19-2016
05:55 AM
@Sbandaru: I researched this more deeply, and the conclusion is that we don't need an edge node if the Hadoop cluster and the application are in the same network. It is only needed when the Hadoop cluster and the application are in different networks; in that case the edge node acts as a gateway to the Hadoop cluster (like a proxy). Thanks for your inputs.
07-16-2016
03:34 AM
1 Kudo
Hi Team,
We are going to deploy HDP 2.3.4 for a Big Env setup.
Can someone please explain the architecture of an edge node in Hadoop? I am able to find only the definition on the internet. I have some queries:
1) What is an edge node?
2) When and why do we need it?
3) Does every production cluster contain an edge node?
4) Is the edge node part of the cluster? (What advantages do we have if it is inside the cluster? Does it store any HDFS data blocks? Is there any performance improvement?)
5) Should the edge node be outside the cluster?
6) Please refer me to any docs where I can learn about it, preferably Hortonworks docs.
Labels:
- Apache Hadoop
07-13-2016
05:32 PM
will be helpful
07-13-2016
02:20 PM
The answer looks good. Thanks for your answer. Can you please advise how to decide on disk I/O in a cluster? Which factors should be considered for the disk I/O calculation?
07-13-2016
09:31 AM
1 Kudo
Hi Team, can someone kindly advise how to plan a Hortonworks Hadoop cluster if my application is not running any MapReduce jobs and I will be loading 250 GB of data into HBase? I understand we need to take care of the points below:
- How to plan my storage? What should I use for the NN and DataNodes: plain disks or RAID?
- How to plan my CPU?
- How to plan my memory?
- How to plan the network bandwidth?
Labels:
- Apache Hadoop
- Apache HBase