Created 07-22-2024 11:37 AM
Requirement:
Building an open source Hadoop cluster without Ambari (it does not support RHEL 9) and without Cloudera Manager (my company is not ready for it).
Server Information:
I have 5 nodes running RHEL 9: 2 masters and 3 slaves.
Apache services to be installed, as per my company's requirements:
- Hadoop
- HBase
- Zookeeper
Online Resources:
I'm overwhelmed by them because it's tough for me to figure out how to interlink everything.
Needed solution:
Can someone please share standard, up-to-date resources that I can follow to set up my cluster and make these services talk to each other properly?
Where do I start? I know it's Hadoop first, but after that I have no clue.
I did all the coding in my previous project, and now this admin part is a pain for me.
Please help, it would be really appreciated.
Thanks,
Srikanth
Created 07-23-2024 10:52 PM
You can follow the steps below if you do not need CM or Ambari.
Step 1: Install Hadoop:
1. Download Hadoop: Download the latest stable release of Hadoop from the Apache Hadoop website and extract it:
tar -xzf hadoop-3.3.4.tar.gz
sudo mv hadoop-3.3.4 /usr/local/hadoop
2. Configure Hadoop Environment Variables: Add the following lines to your .bashrc or .profile file.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
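Hadoop also needs JAVA_HOME set for its daemons, usually in hadoop-env.sh. A minimal sketch, assuming an OpenJDK 11 install at a typical RHEL path (adjust the path to your JDK):
echo 'export JAVA_HOME=/usr/lib/jvm/java-11-openjdk' | sudo tee -a $HADOOP_HOME/etc/hadoop/hadoop-env.sh   # JDK path is an assumption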
3. Edit Configuration Files: Edit the core configuration files in $HADOOP_CONF_DIR (core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml), as shown below.
core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master-node:9000</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///usr/local/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///usr/local/hadoop/hdfs/datanode</value>
  </property>
</configuration>
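The name and data directories above must exist and be writable by the account that runs the Hadoop daemons. A minimal sketch; the dedicated "hadoop" service user is an assumption, use whichever account you run the daemons as:
sudo mkdir -p /usr/local/hadoop/hdfs/namenode /usr/local/hadoop/hdfs/datanode
sudo chown -R hadoop:hadoop /usr/local/hadoop/hdfs   # "hadoop" user is an assumption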
mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml:
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master-node:8032</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
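The start scripts also need to know which hosts run the DataNode/NodeManager daemons, and they reach those hosts over passwordless SSH. A minimal sketch, assuming the worker hostnames are slave-node1 to slave-node3 (hostnames are assumptions following the naming above):
sudo tee $HADOOP_CONF_DIR/workers <<EOF
slave-node1
slave-node2
slave-node3
EOF
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # passwordless SSH from the master to every worker
ssh-copy-id slave-node1                    # repeat for slave-node2 and slave-node3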
4. Format the NameNode (run once, on the NameNode host only):
hdfs namenode -format
5. Start Hadoop Services (run from the master node):
start-dfs.sh
start-yarn.sh
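To confirm the daemons started, jps on each node should list them:
jps   # on the master: NameNode, ResourceManager (and typically SecondaryNameNode)
jps   # on each worker: DataNode, NodeManager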
Step 2: Install Zookeeper:
1. Download and Extract Zookeeper: https://downloads.apache.org/zookeeper
tar -xzf apache-zookeeper-3.8.1-bin.tar.gz
sudo mv apache-zookeeper-3.8.1-bin /usr/local/zookeeper
2. Configure Zookeeper: Create a configuration file at /usr/local/zookeeper/conf/zoo.cfg on each of the three ZooKeeper nodes:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=master-node1:2888:3888
server.2=master-node2:2888:3888
server.3=slave-node1:2888:3888
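Each server also needs a myid file in dataDir whose number matches its server.N line in zoo.cfg:
sudo mkdir -p /var/lib/zookeeper
sudo chown -R "$(whoami)" /var/lib/zookeeper   # or whichever account runs ZooKeeper
echo 1 | sudo tee /var/lib/zookeeper/myid      # use 2 on master-node2 and 3 on slave-node1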
3. Start Zookeeper:
/usr/local/zookeeper/bin/zkServer.sh start
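Once it is running on all three nodes, you can verify that the quorum formed:
/usr/local/zookeeper/bin/zkServer.sh status   # one node should report "leader", the others "follower"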
Step 3: Install HBase:
1. Download HBase: Download the latest stable release of HBase from the Apache HBase website.
tar -xzf hbase-2.4.16-bin.tar.gz
sudo mv hbase-2.4.16 /usr/local/hbase
2. Configure HBase: Edit /usr/local/hbase/conf/hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master-node:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master-node1,master-node2,slave-node1</value>
  </property>
</configuration>
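For a fully distributed cluster that uses the external ZooKeeper from Step 2, hbase.cluster.distributed should also be set to true in hbase-site.xml, HBase should be told not to manage its own ZooKeeper, and the region server hosts go in conf/regionservers. A minimal sketch; the hostnames follow the naming above and are assumptions:
echo 'export HBASE_MANAGES_ZK=false' | sudo tee -a /usr/local/hbase/conf/hbase-env.sh   # use the external ZooKeeper quorum
sudo tee /usr/local/hbase/conf/regionservers <<EOF
slave-node1
slave-node2
slave-node3
EOF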
3. Start HBase:
/usr/local/hbase/bin/start-hbase.sh
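A quick way to confirm HBase can reach both HDFS and ZooKeeper:
echo "status" | /usr/local/hbase/bin/hbase shell   # should report one active master and the live region servers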
Step 4: Verify Installation:
Check the services using their web interfaces: the NameNode UI on port 9870, the ResourceManager UI on port 8088, and the HBase Master UI on port 16010 (the default ports for these versions).
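You can also run a quick smoke test from the command line:
hdfs dfsadmin -report          # should list the worker DataNodes as live
hdfs dfs -mkdir -p /tmp/smoke-test
hdfs dfs -ls /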
Regards,
Chethan YM
Created 07-25-2024 12:23 PM
Thanks, I really appreciate you sharing the info.