
Sequence of actions before pushing a patch at OS level | Need suggestions


The following is the procedure as per my knowledge. Kindly help me improve it.

Steps before the patch is pushed

Note : The following checks are not mandatory and can be skipped at the administrator's discretion.

"hdfs" is used to represent the HDFS Service user. If you are using another name for your

Service users, you need to substitute your Service user name in each of the su commands.

Important : If the cluster is secured with Kerberos, you need valid Kerberos credentials for hdfs user access.
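On a Kerberized cluster, a ticket for the hdfs user can be obtained before running the checks below. A minimal sketch, assuming the default HDP headless keytab path and an example principal (both are assumptions and will differ per cluster):

su - hdfs -c "kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM"
su - hdfs -c "klist"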

1. HDFS health checks :

i) Verify the HDFS file system health :

su -hdfs -c "hdfs fsck / -files -blocks -locations > dfs-new-fsck-1.log"

You should see feedback that the filesystem under path '/' is HEALTHY.
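A quick way to confirm this from the log (the exact wording of the fsck summary line can vary slightly between Hadoop versions):

grep "is HEALTHY" dfs-new-fsck-1.log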

ii) Capture the HDFS namespace listing and the DataNode report :

a. List the files and directories :

su -hdfs -c "hdfs dfs -ls -R / > dfs-new-lsr-1.log"

b. Open dfs-new-lsr-1.log and confirm that you can see the file and directory listing in the namespace.

c. Run the report command to create a list of DataNodes in the cluster.

su -hdfs -c "hdfs dfsadmin -report > dfs-new-report-1.log"

d. Open dfs-new-report-1.log and validate the admin report.

iii) Compare the namespace reports from before and after the upgrade, and verify that user files still exist after the upgrade.

The file names are listed below:

dfs-old-fsck-1.log <--> dfs-new-fsck-1.log

dfs-old-lsr-1.log <--> dfs-new-lsr-1.log

Note : You must do this comparison manually to catch all errors.
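A plain diff of the old and new report pairs can help focus the manual review, although it does not replace it (the dfs-old-* file names are assumed to be the ones produced before the patch):

diff dfs-old-fsck-1.log dfs-new-fsck-1.log
diff dfs-old-lsr-1.log dfs-new-lsr-1.log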

iv) From the NameNode web UI, determine whether all DataNodes are up and running :

http://<namenode>:<namenodeport>

v) If you are on a highly available HDFS cluster, go to the standby NameNode web UI to see if all DataNodes are up and running :

http://<standbynamenode>:<namenodeport>
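As an optional command-line cross-check of the web UI (the wording of the summary lines can vary between Hadoop versions), the dfsadmin report also counts live and dead DataNodes:

su - hdfs -c "hdfs dfsadmin -report" | grep -iE 'live datanodes|dead datanodes'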

vi) Verify that the hdfs user has read and write access to HDFS :

hdfs dfs -put [local file] [hdfs path]

hdfs dfs -cat [hdfs path]
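A concrete run with hypothetical file names (/tmp/patch_rw_test.txt on the local filesystem and /tmp/patch_rw_test in HDFS are placeholders only):

echo "patch pre-check" > /tmp/patch_rw_test.txt
su - hdfs -c "hdfs dfs -put /tmp/patch_rw_test.txt /tmp/patch_rw_test"
su - hdfs -c "hdfs dfs -cat /tmp/patch_rw_test"
su - hdfs -c "hdfs dfs -rm -skipTrash /tmp/patch_rw_test"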

2. Make sure to fix all corrupt, missing, and under-replicated blocks in the cluster before proceeding. All blocks should have healthy replicas.

hdfs fsck /

If there are any corrupt, missing, or under-replicated blocks :

su - hdfs

The following can be executed as a script to re-replicate the under-replicated files :

hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files

for hdfsfile in $(cat /tmp/under_replicated_files); do
  echo "Fixing $hdfsfile :"
  hadoop fs -setrep 3 "$hdfsfile"
done
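Corrupt or missing blocks cannot be fixed by raising replication; the affected files have to be identified first and then restored from a source copy or removed, which is an administrator decision. A sketch of the identification step :

su - hdfs -c "hdfs fsck / -list-corruptfileblocks"

(hdfs fsck also accepts a -delete option for removing corrupt files, but it should only be used once the data is confirmed to be recoverable elsewhere.)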

3. Version checks and memory availability have to be noted before rolling out the patch (a command sketch follows this list) :

i) Java version

ii) Python version ( 2.7.x )

iii) OpenSSL version ( v1.01, build 16 or later )

iv) Other software requirements : scp, curl, tar, unzip, wget, yum, rpm

v) Check for available memory on all master and worker nodes :

free -m
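A minimal sketch of the corresponding checks, to be run on every node (an RPM-based OS is assumed, matching the yum/rpm requirement above) :

java -version
python -V
openssl version
for tool in scp curl tar unzip wget yum rpm; do which $tool >/dev/null || echo "$tool is missing"; done
free -m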

Sequence Of Actions :

1. Stop all third-party software.

2. Stop PostgreSQL and MySQL (in case an external database service is being used; see the sketch after this list).

3. Stop all services from the Ambari UI.

4. Stop the Ambari agent on all nodes.

ambari-agent stop

5. Stop the Ambari server service as well.

ambari-server stop

6. The patch can be deployed now.

7. Start all third-party services.

8. Start PostgreSQL and MySQL (in case an external database service is being used).

9. Start the Ambari server service.

sudo su

ambari-server status

ambari-server start

10. Start the Ambari agent on all nodes.

ambari-agent status

ambari-agent start

11. Start all services from the Ambari UI.
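For steps 2 and 8, a minimal sketch of stopping and starting the external databases on a systemd-based OS (the service names postgresql and mysqld are assumptions and can differ per distribution and installation):

systemctl stop postgresql    # before deploying the patch
systemctl stop mysqld

systemctl start postgresql   # after deploying the patch
systemctl start mysqld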

Note : There is no risk of data loss in the cluster during patching.
