I recently downloaded the HDP 2.4 Sandbox for the Spark certification. Everything matches the datasheet except pySpark, which is 2.6.6, while the datasheet mentions 2.7.6. Do I need to upgrade pySpark to 2.7.6 in my HDP Sandbox?
HDP 2.4 comes with Spark 1.6, so the PySpark version is also 1.6.
The datasheet is referring to the Python version, not the PySpark version.
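To see the two version numbers side by side, something like the following minimal sketch can be run with the sandbox's Python. The helper names `python_version` and `pyspark_version` are made up for illustration, and the sketch assumes that `pyspark` may not be importable from a plain interpreter and that older releases may not expose `__version__`, so both cases are handled.

```python
import sys


def python_version():
    """Return the interpreter version as a string such as '2.6.6'."""
    return "%d.%d.%d" % tuple(sys.version_info[:3])


def pyspark_version():
    """Return the PySpark version string, or None if PySpark is not importable."""
    try:
        import pyspark
    except ImportError:
        return None
    # Assumption: older PySpark releases may lack __version__, so fall back.
    return getattr(pyspark, "__version__", "unknown")


if __name__ == "__main__":
    print("Python interpreter: %s" % python_version())
    print("PySpark: %s" % pyspark_version())
```

Inside the `pyspark` shell, `sc.version` also reports the Spark version, which is a separate number from the Python one.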
Most probably the OS default Python on HDP 2.4 is 2.6.x, in which case you need to install the newer 2.7.6 version manually. I recommend installing it in a separate folder to avoid problems with any other services that already use the 2.6.x version.
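Once a second Python lives in its own folder, PySpark can be pointed at it through the `PYSPARK_PYTHON` environment variable without touching the OS default. Here is a rough sketch of that idea. The helper `env_with_python` and the path `/opt/python2.7/bin/python` are hypothetical, so substitute wherever the new interpreter was actually installed.

```python
import os


def env_with_python(interpreter_path, base_env=None):
    """Return a copy of the environment with PYSPARK_PYTHON set.

    The original environment is left unchanged, so other services
    keep using the OS default Python.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYSPARK_PYTHON"] = interpreter_path
    return env


if __name__ == "__main__":
    # Hypothetical install location; adjust to the real one.
    env = env_with_python("/opt/python2.7/bin/python")
    print("PYSPARK_PYTHON=%s" % env["PYSPARK_PYTHON"])
```

The resulting dictionary could then be passed as the `env` argument to `subprocess.call` when launching `pyspark` or `spark-submit`, which keeps the change limited to that one process.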
I usually install Anaconda Python, which is easy to set up and isolated from the OS one. Here are some links that show how to do this: