Understanding Apache Hadoop releases

Hi,

Any idea how Apache Hadoop versioning works? The Hadoop homepage on the Apache site lists 2.7.2 as the latest stable release (I believe 2.7.1 is part of HDP 2.4.2), but that was released in Jan 2016, and 2.6.4 was released after it, in Feb 2016. Which is the current branch to follow, and when should the 2.6 release be used?

Any idea when 2.8.0 will be released?

Thanks

1 ACCEPTED SOLUTION

Hi @bigdata.neophyte, the Apache Hadoop community is actively maintaining two stable release lines:

  1. 2.7.x - The latest release in this line is 2.7.2. There should be an RC for 2.7.3 out later this month. The git branch for release 2.7.3 is branch-2.7.3. The next maintenance release on this line will be 2.7.4 and it is currently tracked by branch-2.7.
  2. 2.6.x - The latest release in this line is 2.6.4. There may be a 2.6.5 (off branch-2.6) but no release manager is actively driving it right now (any Hadoop committer can be a release manager).
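
The branch naming in the list above can be sketched as a small shell helper. This is only an illustration: `maintenance_branch` and `list_release_branches` are hypothetical names, and the repository URL (the GitHub mirror of Apache Hadoop) is an assumption that does not come from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the maintenance branch for a release,
# assuming the branch-<major>.<minor> naming seen above (branch-2.7, branch-2.6).
maintenance_branch() {
  local version="$1"            # e.g. 2.7.4
  local major="${version%%.*}"  # text before the first dot, e.g. 2
  local rest="${version#*.}"    # text after the first dot, e.g. 7.4
  local minor="${rest%%.*}"     # text before the next dot, e.g. 7
  echo "branch-${major}.${minor}"
}

# Hypothetical helper: list matching branch heads on a remote without cloning.
# The URL is an assumption; substitute whichever Hadoop remote you use.
list_release_branches() {
  local line="$1"               # e.g. 2.7
  git ls-remote --heads https://github.com/apache/hadoop.git "branch-${line}*"
}
```

For example, `maintenance_branch 2.7.4` prints `branch-2.7`, and `list_release_branches 2.7` would show both the maintenance branch and any per-release branches such as branch-2.7.3, provided the remote is reachable.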

The 2.8.0 release has been significantly delayed as community effort was diverted to stabilizing 2.6.x and 2.7.x. It is planned but there is no timetable for the release.

Also, the common-dev mailing list at hadoop.apache.org would be a good place for this question, as you are likely to get responses from release managers there.

REPLIES

A quick way to determine specific versions of core Hadoop (and all the components making up HDP) is to visit the particular HDP version's release notes under http://docs.hortonworks.com.

When you are on a box itself, you can follow the "cookie crumbs" shown below, which confirm that HDP 2.4.2 uses Hadoop 2.7.1, as you identified above. Hint: look at the long jar name, which contains the Apache version number followed by the HDP version number.

[root@ip-172-30-0-91 hdp]# pwd
/usr/hdp
[root@ip-172-30-0-91 hdp]# ls
2.4.2.0-258  current
[root@ip-172-30-0-91 hdp]# cd current/hadoop-hdfs-client
[root@ip-172-30-0-91 hadoop-hdfs-client]# ls hadoop-hdfs-2*
hadoop-hdfs-2.7.1.2.4.2.0-258.jar
hadoop-hdfs-2.7.1.2.4.2.0-258-tests.jar
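
The hint about the jar name can be turned into a rough sketch that splits the two version numbers apart. `split_hdp_jar_version` is a hypothetical helper, and the pattern assumes, based only on the listing above, that the Apache version has three numeric fields and the HDP version has four fields plus a build number; other naming schemes will simply produce no output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split an HDP jar name into Apache and HDP versions.
# Assumed layout (taken from the listing above, not from any official spec):
#   <component>-<a.b.c>.<w.x.y.z>-<build>[-tests].jar
split_hdp_jar_version() {
  local name="${1##*/}"   # drop any leading directory
  name="${name%.jar}"     # drop the .jar extension
  name="${name%-tests}"   # drop the -tests suffix, if present
  # Print "apache=<a.b.c> hdp=<w.x.y.z-build>", or nothing if the name does not match.
  printf '%s\n' "$name" |
    sed -nE 's/^.*-([0-9]+\.[0-9]+\.[0-9]+)\.([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+)$/apache=\1 hdp=\2/p'
}
```

Running `split_hdp_jar_version hadoop-hdfs-2.7.1.2.4.2.0-258.jar` prints `apache=2.7.1 hdp=2.4.2.0-258`, which matches the reading of the jar name described above.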

As for a Hadoop 2.8 release date, I'm certainly not the person who can comment on that, but you can go to https://issues.apache.org/jira/browse/HADOOP/fixforversion/12329058/ to see all the JIRAs that are currently slated to be part of it.

Good luck!

Thanks @Arpit Agarwal for your response. Is there any specific reason why two branches are still maintained? Are they significantly different from one another, and hence need to be tracked and maintained separately? I presume HDP and many commercial distributions follow the 2.7.x lineage, so I am wondering who is using the 2.6.x series.

Thanks in Advance.

It's common for mature software products to have parallel maintenance lines. Enterprises often delay upgrades due to regulatory/certification requirements or other reasons.