I've noticed that component versions in CDH always seem to be behind those in HDP. For example:
CDH Spark 1.6.0 vs HDP Spark 1.6.2
CDH Solr 4.xx vs HDP Solr 5.5.2
CDH Hive 1.1.0 vs HDP Hive 1.1.2
Why are the CDH versions always behind HDP's?
We are happy to answer your question, although we have to disagree with the premise that the Apache component versions shipping in CDH are "always behind" those in HDP. For example, in their current releases, CDH ships HBase and Flume with higher version numbers than HDP does, and Spark shipped in CDH before it shipped in HDP at all.
The response has two parts:
1. In CDH, component version numbers alone don't tell the full story of what's in the release. We often backport crucial bug fixes and new features from the Apache repositories into CDH releases before they are available even to users of stock Apache releases. As a result, CDH users get curated innovations from the community on a steady, predictable schedule, with backward compatibility assured and much easier upgrades. (These backports are always documented in the CDH release notes.)
2. Cloudera takes QA extremely seriously. For every CDH update, we complement the unit testing done by each upstream community with our own extensive system, integration, and endurance testing, and then run it internally with real workloads pre-release to catch any stray bugs. (Read more about this process here.) That effort takes time. If we were to simply "pass along" upstream releases very quickly without this further vetting, we wouldn't be adding much value -- why not just use stock Apache releases, then?
As for your question about unsupported features: we only provide support for functionality that we know is production-ready for customers. If we were instead to support that functionality while warning customers not to use it, we'd be sending a mixed and confusing message.