Support Questions
Find answers, ask questions, and share your expertise

non-Intel standard - chipsets?

New Contributor

I would love a few thoughtful answers that consider the nature of modern Hadoop: a world where DataNodes running 512 GB of RAM and 1 TB of SSD is not far off. I am asking this on behalf of another interested party, and I am guessing a few others may be interested in this thread as well.

There is a Jira, https://issues.apache.org/jira/browse/HADOOP-12008, indicating that someone was thinking support for SPARC (not Spark) chipsets might be implemented some day. What is the likelihood that Hadoop will be implemented to support and exploit the various chipsets that exist in the wild? Is most of the Hadoop stack implemented in Java, making this moot since Java runs anywhere?

1) Where are the tricky bits that prevent someone from leveraging the latest open SPARC standards and things like Intel AVX or MMX extensions?

2) What is the likely timeline for supporting and exploiting these architectures within the Hadoop ecosystem?

3) Is the idea of leveraging such exotic things out of sync with the future of Hadoop, given its commodity roots?

Thanks community!

1 REPLY

Re: non-Intel standard - chipsets?

Contributor
[ Is most of the Hadoop stack implemented in Java - thus making this moot as Java runs anywhere? ]

Yes. The bulk of Hadoop is written in Java and should be portable. However, as noted in the Jira above, there will be some platform-specific issues.
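As a small illustration of that portability boundary, the JVM itself will tell you which platform it is running on. The property names below are standard Java system properties (not Hadoop-specific); the class name is just for this sketch. Pure-Java Hadoop code runs anywhere the JVM does, but native pieces such as libhadoop must be built to match what these properties report.

```java
// Hypothetical sketch: Java bytecode is portable, but native Hadoop
// components (e.g. libhadoop, compression codecs) must be compiled for
// the platform the JVM reports here.
public class PlatformCheck {
    public static void main(String[] args) {
        // "os.arch" reports e.g. "amd64", "sparcv9", or "aarch64"
        System.out.println("Architecture: " + System.getProperty("os.arch"));
        System.out.println("OS:           " + System.getProperty("os.name"));
        System.out.println("JVM:          " + System.getProperty("java.vm.name"));
    }
}
```

On a SPARC box this would report "sparcv9" while the same compiled class file runs unchanged; only the native libraries need a rebuild.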

[ Where are the tricky bits that prevent someone from leveraging the latest open SPARC standards and things like Intel AVX or MMX extensions? ]

I cannot speak to the SPARC standards. For Intel, you will need to look at how your specific JVM supports SIMD. My question for you is: how would you like to see these accelerations leveraged? As part of Hadoop itself (YARN or HDFS), or as part of a YARN job?
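To make the JVM/SIMD point concrete, here is a sketch of the kind of loop HotSpot's JIT compiler can auto-vectorize into SIMD instructions (such as AVX on x86) when the hardware supports them. Nothing here is Hadoop API; the class and method names are made up for the example. The point is that plain Java code can already benefit from these extensions without platform-specific source.

```java
// Sketch: a simple, stride-1 loop with no cross-iteration dependencies,
// the shape HotSpot's C2 compiler can auto-vectorize with SIMD
// (e.g. AVX) on hardware that supports it. Plain Java, no Hadoop API.
public class SimdFriendly {
    static void add(float[] a, float[] b, float[] out) {
        // Each iteration is independent, so the JIT may process
        // several elements per instruction on SIMD-capable CPUs.
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int n = 1024;
        float[] a = new float[n], b = new float[n], out = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        add(a, b, out);
        System.out.println(out[10]); // 10 + 20 = 30.0
    }
}
```

Whether such a loop actually gets vectorized depends on the JVM version and flags, which is why "look at how your specific JVM supports SIMD" is the right starting point.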

[What is the likely timeline for supporting and exploiting these architectures within the Hadoop ecosystem?]

This is up to the community of users looking for these features.

[Is the idea of leveraging such exotic things out of sync with the future of Hadoop, given its commodity roots?]

While I do believe Hadoop generally trends toward commodity hardware, I do not believe that Hadoop precludes hardware acceleration. GPU computing in CPU-based Hadoop clusters is a good example of extending capabilities with hardware. Also, the platform will continue to take advantage of the growing list of JVM performance features as they become available.
