Are there any plans to include Tachyon (Alluxio) in CDH?

I recently learned that Spark 2.0 will include Structured Streaming, which works with unbounded (continuously growing) DataFrames/Datasets. As I understand it, the data is kept in memory and spilled to disk through Tachyon, which can persist it to any of several underlying storage systems. The benefit is a further layer of abstraction over how data is stored: that is handled for us, so we can focus on the data structures and processing. Data formats, folder layouts, etc. are no longer a concern.

If anyone knows of plans to include Tachyon (Alluxio) in CDH, please let me know.

Thanks,

Ben