Stream data from Kafka to HDFS

Hi, 

 

I'm looking for advice on the best way to store streaming data from Kafka in HDFS. I'm currently using Spark Streaming at 30-minute intervals, which creates lots of small files. I have tried writing to Hive to make use of its compaction jobs, but it looks like compaction isn't supported yet when writing from Spark.

 

Any advice would be greatly appreciated.

 
