Hi,
I'm looking for advice on the best way to store streaming data from Kafka in HDFS. I'm currently using Spark Streaming with 30-minute batch intervals, which creates lots of small files. I tried writing to Hive to make use of its compaction jobs, but it looks like this isn't supported yet when writing from Spark.
Any advice would be greatly appreciated.