10-26-2017 12:16 AM
What is the best tool for ingesting sensor data into Cloudera Data Hub? It needs to support the AMQP and MQTT protocols, security access keys, a device registry, and two-way communication.
NiFi+Kafka? Or does Cloudera have its own tool?
11-01-2017 10:45 AM
We use Kafka and Flume together to collect data from third-party servers into a CDH cluster and process it with Spark Streaming, producing Parquet files that are persisted in Hadoop as Apache Hive tables.
It is rudimentary, but it works 95% of the time. For some use cases it is better to just run a batch job and extract data from FTP/NFS file systems, S3, and other sources instead of using MQ systems.
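For the Parquet-into-Hive step, one detail worth sketching is how output directories can be laid out so that Hive sees them as partitions. This is only a minimal sketch: it assumes each record carries an epoch-seconds timestamp and that the table is partitioned by date and hour, neither of which is stated above. The base path and the function name `partition_path` are hypothetical.

```python
from datetime import datetime, timezone


def partition_path(base: str, epoch_seconds: float) -> str:
    """Build a Hive-style partition directory (dt=YYYY-MM-DD/hour=HH) in UTC.

    The dt/hour partition scheme is an assumption made for illustration.
    """
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return f"{base.rstrip('/')}/dt={ts:%Y-%m-%d}/hour={ts:%H}"


# Hypothetical base directory of an external Hive table.
print(partition_path("/data/sensors", 1509533100))
```

Whatever writes the Parquet files (Spark Streaming or a batch job) would place each file under the directory returned for its records, and the partition would then be registered with Hive.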
05-06-2019 04:45 AM
Since the question was asked, the situation has changed. As soon as Hortonworks and Cloudera merged, NiFi became supported by Cloudera.
Shortly afterwards the integrations with CDH were also completed, so NiFi is now a fully supported and integrated component.
Hence the question already contains the answer: please look into NiFi+Kafka for solving this use case.
You may also be interested in MiNiFi, for pushing/evaluating the data on the edge.
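One practical point when bridging MQTT into Kafka: MQTT topics use `/` as a level separator, while Kafka topic names may only contain letters, digits, `.`, `_` and `-` (up to 249 characters). Below is a rough sketch of one possible mapping. It assumes a hypothetical MQTT layout of `sensors/<device_id>/<measurement>`; the function name `mqtt_to_kafka` and the `iot` prefix are made up for illustration and are not part of NiFi or Kafka.

```python
import re
from typing import Tuple

# Characters that are not legal in a Kafka topic name.
_ILLEGAL = re.compile(r"[^A-Za-z0-9._-]")
MAX_TOPIC_LEN = 249  # Kafka's limit on topic name length


def mqtt_to_kafka(mqtt_topic: str, prefix: str = "iot") -> Tuple[str, str]:
    """Map a concrete MQTT topic to (kafka_topic, record_key).

    Assumed (hypothetical) MQTT layout: sensors/<device_id>/<measurement>.
    """
    if "+" in mqtt_topic or "#" in mqtt_topic:
        raise ValueError("MQTT wildcards are only valid in subscriptions")
    levels = mqtt_topic.strip("/").split("/")
    if len(levels) != 3 or not all(levels):
        raise ValueError("expected sensors/<device_id>/<measurement>")
    _, device_id, measurement = levels
    # Replace anything Kafka does not accept with an underscore.
    kafka_topic = _ILLEGAL.sub("_", f"{prefix}.{measurement}")
    if len(kafka_topic) > MAX_TOPIC_LEN:
        raise ValueError("resulting Kafka topic name is too long")
    # The device id becomes the record key.
    return kafka_topic, device_id


print(mqtt_to_kafka("sensors/device42/temperature"))
```

Using the device id as the record key means that, with Kafka's default partitioner, all readings from one device go to the same partition and keep their order. In a NiFi flow the same idea would be expressed through processor properties rather than Python code.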