Hi, I am new to Cloudera Envelope. I noticed that it does not support HBase as an output. My understanding is that Kafka provides at-least-once delivery, and we have been using HBase to handle the resulting duplicates. Is writing to HBase via Hive the best approach?
Also, is there any Kafka offset management in this framework?
The next version of Envelope that we will be releasing on Cloudera Labs will have both an HBase output and Kafka offset management built in.
In the meantime, for HBase you could try writing to it in Envelope via a Hive output, although we haven't tested that so I can't say for sure that it will work.
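For context on why HBase helps with the duplicates mentioned in the question, here is a minimal sketch of the idea, not Envelope code. It assumes each record is keyed by its Kafka topic, partition, and offset, and it uses a plain dict as a stand-in for an HBase table (the names `row_key` and `put` are made up for illustration). Writing the same key twice overwrites rather than appends, so a redelivered message leaves a single row.

```python
# A dict stands in for an HBase table here: row key -> value.
# This is NOT a real HBase client, only an illustration of keyed upserts.

def row_key(topic, partition, offset):
    # Zero-pad the numbers so keys sort in offset order as strings.
    return f"{topic}:{partition:05d}:{offset:020d}"

def put(table, record):
    # Writing to an existing key replaces the value, so a redelivered
    # record does not create a second row.
    key = row_key(record["topic"], record["partition"], record["offset"])
    table[key] = record["value"]

table = {}
delivered = [
    {"topic": "events", "partition": 0, "offset": 41, "value": "login"},
    {"topic": "events", "partition": 0, "offset": 42, "value": "click"},
    # Same message delivered again after a retry (at-least-once).
    {"topic": "events", "partition": 0, "offset": 42, "value": "click"},
]
for record in delivered:
    put(table, record)

print(len(table))  # number of distinct rows after the duplicate delivery
```

Keying on Kafka coordinates only absorbs redelivery on the consumer side. If the producer can send the same event twice, a key taken from the event itself would be needed instead.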
In the meantime, is there a way to manage offsets ourselves? When would the next version be made available for usage in production systems?
We are trying to decide whether it makes sense to use Envelope for a production system as is, or to write something ourselves until Envelope is ready and meets our use case.
You would need to wait for the next version to do offset management beyond just "read from the start" or "read from the end".
We are hoping to get the next version out over the next month or so but don't have a firm date. If you need it straight away then it would likely be easier to make your own enhancements to the Envelope code rather than start from scratch.
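If you do add offset handling yourself, the general pattern is to store the next offset to read only after the batch's output has been written, and to resume from the stored value on restart. The sketch below illustrates that pattern with in-memory stand-ins: a list for a Kafka partition, and dicts for the output and the offset store. The function `run_batch` and its parameters are hypothetical and are not part of Envelope or the Kafka client API.

```python
def run_batch(log, sink, offsets, partition, batch_size, fail_before_commit=False):
    # Resume from the stored offset, or from the start if none is stored.
    start = offsets.get(partition, 0)
    batch = log[start:start + batch_size]
    for i, value in enumerate(batch):
        # Output keyed by (partition, offset), so a replay overwrites.
        sink[(partition, start + i)] = value
    if fail_before_commit:
        return  # simulate a crash after writing but before storing offsets
    # Store the next offset to read only once the output is written.
    offsets[partition] = start + len(batch)

log = ["a", "b", "c", "d", "e"]  # stand-in for one Kafka partition
sink = {}                        # stand-in for the output table
offsets = {}                     # stand-in for a durable offset store

run_batch(log, sink, offsets, 0, 2)                           # offsets 0-1
run_batch(log, sink, offsets, 0, 2, fail_before_commit=True)  # 2-3 written, not stored
run_batch(log, sink, offsets, 0, 2)                           # 2-3 replayed
run_batch(log, sink, offsets, 0, 2)                           # offset 4

print(offsets, len(sink))
```

The failed batch is read again on the next run, which is the at-least-once behaviour; because the output is keyed by offset, the replay does not produce duplicate rows.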
Fair question, but I think only you can decide that for yourself. Cloudera Labs projects are not supported, but they can be a really good way to get a big head start compared with writing your own implementation from scratch. Because they're not supported, you would generally need to do your own testing to make yourself comfortable that they meet your production quality standards.