Can Flume be used with HBase? How?

Hi,

Can anyone please explain whether Flume can be used with HBase, and how? If possible, please include an example to help me understand.

1 ACCEPTED SOLUTION

Re: Can Flume be used with HBase? How?

I found the following answer:

Apache Flume can be used with HBase through one of its two HBase sinks:

  • HBaseSink (org.apache.flume.sink.hbase.HBaseSink) supports secure HBase clusters and the new HBase IPC that was introduced in HBase 0.96.
  • AsyncHBaseSink (org.apache.flume.sink.hbase.AsyncHBaseSink) performs better than HBaseSink because it makes non-blocking calls to HBase.
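
To make the choice between the two sinks concrete, here is a minimal agent configuration sketch modeled on the examples in the Flume User Guide. The agent, channel, and sink names (a1, c1, k1) and the table and column family (foo_table, bar_cf) are placeholders, the table is assumed to already exist in HBase, and the source definition is left out.

```properties
# Placeholder agent "a1" with one memory channel and one HBase sink.
a1.channels = c1
a1.sinks = k1

a1.channels.c1.type = memory

# HBaseSink: sink type "hbase" with a serializer that implements HbaseEventSerializer.
a1.sinks.k1.type = hbase
a1.sinks.k1.table = foo_table
a1.sinks.k1.columnFamily = bar_cf
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
a1.sinks.k1.channel = c1

# To try AsyncHBaseSink instead, swap the type and serializer lines for:
# a1.sinks.k1.type = asynchbase
# a1.sinks.k1.serializer = org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
```

Check the property names against the User Guide for your Flume version before relying on this.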

How HBaseSink works:

In HBaseSink, each Flume Event is converted into HBase Puts and/or Increments. The serializer implements HbaseEventSerializer and is instantiated when the sink starts. For every event, the sink calls the serializer's initialize method; the serializer then translates the Flume Event into the Puts and Increments (returned by its getActions and getIncrements methods) that are sent to the HBase cluster.
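
The per-event call pattern can be sketched in plain Java. This is only an illustration: SketchSerializer and PayloadSerializer are hypothetical stand-ins, not the real Flume interfaces, and strings stand in for HBase Put and Increment objects so the example runs without Flume or HBase on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class HBaseSinkLifecycleSketch {

    /** Simplified stand-in for the serializer contract; NOT the real Flume interface. */
    interface SketchSerializer {
        void initialize(String eventBody, String columnFamily); // called for every event
        List<String> getActions();     // "puts" derived from the current event
        List<String> getIncrements();  // "increments" derived from the current event
        void close();                  // called when the sink stops
    }

    /** Hypothetical serializer: one put per event plus one counter increment. */
    static class PayloadSerializer implements SketchSerializer {
        private String body;
        private String columnFamily;
        private int rowCounter = 0;

        @Override
        public void initialize(String eventBody, String columnFamily) {
            this.body = eventBody;
            this.columnFamily = columnFamily;
        }

        @Override
        public List<String> getActions() {
            String rowKey = "row-" + (rowCounter++);
            return List.of("PUT " + rowKey + " " + columnFamily + ":payload=" + body);
        }

        @Override
        public List<String> getIncrements() {
            return List.of("INCR eventCount " + columnFamily + ":count+1");
        }

        @Override
        public void close() {
            body = null;
        }
    }

    /** Mimics the sink: initialize per event, collect puts and increments, close at stop. */
    public static List<String> process(List<String> eventBodies, String columnFamily) {
        SketchSerializer serializer = new PayloadSerializer(); // created when the sink starts
        List<String> sentToHBase = new ArrayList<>();
        for (String body : eventBodies) {
            serializer.initialize(body, columnFamily);
            sentToHBase.addAll(serializer.getActions());
            sentToHBase.addAll(serializer.getIncrements());
        }
        serializer.close();
        return sentToHBase;
    }

    public static void main(String[] args) {
        process(List.of("login", "logout"), "cf").forEach(System.out::println);
    }
}
```

A real serializer would implement org.apache.flume.sink.hbase.HbaseEventSerializer and return HBase client objects rather than strings.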

How AsyncHBaseSink works:

With AsyncHBaseSink, the serializer implements AsyncHbaseEventSerializer. The sink calls the initialize method only once, when it starts. For every event, the sink invokes setEvent and then calls getIncrements and getActions, much as HBaseSink does. When the sink stops, it calls the serializer's cleanUp method.
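
The difference in lifecycle (initialize once, setEvent per event, cleanUp at shutdown) can be traced with a small Java sketch. SketchAsyncSerializer and LoggingSerializer are hypothetical stand-ins rather than the real Flume types, the table and column family names are placeholders, and the non-blocking I/O itself is not modeled; the sketch only records the order of calls.

```java
import java.util.ArrayList;
import java.util.List;

public class AsyncSinkLifecycleSketch {

    /** Simplified stand-in for the async serializer contract; NOT the real Flume interface. */
    interface SketchAsyncSerializer {
        void initialize(String table, String columnFamily); // once, when the sink starts
        void setEvent(String eventBody);                    // once per event
        List<String> getActions();
        List<String> getIncrements();
        void cleanUp();                                     // once, when the sink stops
    }

    /** Hypothetical serializer that logs every lifecycle call it receives. */
    static class LoggingSerializer implements SketchAsyncSerializer {
        final List<String> calls = new ArrayList<>();
        private String table;
        private String columnFamily;
        private String body;

        @Override
        public void initialize(String table, String columnFamily) {
            this.table = table;
            this.columnFamily = columnFamily;
            calls.add("initialize");
        }

        @Override
        public void setEvent(String eventBody) {
            this.body = eventBody;
            calls.add("setEvent");
        }

        @Override
        public List<String> getActions() {
            calls.add("getActions");
            return List.of("PUT " + table + " " + columnFamily + ":payload=" + body);
        }

        @Override
        public List<String> getIncrements() {
            calls.add("getIncrements");
            return List.of(); // this sketch produces no increments
        }

        @Override
        public void cleanUp() {
            calls.add("cleanUp");
        }
    }

    /** Mimics the sink and returns the order in which serializer methods were called. */
    public static List<String> callOrder(List<String> eventBodies) {
        LoggingSerializer serializer = new LoggingSerializer();
        serializer.initialize("events", "cf"); // placeholder table and column family
        for (String body : eventBodies) {
            serializer.setEvent(body);
            serializer.getActions();
            serializer.getIncrements();
        }
        serializer.cleanUp();
        return serializer.calls;
    }

    public static void main(String[] args) {
        System.out.println(callOrder(List.of("login", "logout")));
    }
}
```

Because initialize runs only once here, any per-event state has to be reset in setEvent, which is the main practical difference from the HBaseSink serializer.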

5 REPLIES

Re: Can Flume be used with HBase? How?

The Apache document "Streaming data into Apache HBase using Apache Flume" is a good reference on this.

Re: Can Flume be used with HBase? How?

@Rohan Pednekar, thanks for sharing this link.

Re: Can Flume be used with HBase? How?

Here's info on both HBase sinks in Flume, along with examples: https://flume.apache.org/FlumeUserGuide.html#hbasesinks

Alternatively, if you're using Phoenix, there's a connector for that: https://phoenix.apache.org/flume.html

Re: Can Flume be used with HBase? How?

@Artem Ervits, thanks for sharing this link.