Champion Alumni
Posts: 196
Registered: ‎11-18-2014
Accepted Solution

Flume - HDFS HA

Hello,

 

I'm looking for the recommended configuration for the Flume HDFS sink when HDFS is running in HA mode.

Each time we restart the cluster, or the active NameNode fails, the active NameNode changes and Flume fails because it is still querying what is now the standby node.

 

Thank you!

 

Alina

GHERMAN Alina
Posts: 1,565
Kudos: 287
Solutions: 239
Registered: ‎07-31-2013

Re: Flume - HDFS HA

What form of HDFS path are you configuring in your Flume agent configs?

For HA, you must use the HA service name, such as
hdfs://nameservice1/user/foo instead of
hdfs://namenode-host:8020/user/foo. This will protect your agents from
failures during HA failovers.
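
As a rough illustration of the difference, a sink definition along these lines would address the HA nameservice rather than a single NameNode host. This is only a sketch: the agent, sink, and channel names (agent1, hdfsSink, memChannel) are made up for the example, and it assumes the Flume host has an hdfs-site.xml on its classpath that defines nameservice1.

```properties
# Hypothetical agent/sink/channel names; only the hdfs.path form matters here.
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.channel = memChannel

# Use the logical HA nameservice, not a NameNode host and port.
agent1.sinks.hdfsSink.hdfs.path = hdfs://nameservice1/user/foo

# Avoid this form under HA: it pins the sink to one NameNode.
# agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode-host:8020/user/foo
```

With the nameservice form, the HDFS client inside Flume resolves the currently active NameNode from the client configuration, so a failover should not require changing the agent config.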

Backline Customer Operations Engineer
Explorer
Posts: 16
Registered: ‎01-11-2017

Re: Flume - HDFS HA

This is not useful for remote HDFS clusters... Is it possible to use WebHDFS from Flume?
Posts: 1,565
Kudos: 287
Solutions: 239
Registered: ‎07-31-2013

Re: Flume - HDFS HA

For remote HDFS clusters, make sure the configuration needed to resolve the remote nameservice is defined in your HDFS Gateway hdfs-site.xml. Flume can then use the remote nameservice name in its HDFS path. See http://community.cloudera.com/t5/Storage-Random-Access-HDFS/distcp-with-same-nameservicename/m-p/493... for more details on how to define this.
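
For illustration only, the client-side properties usually involved in resolving an extra HA nameservice look roughly like the fragment below. This is a sketch based on the standard HDFS HA client settings, not a copy of the linked thread: the nameservice name remotens, the NameNode IDs nn1/nn2, and the hostnames are all hypothetical, and it assumes the local nameservice is called nameservice1.

```xml
<!-- Gateway/client hdfs-site.xml fragment; "remotens" and the hosts are hypothetical. -->
<property>
  <name>dfs.nameservices</name>
  <value>nameservice1,remotens</value>
</property>
<property>
  <name>dfs.ha.namenodes.remotens</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.remotens.nn1</name>
  <value>remote-nn1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.remotens.nn2</name>
  <value>remote-nn2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.remotens</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Under those assumptions, the Flume sink path would take the form hdfs://remotens/user/foo.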
Backline Customer Operations Engineer