New Contributor
Posts: 4
Registered: ‎10-22-2013
Accepted Solution

Errors using webhdfs restful api with high availability namenode that failed over

When I use curl to PUT data through the WebHDFS REST API on a cluster whose high-availability NameNode has failed over, the write fails with an error message saying that the NameNode is in standby.

 

Basically, the curl PUT request goes directly to the standby NameNode, which returns a redirect to a DataNode. That redirect names the standby itself as the NameNode rather than the active NameNode, so the follow-up write fails. It would be nice if the redirect included the active NameNode, so that the subsequent write would work even though the original request went to the standby.

 

We are using curl because it is simpler and lighter weight than running a full HDFS client or using Flume. Obviously we can do our own failover on the client, although that requires the client to know the addresses of both NameNodes. Is there a better way to do this using the WebHDFS REST API?

Posts: 1,657
Kudos: 321
Solutions: 260
Registered: ‎07-31-2013

Re: Errors using webhdfs restful api with high availability namenode that failed over

You're right that you'll need to build your own failover on the client side for WebHDFS as it presently lacks HA awareness and support.
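For reference, here is a rough Python sketch of what such client-side failover could look like. It is only an illustration under some assumptions: the function names are made up, the NameNode addresses are placeholders, security is off (no `user.name` or Kerberos handling), and `overwrite=true` is added so that a retry can replace a file left behind by a failed attempt. It tries each NameNode in turn, does the two-step WebHDFS CREATE (PUT to the NameNode, then PUT the data to the DataNode URL from the redirect), and moves on to the next NameNode if either step fails.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    # Returning None stops urllib from following the redirect, so the
    # 307 surfaces as an HTTPError and its Location header can be read.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_put(url, body=None):
    """Send one PUT without following redirects; return (status, location)."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, data=body, method="PUT")
    try:
        with opener.open(req, timeout=30) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def create_with_failover(namenodes, path, data, put=http_put):
    """Write data to path, trying each NameNode until one attempt succeeds.

    namenodes: base URLs such as "http://nn1.example.com:50070" (placeholders).
    Returns the base URL of the NameNode that accepted the write.
    """
    last_error = None
    for nn in namenodes:
        url = f"{nn}/webhdfs/v1{path}?op=CREATE&overwrite=true"
        try:
            # Step 1: ask the NameNode where to write; expect a 307 redirect.
            status, location = put(url)
            if status != 307 or not location:
                last_error = f"{nn}: HTTP {status}"
                continue
            # Step 2: send the data to the DataNode URL from the redirect.
            status, _ = put(location, data)
            if status == 201:
                return nn
            last_error = f"{nn}: DataNode returned HTTP {status}"
        except OSError as err:  # connection refused, timeout, DNS failure
            last_error = f"{nn}: {err}"
    raise RuntimeError(f"write failed on every NameNode ({last_error})")


# Example call (host names are hypothetical):
# create_with_failover(
#     ["http://nn1.example.com:50070", "http://nn2.example.com:50070"],
#     "/tmp/example.txt",
#     b"hello\n",
# )
```

The client still has to be configured with both NameNode addresses, which is the drawback you mentioned.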

An easier alternative is to set up and use HttpFS as the REST gateway, which is HA-aware and offers the exact same WebHDFS API and functionality.