Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Errors using the WebHDFS RESTful API with a high-availability NameNode that has failed over

Explorer

When using curl to put data via the WebHDFS RESTful API into a cluster whose high-availability NameNode has failed over, the write fails with an error message saying that the NameNode is in standby.

Basically, the curl PUT request goes directly to the standby NameNode, which returns a referral to a DataNode that names the standby itself as the NameNode rather than the active NameNode. It would be nice if the referral included the active NameNode, so that the subsequent write would work even though the original request went to the standby node.

We are using curl because it is simpler and lighter weight than running a full HDFS client or using Flume. Obviously we can do our own failover on the client, although that requires knowledge of the NameNodes. Is there a better way to do this using the WebHDFS RESTful API?

1 ACCEPTED SOLUTION

Mentor
You're right that you'll need to build your own failover on the client side for WebHDFS as it presently lacks HA awareness and support.

An easier alternative is to set up and use HttpFS as the REST gateway, which is HA-aware and offers the exact same WebHDFS API and functionality.
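
As a rough illustration of the client-side failover option, here is a minimal Python sketch. It assumes that the client knows both NameNode HTTP addresses, and that a standby NameNode answers a cheap read (op=GETFILESTATUS on /) with a non-200 status while the active one answers 200. The host names, the port, the user name, and the function names are hypothetical, and the cluster is assumed to be unsecured (plain user.name authentication). Treat it as a starting point, not as a tested implementation.

```python
import urllib.error
import urllib.parse
import urllib.request


def probe_status(base_url, user, timeout=5.0):
    """Return the HTTP status of GETFILESTATUS on / for one NameNode, or None."""
    url = f"{base_url}/webhdfs/v1/?op=GETFILESTATUS&user.name={user}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Assumption: a standby NameNode rejects the read with an error status.
        return err.code
    except OSError:
        # Host unreachable, connection refused, or timeout.
        return None


def find_active_namenode(namenodes, user, probe=probe_status):
    """Return the first NameNode base URL whose probe answers 200."""
    for base_url in namenodes:
        if probe(base_url, user) == 200:
            return base_url
    raise RuntimeError("no active NameNode found among: " + ", ".join(namenodes))


def put_file(namenodes, path, data, user, probe=probe_status):
    """Write bytes to an HDFS path using the two-step WebHDFS CREATE."""
    active = find_active_namenode(namenodes, user, probe)
    create_url = (
        f"{active}/webhdfs/v1{urllib.parse.quote(path)}"
        f"?op=CREATE&overwrite=true&user.name={user}"
    )
    # Step 1: the NameNode replies with a 307 redirect to a DataNode.
    # urllib does not follow redirects for PUT, so the 307 surfaces as HTTPError.
    try:
        urllib.request.urlopen(
            urllib.request.Request(create_url, method="PUT"), timeout=10
        )
    except urllib.error.HTTPError as err:
        if err.code != 307:
            raise
        datanode_url = err.headers["Location"]
    else:
        raise RuntimeError("expected a 307 redirect from the NameNode")
    # Step 2: send the file content to the DataNode URL from the redirect.
    request = urllib.request.Request(datanode_url, data=data, method="PUT")
    with urllib.request.urlopen(request, timeout=60) as resp:
        return resp.status
```

A call might look like put_file(["http://nn1.example.com:50070", "http://nn2.example.com:50070"], "/tmp/hello.txt", b"hello", "hdfs"). Probing before every write costs one extra request; caching the active address and re-probing only after a failed write would be a reasonable refinement.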
