Expert Contributor
Posts: 356
Registered: ‎01-25-2017

webhdfs not directing to active namenode

Hi Community.

 

I'm upgrading my CDH cluster from 5.13.0 to 5.16.1, and I'm using WebHDFS to copy files from Hadoop to Vertica.

 

Since I'm using NameNode high availability, I created an F5 VIP that I use to reach the active NameNode.

 

For some reason, after the upgrade to 5.16.1 I intermittently get an error saying that the READ operation is not supported in state standby. When I replace the VIP with the individual nodes, the copy works only against the active NameNode, which is expected.

 

I tried the same command that the F5 uses, and it does identify the active NameNode.

 

Commands I used:

 

Copying using the VIP:

 

SOURCE public.Hdfs(url='http://vip:50070/webhdfs/v1

 

Copying using the nodes:


SOURCE public.Hdfs(url='http://node1:50070/webhdfs/v1
SOURCE public.Hdfs(url='http://node2:50070/webhdfs/v1

 

Checking whether a node is active or standby:
curl -X GET "http://node1:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem"


Re: webhdfs not directing to active namenode


Sending a simple REST API call through the VIP also sometimes returns errors:

 

curl -i  "http://mha-vip1:50070/webhdfs/v1/fawze?op=LISTSTATUS"

 

HTTP/1.1 403 Forbidden
Cache-Control: no-cache
Expires: Mon, 21 Jan 2019 19:10:28 GMT
Date: Mon, 21 Jan 2019 19:10:28 GMT
Pragma: no-cache
Expires: Mon, 21 Jan 2019 19:10:28 GMT
Date: Mon, 21 Jan 2019 19:10:28 GMT
Pragma: no-cache
Content-Type: application/json
X-FRAME-OPTIONS: SAMEORIGIN
Transfer-Encoding: chunked

{"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby. Visit https://s.apache.org/sbnn-error"}}
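A client can tell this failure apart from a genuine permission problem by inspecting the JSON body rather than the 403 status alone. The following is a minimal sketch, not part of the original setup; the helper name is_standby_error is hypothetical, and it relies only on the RemoteException fields visible in the response above.

```python
import json


def is_standby_error(body: str) -> bool:
    """Return True if a WebHDFS response body reports a StandbyException.

    Hypothetical helper: it only looks at the RemoteException structure
    shown in the response above.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Body is not JSON at all, so it cannot be a RemoteException.
        return False
    if not isinstance(payload, dict):
        return False
    remote = payload.get("RemoteException")
    if not isinstance(remote, dict):
        return False
    return remote.get("exception") == "StandbyException"


# Body copied from the 403 response above.
sample = (
    '{"RemoteException":{"exception":"StandbyException",'
    '"javaClassName":"org.apache.hadoop.ipc.StandbyException",'
    '"message":"Operation category READ is not supported in state standby. '
    'Visit https://s.apache.org/sbnn-error"}}'
)

print(is_standby_error(sample))
```

If this returns True for a request sent through the VIP, the request reached the standby NameNode, which points at the load balancer's health check rather than at HDFS permissions.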


Re: webhdfs not directing to active namenode

The issue is resolved, so I'm adding how we solved it for reference:

 

We have a VIP in front of the Hadoop NameNodes (standby and active) with a keepalive check that routes all calls to the active node.

The pool uses a monitor called Hadoop_Namenode_monitor_50070.

The monitor sends the following GET request:
GET /jmx?qry=Hadoop:service=NameNode,name=FSNamesystem HTTP/1.0\r\n\n
and looks for the string active in the response to determine which node is the active one.

In CDH 5.16.1 the output of the JMX query above changed: it now includes a parameter called "NumActiveClients", which causes both the active and the standby node to return the string active.

To solve this, we changed the receive string of the monitor in all environments to:
\"active\"
instead of:
active
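The effect of the change can be reproduced offline. The sketch below is only an illustration under stated assumptions: the two JMX payloads are heavily trimmed stand-ins for the real output (the values are invented), and the monitor is modelled as a case-insensitive regular expression search, which is what the description above implies, since "NumActiveClients" would not otherwise match active. The function name monitor_matches is hypothetical.

```python
import re

# Assumed, trimmed-down JMX payloads; real output has many more fields
# and the NumActiveClients values here are made up.
ACTIVE_JMX = '{"beans": [{"tag.HAState": "active", "NumActiveClients": 12}]}'
STANDBY_JMX = '{"beans": [{"tag.HAState": "standby", "NumActiveClients": 0}]}'


def monitor_matches(receive_string: str, body: str) -> bool:
    """Model the health check: does the receive string occur in the body?

    Assumption: matching is a case-insensitive regex search.
    """
    return re.search(receive_string, body, re.IGNORECASE) is not None


# Old receive string (active) versus new one ("active" including the quotes).
for receive_string in ('active', '"active"'):
    print(
        receive_string,
        monitor_matches(receive_string, ACTIVE_JMX),
        monitor_matches(receive_string, STANDBY_JMX),
    )
```

Under these assumptions the bare string matches both payloads, so the VIP would treat the standby node as healthy, while the quoted string matches only the payload whose HA state value is "active".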
