How to activate Knox HA for HDFS and WebHDFS ?
Labels: Apache Hadoop, Apache Knox
Created ‎10-13-2015 06:33 AM
Created ‎10-13-2015 07:41 AM
Do you want to enable Knox High Availability (having multiple Knox instances) or do you want to enable HDFS HA within your Knox instance?
If the latter, check out the WebHdfs HA section on https://knox.apache.org/books/knox-0-6-0/user-guide.html#WebHDFS
To enable HA functionality for WebHDFS in Knox the following configuration has to be added to the topology file:
<provider>
    <role>ha</role>
    <name>HaProvider</name>
    <enabled>true</enabled>
    <param>
        <name>WEBHDFS</name>
        <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
    </param>
</provider>
In the service configuration itself, add the URLs of the standby nodes to the list of URLs. The URL that is active at the time of configuration should ideally be at the top of the list.
<service>
    <role>WEBHDFS</role>
    <url>http://{host1}:50070/webhdfs</url>
    <url>http://{host2}:50070/webhdfs</url>
</service>
Let me know if that helps
Jonas 😃
Created ‎10-13-2015 12:05 PM
Dear Jonas,
Thanks for your reply. Yes, we wanted to enable HDFS HA and WebHDFS HA within our Knox instance.
We followed those steps and it works like a charm for WebHDFS.
I was wondering whether there is anything else to do, given this documentation: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_Knox_Gateway_Admin_Guide/content/service_... and this comment in it: "Both WEBHDFS and NAMENODE require a tag (ha-alias) in order to work in High Availability mode."
Maxime
Created ‎10-13-2015 02:44 PM
@mlanciaux@hortonworks.com, that part of the documentation needs to be corrected. No such tag <ha_alias> exists in the topology.
Instead the NAMENODE service should have the logical name of the HA service found via dfs.nameservices in hdfs-site.xml as the value of the <url> tag.
So the topology file will have something like this:
<service>
    <role>NAMENODE</role>
    <url>my-ha-service</url>
</service>
where the hdfs-site.xml property and value look like this, for example:
<property>
    <name>dfs.nameservices</name>
    <value>my-ha-service</value>
</property>
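For reference, here is a sketch of how the pieces from this thread might fit together in a single topology file. It is an illustration, not a tested configuration: {host1}, {host2} and my-ha-service are the placeholders used in the posts above, the enclosing <topology> and <gateway> elements are assumed to follow the usual Knox topology layout, and the comments on the HA parameters reflect my reading of the Knox user guide, so verify them against the guide for your version.

```xml
<topology>
    <gateway>
        <!-- Other providers (authentication, identity assertion, ...) omitted. -->
        <provider>
            <role>ha</role>
            <name>HaProvider</name>
            <enabled>true</enabled>
            <param>
                <name>WEBHDFS</name>
                <!-- Assumed meaning of the settings:
                     maxFailoverAttempts: how many times to fail over to another URL
                     failoverSleep: wait in milliseconds before failing over
                     maxRetryAttempts: how many times to retry a failed request
                     retrySleep: wait in milliseconds between retries -->
                <value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
            </param>
        </provider>
    </gateway>

    <!-- NAMENODE uses the logical HA name from dfs.nameservices in hdfs-site.xml. -->
    <service>
        <role>NAMENODE</role>
        <url>my-ha-service</url>
    </service>

    <!-- WEBHDFS lists every NameNode; the currently active one ideally comes first. -->
    <service>
        <role>WEBHDFS</role>
        <url>http://{host1}:50070/webhdfs</url>
        <url>http://{host2}:50070/webhdfs</url>
    </service>
</topology>
```

The param name WEBHDFS in the HA provider matches the role of the service it applies to, which is why only the WEBHDFS service carries more than one URL here.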
