Knox WebHDFS HA configuration

Rising Star

Hi,

I'm facing a strange issue with our Knox configuration and cannot figure out what's wrong:

We have two Knox instances on our cluster (say, on server1 and server2), and both are configured for WebHDFS HA.

Extract of our topology file:

<topology>
	<gateway>
		...
		<provider>
			<role>ha</role>
			<name>HaProvider</name>
			<enabled>true</enabled>
			<param>
				<name>WEBHDFS</name>
				<value>maxFailoverAttempts=3;failoverSleep=1000;maxRetryAttempts=300;retrySleep=1000;enabled=true</value>
			</param>
			...
		</provider>
	</gateway>
	...
	<service>
		<role>WEBHDFS</role>
		<url>http://namenode1:50070/webhdfs</url>
		<url>http://namenode2:50070/webhdfs</url>
	</service>
</topology>

The strange thing is that if we perform a WebHDFS operation through Knox on server1 (with curl):

curl -s -i -k -H "Authorization: Basic dGEtMXQ3Ny1iZGF0YS1zY2g6QW50b2luZTg3MSE=" -X GET 'https://server1:8443/gateway/pam/webhdfs/v1//user/myuser/myFile.txt?op=OPEN' 

=> we get a redirect to an HTTPS URL on server1

But if we send the same request to the gateway on server2:

curl -s -i -k -H "Authorization: Basic dGEtMXQ3Ny1iZGF0YS1zY2g6QW50b2luZTg3MSE=" -X GET 'https://server2:8443/gateway/pam/webhdfs/v1//user/myuser/myFile.txt?op=OPEN'

=> we get a redirect to an HTTP URL on one of the datanodes (port 1022)
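
To compare the two gateways side by side, it can help to print only the Location header of each response. Below is a minimal sketch, assuming a POSIX shell with curl and awk available; extract_location is a hypothetical helper name, and USER:PASSWORD is a placeholder rather than a real credential:

```shell
#!/bin/sh
# Print the value of the Location header from an HTTP response read on stdin.
# Header names are case-insensitive, so the comparison is done in lower case;
# the carriage returns of HTTP line endings are stripped first.
extract_location() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Hypothetical usage against the two gateways described above
# (USER:PASSWORD is a placeholder):
#   for host in server1 server2; do
#       printf '%s -> ' "$host"
#       curl -s -i -k -u 'USER:PASSWORD' \
#           "https://$host:8443/gateway/pam/webhdfs/v1//user/myuser/myFile.txt?op=OPEN" \
#           | extract_location
#   done
```

With both Location values next to each other, the difference in scheme and host between server1 and server2 is easier to spot.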

I cannot find any difference between the Knox configurations on server1 and server2, so where should I look to understand how Knox redirects incoming requests to the WebHDFS service?

Any help will be greatly appreciated!

1 REPLY

Contributor

You should have something logged in the gateway.log file. If you do not see anything meaningful there, you can raise the log level to DEBUG by editing the conf/gateway-log4j.properties file and uncommenting the log4j.logger.org.apache.hadoop.gateway=DEBUG property.
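
The uncommenting step can also be scripted. Here is a minimal sketch, assuming the property is present in the file as a line commented out with a leading "#"; enable_knox_debug is a hypothetical helper name, and the path in the usage comment assumes the current directory is the Knox installation directory:

```shell
#!/bin/sh
# Uncomment the Knox DEBUG logger line in a log4j properties file.
# $1 is the path to the properties file; a .bak copy is kept next to it.
enable_knox_debug() {
    file=$1
    cp "$file" "$file.bak" || return 1
    # Strip a leading "#" (and surrounding whitespace) from the logger line only;
    # every other line is copied through unchanged.
    sed 's/^[[:space:]]*#[[:space:]]*\(log4j\.logger\.org\.apache\.hadoop\.gateway=DEBUG\)/\1/' \
        "$file.bak" > "$file"
}

# Hypothetical usage (path assumed relative to the Knox install directory):
#   enable_knox_debug conf/gateway-log4j.properties
```

Depending on the setup, the gateway may need a restart before the new log level takes effect.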