Knox Gateway Start fail

Explorer

I restarted a one-node cluster with Ambari. All services started successfully except Knox Gateway.

Oracle Linux 7

Apache Ambari Version 2.1.1

----

stderr: 
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 267, in <module>
    KnoxGateway().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 146, in start
    self.configure(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 63, in configure
    knox()
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py", line 125, in knox
    not_if=master_secret_exist,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 258, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]' returned 1. Master secret is already present on disk. Please be aware that overwriting it will require updating other security artifacts.  Use --force to overwrite the existing master secret.
ERROR: Invalid Command
Unrecognized option:create-master
A fatal exception has occurred. Program will exit.
 stdout:
2015-12-22 18:46:33,793 - Directory['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts/'] {'recursive': True}
2015-12-22 18:46:33,800 - File['/var/lib/ambari-agent/data/tmp/AMBARI-artifacts//jce_policy-8.zip'] {'content': DownloadSource('http://BODTEST02.bod.com.ve:8080/resources//jce_policy-8.zip')}
2015-12-22 18:46:33,801 - Not downloading the file from http://BODTEST02.bod.com.ve:8080/resources//jce_policy-8.zip, because /var/lib/ambari-agent/data/tmp/jce_policy-8.zip already exists
2015-12-22 18:46:33,801 - Group['spark'] {'ignore_failures': False}
2015-12-22 18:46:33,801 - Group['hadoop'] {'ignore_failures': False}
2015-12-22 18:46:33,802 - Group['users'] {'ignore_failures': False}
2015-12-22 18:46:33,802 - Group['knox'] {'ignore_failures': False}
2015-12-22 18:46:33,802 - User['hive'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,803 - User['storm'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,803 - User['zookeeper'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,804 - User['oozie'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2015-12-22 18:46:33,810 - User['atlas'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,810 - User['ams'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,811 - User['falcon'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2015-12-22 18:46:33,811 - User['tez'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2015-12-22 18:46:33,812 - User['accumulo'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,812 - User['mahout'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,813 - User['spark'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,813 - User['ambari-qa'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'users']}
2015-12-22 18:46:33,814 - User['flume'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,820 - User['kafka'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,821 - User['hdfs'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,821 - User['sqoop'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,822 - User['yarn'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,822 - User['mapred'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,823 - User['hbase'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,823 - User['knox'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,824 - User['hcat'] {'gid': 'hadoop', 'ignore_failures': False, 'groups': [u'hadoop']}
2015-12-22 18:46:33,830 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-12-22 18:46:33,831 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2015-12-22 18:46:33,901 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2015-12-22 18:46:33,901 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'recursive': True, 'mode': 0775, 'cd_access': 'a'}
2015-12-22 18:46:33,902 - File['/var/lib/ambari-agent/data/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2015-12-22 18:46:33,903 - Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
2015-12-22 18:46:33,913 - Skipping Execute['/var/lib/ambari-agent/data/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
2015-12-22 18:46:33,913 - Group['hdfs'] {'ignore_failures': False}
2015-12-22 18:46:33,913 - User['hdfs'] {'ignore_failures': False, 'groups': [u'hadoop', u'hdfs']}
2015-12-22 18:46:33,914 - Directory['/etc/hadoop'] {'mode': 0755}
2015-12-22 18:46:33,939 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2015-12-22 18:46:33,969 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2015-12-22 18:46:34,011 - Skipping Execute[('setenforce', '0')] due to not_if
2015-12-22 18:46:34,011 - Directory['/var/log/hadoop'] {'owner': 'root', 'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}
2015-12-22 18:46:34,013 - Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access': 'a'}
2015-12-22 18:46:34,013 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'recursive': True, 'cd_access': 'a'}
2015-12-22 18:46:34,017 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2015-12-22 18:46:34,026 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2015-12-22 18:46:34,026 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2015-12-22 18:46:34,045 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs'}
2015-12-22 18:46:34,045 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2015-12-22 18:46:34,046 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2015-12-22 18:46:34,056 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
2015-12-22 18:46:34,075 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2015-12-22 18:46:34,459 - Directory['/var/lib/knox/data'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-12-22 18:46:34,470 - Directory['/var/log/knox'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-12-22 18:46:34,470 - Directory['/var/run/knox'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-12-22 18:46:34,480 - Directory['/usr/hdp/current/knox-server/conf'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-12-22 18:46:34,488 - Directory['/usr/hdp/current/knox-server/conf/topologies'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2015-12-22 18:46:34,488 - XmlConfig['gateway-site.xml'] {'owner': 'knox', 'group': 'knox', 'conf_dir': '/usr/hdp/current/knox-server/conf', 'configuration_attributes': {}, 'configurations': ...}
2015-12-22 18:46:34,538 - Generating config: /usr/hdp/current/knox-server/conf/gateway-site.xml
2015-12-22 18:46:34,538 - File['/usr/hdp/current/knox-server/conf/gateway-site.xml'] {'owner': 'knox', 'content': InlineTemplate(...), 'group': 'knox', 'mode': None, 'encoding': 'UTF-8'}
2015-12-22 18:46:34,556 - Writing File['/usr/hdp/current/knox-server/conf/gateway-site.xml'] because contents don't match
2015-12-22 18:46:34,557 - File['/usr/hdp/current/knox-server/conf/gateway-log4j.properties'] {'content': ..., 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2015-12-22 18:46:34,568 - File['/usr/hdp/current/knox-server/conf/topologies/default.xml'] {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2015-12-22 18:46:34,568 - Execute[('chown', '-R', u'knox:knox', '/var/lib/knox/data', '/var/log/knox', u'/var/run/knox', '/usr/hdp/current/knox-server/conf', '/usr/hdp/current/knox-server/conf/topologies')] {'sudo': True}
2015-12-22 18:46:34,582 - Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]'] {'environment': {'JAVA_HOME': u'/usr/jdk64/jdk1.8.0_40'}, 'not_if': "ambari-sudo.sh su knox -l -s /bin/bash -c 'test -f /var/lib/knox/data/security/master'", 'user': 'knox'}
1 ACCEPTED SOLUTION

New Contributor

Looking at the command that causes the error:

2016-01-15 15:37:00,440 - Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]'] {'environment': {'JAVA_HOME': u'/usr/jdk64/jdk1.7.0_67'}, 'not_if': "ambari-sudo.sh su knox -l -s /bin/bash -c 'test -f /var/lib/knox/data/security/master'", 'user': 'knox'}

It originates from /var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py:

cmd = format('{knox_client_bin} create-master --master {knox_master_secret!p}')
master_secret_exist = as_user(format('test -f {knox_master_secret_path}'), params.knox_user)
Execute(cmd,user=params.knox_user,environment={'JAVA_HOME': params.java_home},not_if=master_secret_exist,)
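
To make the guard logic concrete, here is a minimal sketch of how a command protected by a not_if guard behaves: the guard runs first, and the main command is skipped only when the guard exits with status 0. This is not Ambari's implementation; the helper name run_unless is hypothetical and only illustrates the pattern.

```python
import subprocess


def run_unless(command, not_if):
    """Run `command` unless the shell guard `not_if` succeeds (exit status 0).

    Hypothetical stand-in for the Execute(..., not_if=...) pattern.
    Returns None when the command is skipped, otherwise its exit status.
    """
    guard_status = subprocess.call(not_if, shell=True)
    if guard_status == 0:
        return None  # guard succeeded, so the command is skipped
    return subprocess.call(command, shell=True)
```

If the guard tests a path where the master file is absent, the guard fails and the command runs, even when a master secret exists somewhere else.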

{knox_master_secret_path} resolves to /var/lib/knox/data/security/master (as defined in /var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/params_linux.py). The problem is that the Knox master file does not exist at this location, so the not_if test fails and create-master runs anyway. The directory /var/lib/knox/data does exist, but it is empty.

Instead, the master key is located here: /usr/hdp/current/knox-server/data/security/master
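
A quick way to see which of the two locations actually holds the master file is sketched below. The two paths are the ones mentioned in this thread; the function name is made up for illustration, and the check assumes it runs as a user allowed to traverse the Knox security directories.

```python
import os

# Both paths come from this thread; which one exists depends on the install.
CANDIDATE_MASTER_PATHS = (
    "/var/lib/knox/data/security/master",
    "/usr/hdp/current/knox-server/data/security/master",
)


def find_master_files(paths=CANDIDATE_MASTER_PATHS):
    """Return the candidate paths at which a regular file exists."""
    return [path for path in paths if os.path.isfile(path)]
```

An empty result for the first path and a hit for the second would match the situation described above.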

In the file /var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py, I also see code that removes and re-creates symbolic links:

# Used to setup symlink, needed to update the knox managed symlink, in case of custom locations
if os.path.islink(params.knox_managed_pid_symlink) and os.path.realpath(params.knox_managed_pid_symlink) != params.knox_pid_dir:
  os.unlink(params.knox_managed_pid_symlink)
  os.symlink(params.knox_pid_dir, params.knox_managed_pid_symlink)
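
For anyone who wants to try this check outside Ambari, the sketch below reproduces the same islink/realpath test in a temporary directory. The function names and directory names are made up for the demo, and unlike the quoted snippet it resolves both sides with realpath so that it also works when the temp directory itself sits behind a symlink.

```python
import os
import tempfile


def repoint_symlink(link_path, target_dir):
    """Re-create `link_path` so it points at `target_dir` if it points elsewhere.

    Returns True if the link was changed, False otherwise.
    """
    points_elsewhere = os.path.realpath(link_path) != os.path.realpath(target_dir)
    if os.path.islink(link_path) and points_elsewhere:
        os.unlink(link_path)
        os.symlink(target_dir, link_path)
        return True
    return False


def demo():
    """Create a link to one directory, then re-point it to another."""
    base = tempfile.mkdtemp()
    old_dir = os.path.join(base, "old_pid_dir")
    new_dir = os.path.join(base, "new_pid_dir")
    os.mkdir(old_dir)
    os.mkdir(new_dir)
    link = os.path.join(base, "pids")
    os.symlink(old_dir, link)
    changed = repoint_symlink(link, new_dir)
    return changed, os.path.realpath(link) == os.path.realpath(new_dir)
```

Note that this logic only handles the PID directory link; it does nothing for the data directory where the master file lives.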

Perhaps something goes wrong with the symbolic links (for example, when you install HDP 2.3 successfully but restart all services immediately after the installation)?

----

In any case, the following modification resolved the issue for me. I'm not sure it covers everything (e.g. what will happen if you change the Knox master key via the Ambari web interface?), but I don't have any more time to be stuck on this issue 🙂

Open /var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/params_linux.py

Change:

knox_master_secret_path = '/var/lib/knox/data/security/master' 
knox_cert_store_path = '/var/lib/knox/data/security/keystores/gateway.jks'

to:

knox_master_secret_path = '/usr/hdp/current/knox-server/data/security/master' 
knox_cert_store_path = '/usr/hdp/current/knox-server/data/security/keystores/gateway.jks'

Looking forward to a response from a Hortonworks employee..

20 REPLIES

@JOSE GUILLEN

@Kevin Minder

create-master --master [PROTECTED]' returned 1. Master secret is already present on disk. Please be aware that overwriting it will require updating other security artifacts. Use --force to overwrite the existing master secret.

ERROR: Invalid Command

Unrecognized option:create-master

Explorer

Thanks Neeraj Sabharwal, but I don't understand what you proposed. When I installed the cluster, everything was OK, including Knox. I stopped all the services through Ambari, restarted the server node, and started all the services again. Everything came up OK except Knox.

@Kevin Minder

@JOSE GUILLEN Was Java upgraded? See Kevin's comment.

Unfortunately HDP 2.2 was not certified with jdk1.8 and Knox 0.5.0.2.2 in particular has an issue with a keystore API change in jdk 1.8 that prevents it from starting. The only solution is to either upgrade HDP or downgrade the jdk.
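
The JAVA_HOME used for the failing command is visible in the Execute log lines, so the JDK major version can be read straight from the logs. Below is a small sketch that assumes the directory follows the jdk1.N.0_xx naming seen in this thread; other JDK layouts would need a different pattern, and the function name is made up for illustration.

```python
import re


def jdk_major_version(java_home):
    """Extract the major JDK version from a path like /usr/jdk64/jdk1.8.0_40.

    Returns None if the path does not follow the assumed jdk1.N naming.
    """
    match = re.search(r"jdk1\.(\d+)\.", java_home)
    return int(match.group(1)) if match else None
```

For the paths quoted in this thread, /usr/jdk64/jdk1.8.0_40 yields 8 and /usr/jdk64/jdk1.7.0_67 yields 7.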

Explorer

Thanks @Kevin Minder, but I didn't install HDP 2.2. I installed HDP-2.3.2.0-2950, the latest version available as far as I know. So if the problem persists in this version (2.3), then according to you the only solution is to downgrade the JDK.

Explorer

I have the same problem or it at least sounds the same.

Had a HDP 2.3 cluster up and running with everything working fine. Then installed Ranger and Ranger KMS (via Actions->Add Service in Ambari 2.3 - so also standard Ranger and Ranger KMS version). Since then Knox is not coming up again and throwing this error:

  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]' returned 1. Master secret is already present on disk. Please be aware that overwriting it will require updating other security artifacts. Use --force to overwrite the existing master secret.
ERROR: Invalid Command
Unrecognized option:create-master
A fatal exception has occurred. Program will exit.

According to the Stack and Versions info in Ambari, Knox is at version 0.6.0.2.3.

For Ranger and Ranger KMS I also used the standard version. Ranger and Ranger KMS work fine...

Can anyone help?

Any chance you can capture/provide the gateway.log file content from the period of time during which this occurred?

Explorer

Hi Kevin,

I just tried to restart Knox, but the error still comes up. The gateway.log from /var/log/knox is included below. Its modification date is 8 January, though, so it seems there have been no updates.

2016-01-08 15:21:37,588 INFO hadoop.gateway (GatewayConfigImpl.java:loadConfigResource(280)) - Loading configuration resource jar:file:/usr/hdp/2.3.4.0-3485/knox/bin/../lib/gateway-server-0.6.0.2.3.4.0-3485.jar!/conf/gateway-default.xml
2016-01-08 15:21:37,596 INFO hadoop.gateway (GatewayConfigImpl.java:loadConfigFile(268)) - Loading configuration file /usr/hdp/2.3.4.0-3485/knox/bin/../conf/gateway-site.xml
2016-01-08 15:21:37,614 INFO hadoop.gateway (GatewayConfigImpl.java:initGatewayHomeDir(212)) - Using /usr/hdp/2.3.4.0-3485/knox/bin/.. as GATEWAY_HOME via system property.
2016-01-08 15:21:38,104 INFO hadoop.gateway (JettySSLService.java:init(89)) - Credential store for the gateway instance found - no need to create one.
2016-01-08 15:21:38,105 INFO hadoop.gateway (JettySSLService.java:init(106)) - Keystore for the gateway instance found - no need to create one.
2016-01-08 15:21:38,107 INFO hadoop.gateway (JettySSLService.java:logAndValidateCertificate(128)) - The Gateway SSL certificate is issued to hostname: hadoop0.local.
2016-01-08 15:21:38,108 INFO hadoop.gateway (JettySSLService.java:logAndValidateCertificate(131)) - The Gateway SSL certificate is valid between: 1/8/16 3:21 PM and 1/7/17 3:21 PM.
2016-01-08 15:21:38,125 INFO hadoop.gateway (GatewayServer.java:startGateway(219)) - Starting gateway...
2016-01-08 15:21:38,340 INFO hadoop.gateway (GatewayServer.java:start(311)) - Loading topologies from directory: /usr/hdp/2.3.4.0-3485/knox/bin/../conf/topologies
2016-01-08 15:21:38,429 INFO hadoop.gateway (GatewayServer.java:handleCreateDeployment(432)) - Deploying topology admin to /usr/hdp/2.3.4.0-3485/knox/bin/../data/deployments/admin.war.151a9b47638
2016-01-08 15:21:38,429 INFO hadoop.gateway (DeploymentFactory.java:createDeployment(85)) - Configured services directory is /usr/hdp/2.3.4.0-3485/knox/bin/../data/services
2016-01-08 15:21:38,667 INFO hadoop.gateway (DefaultGatewayServices.java:initializeContribution(176)) - Creating credential store for the cluster: admin
2016-01-08 15:21:40,209 INFO hadoop.gateway (GatewayServer.java:handleCreateDeployment(432)) - Deploying topology default to /usr/hdp/2.3.4.0-3485/knox/bin/../data/deployments/default.war.152219d29b0
2016-01-08 15:21:40,209 INFO hadoop.gateway (DeploymentFactory.java:createDeployment(85)) - Configured services directory is /usr/hdp/2.3.4.0-3485/knox/bin/../data/services
2016-01-08 15:21:40,253 INFO hadoop.gateway (DefaultGatewayServices.java:initializeContribution(176)) - Creating credential store for the cluster: default
2016-01-08 15:21:41,199 INFO hadoop.gateway (GatewayServer.java:start(315)) - Monitoring topologies in directory: /usr/hdp/2.3.4.0-3485/knox/bin/../conf/topologies
2016-01-08 15:21:41,200 INFO hadoop.gateway (GatewayServer.java:startGateway(232)) - Started gateway on port 8,443.

Explorer

The knoxcli.log is dated to when the event occurred. Its content is as follows:

2016-01-13 13:22:41,083 INFO hadoop.gateway (GatewayConfigImpl.java:loadConfigResource(280)) - Loading configuration resource jar:file:/usr/hdp/2.3.4.0-3485/knox/bin/../lib/gateway-server-0.6.0.2.3.4.0-3485.jar!/conf/gateway-default.xml
2016-01-13 13:22:41,090 INFO hadoop.gateway (GatewayConfigImpl.java:loadConfigFile(268)) - Loading configuration file /usr/hdp/2.3.4.0-3485/knox/bin/../conf/gateway-site.xml
2016-01-13 13:22:41,114 INFO hadoop.gateway (GatewayConfigImpl.java:initGatewayHomeDir(212)) - Using /usr/hdp/2.3.4.0-3485/knox/bin/.. as GATEWAY_HOME via system property.

New Contributor

Hello,

I ran into the exact same issue when installing the latest version of HDP 2.3 a couple of hours ago.

During the ambari-server setup installation process, I chose to install the Oracle JDK 1.8 (jdk1.8.0_40).

Will try the installation again tomorrow from scratch with Oracle JDK 1.7.

Cedric

New Contributor

Installing with Oracle JDK 1.7 instead of JDK 1.8 did not make a difference.

Result of Knox gateway start:

2016-01-15 15:37:00,354 - Directory['/var/lib/knox/data'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2016-01-15 15:37:00,357 - Directory['/var/log/knox'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2016-01-15 15:37:00,357 - Directory['/var/run/knox'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2016-01-15 15:37:00,358 - Directory['/usr/hdp/current/knox-server/conf'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2016-01-15 15:37:00,359 - Directory['/usr/hdp/current/knox-server/conf/topologies'] {'owner': 'knox', 'group': 'knox', 'recursive': True}
2016-01-15 15:37:00,359 - XmlConfig['gateway-site.xml'] {'owner': 'knox', 'group': 'knox', 'conf_dir': '/usr/hdp/current/knox-server/conf', 'configuration_attributes': {}, 'configurations': ...}
2016-01-15 15:37:00,386 - Generating config: /usr/hdp/current/knox-server/conf/gateway-site.xml
2016-01-15 15:37:00,387 - File['/usr/hdp/current/knox-server/conf/gateway-site.xml'] {'owner': 'knox', 'content': InlineTemplate(...), 'group': 'knox', 'mode': None, 'encoding': 'UTF-8'}
2016-01-15 15:37:00,399 - Writing File['/usr/hdp/current/knox-server/conf/gateway-site.xml'] because contents don't match
2016-01-15 15:37:00,400 - File['/usr/hdp/current/knox-server/conf/gateway-log4j.properties'] {'content': '...', 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2016-01-15 15:37:00,411 - File['/usr/hdp/current/knox-server/conf/topologies/default.xml'] {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2016-01-15 15:37:00,412 - Execute['('chown', '-R', u'knox:knox', '/var/lib/knox/data', '/var/log/knox', u'/var/run/knox', '/usr/hdp/current/knox-server/conf', '/usr/hdp/current/knox-server/conf/topologies')'] {'sudo': True}
2016-01-15 15:37:00,440 - Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]'] {'environment': {'JAVA_HOME': u'/usr/jdk64/jdk1.7.0_67'}, 'not_if': "ambari-sudo.sh su knox -l -s /bin/bash -c 'test -f /var/lib/knox/data/security/master'", 'user': 'knox'}
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 264, in <module>
    KnoxGateway().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 218, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 146, in start
    self.configure(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 63, in configure
    knox()
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_family_impl.py", line 89, in thunk
    return fn(*args, **kwargs)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox.py", line 125, in knox
    not_if=master_secret_exist,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 157, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 258, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
    tries=tries, try_sleep=try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]' returned 1. Master secret is already present on disk. Please be aware that overwriting it will require updating other security artifacts.  Use --force to overwrite the existing master secret.
ERROR: Invalid Command
Unrecognized option:create-master
A fatal exception has occurred. Program will exit.

As of HDP 2.2.4, I believe that Knox changed its data folder to be a symlink to a versioned folder, so

/usr/hdp/current/knox-server/data -> /var/lib/knox/data/${version} (or something like this)

This means that Knox in Ambari should always follow the symlink to get the right path instead of trying to access /var/lib/knox/data directly.
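
As an illustration of "following the symlink", the sketch below derives the two security paths from the resolved data directory instead of hardcoding /var/lib/knox/data. It is only a sketch of the idea and not the actual Ambari fix; the function name is hypothetical and the default data_dir is the path quoted earlier in this thread.

```python
import os


def knox_security_paths(data_dir="/usr/hdp/current/knox-server/data"):
    """Build the master-secret and keystore paths under the resolved data dir."""
    real_data_dir = os.path.realpath(data_dir)  # follows symlinks, if any
    security_dir = os.path.join(real_data_dir, "security")
    return {
        "knox_master_secret_path": os.path.join(security_dir, "master"),
        "knox_cert_store_path": os.path.join(security_dir, "keystores", "gateway.jks"),
    }
```

Whatever versioned folder the link points at, both paths then stay consistent with the directory Knox itself uses.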

I opened AMBARI-14726 to track a fix.

Actually, that issue is fixed in Ambari 2.1.2 by https://issues.apache.org/jira/browse/AMBARI-12979

New Contributor

OK. I didn't realize that I didn't install the latest version.

Since my goal is to obtain the HDPCA certificate, I used the reference links from http://hortonworks.com/training/class/hdp-certified-administrator-hdpca-exam/. Unfortunately they are not up to date.

I guess I will also be trying the upgrade process then.. 😛

Explorer

Thank you Cedric. The fix you provided also worked for me immediately.

Open /var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/params_linux.py

Change:

knox_master_secret_path = '/var/lib/knox/data/security/master'
knox_cert_store_path = '/var/lib/knox/data/security/keystores/gateway.jks'

to:

knox_master_secret_path = '/usr/hdp/current/knox-server/data/security/master'
knox_cert_store_path = '/usr/hdp/current/knox-server/data/security/keystores/gateway.jks'

Explorer

Cedric Colpaert's answer was the best one for me. Note that you must change the params_linux.py file in both the agent and the server directories:

/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts

/var/lib/ambari-server/resources/common-services/KNOX/0.5.0.2.2/package/scripts
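
Since the same two lines have to be edited in more than one copy of params_linux.py, a small script can apply the change consistently. The sketch below rewrites only the two assignments named in this thread and keeps a .bak backup; the function names are made up for illustration, and it assumes the assignments look exactly like the ones quoted above.

```python
import re
import shutil

OLD_PREFIX = "/var/lib/knox/data/security/"
NEW_PREFIX = "/usr/hdp/current/knox-server/data/security/"

# Matches only the two assignments discussed in the thread.
ASSIGNMENT = re.compile(
    r"^(\s*knox_(?:master_secret|cert_store)_path\s*=\s*['\"])" + re.escape(OLD_PREFIX),
    re.MULTILINE,
)


def patch_params_text(text):
    """Rewrite the two path assignments; return (new_text, lines_changed)."""
    return ASSIGNMENT.subn(lambda m: m.group(1) + NEW_PREFIX, text)


def patch_params_file(path):
    """Back up `path` to `path + '.bak'`, then patch it in place.

    Returns the number of assignments changed (0 means the file was left alone).
    """
    with open(path) as handle:
        original = handle.read()
    patched, count = patch_params_text(original)
    if count:
        shutil.copy2(path, path + ".bak")
        with open(path, "w") as handle:
            handle.write(patched)
    return count
```

Calling patch_params_file once for the params_linux.py in each of the two directories above applies the same edit to both copies.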

New Contributor

I just ran into the same issue, also with the latest version of Ambari & Knox, on CentOS 7 with Java 1.8. FWIW, I found that if I rm /usr/hdp/current/knox-server/data/security/master, the service starts OK from within Ambari.

If you see this error, make sure to upgrade Ambari to version 2.1.2 or higher:

https://github.com/apache/ambari/blob/release-2.1.2-rc3/ambari-server/src/main/resources/common-serv...
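
To check whether a given Ambari version is at or above 2.1.2, compare the version numbers as tuples of integers rather than as strings, since string comparison would rank "2.10.0" below "2.2.0". A minimal sketch, assuming plain dotted numeric versions and hypothetical function names:

```python
def parse_version(version):
    """Turn a dotted version such as '2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def has_fix(ambari_version, fixed_in="2.1.2"):
    """Return True if `ambari_version` is at or above `fixed_in`."""
    return parse_version(ambari_version) >= parse_version(fixed_in)
```

The original poster's Ambari 2.1.1 falls just below the threshold.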

Contributor

Updating params_linux.py on both server and agent fixed my issue
