Can't start Knox Gateway when restarting sandbox


New Contributor

Hi,

I am trying to restart my sandbox. I have started the Ranger services, but I can't restart Knox.

Ambari shows the error below when it tries to start the Knox Gateway:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 278, in <module>
    KnoxGateway().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/knox_gateway.py", line 138, in start
    setup_ranger_knox(upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/KNOX/0.5.0.2.2/package/scripts/setup_ranger_knox.py", line 45, in setup_ranger_knox
    recursive_chmod=True
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 459, in action_create_on_execute
    self.action_delayed("create")
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 456, in action_delayed
    self.get_hdfs_resource_executor().action_delayed(action_name, self)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 247, in action_delayed
    self._assert_valid()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 231, in _assert_valid
    self.target_status = self._get_file_status(target)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 292, in _get_file_status
    list_status = self.util.run_command(target, 'GETFILESTATUS', method='GET', ignore_status_codes=['404'], assertable_result=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/providers/hdfs_resource.py", line 179, in run_command
    _, out, err = get_user_call_output(cmd, user=self.run_user, logoutput=self.logoutput, quiet=False)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
    raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'curl -sS -L -w '%{http_code}' -X GET 'http://sandbox.hortonworks.com:50070/webhdfs/v1/ranger/audit?op=GETFILESTATUS&user.name=hdfs' 1>/tmp/tmprY3HjG 2>/tmp/tmpK6n4W3' returned 7. curl: (7) couldn't connect to host
000

and also

2017-04-08 12:13:57,911 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.0.0-1245
2017-04-08 12:13:57,913 - Checking if need to create versioned conf dir /etc/hadoop/2.5.0.0-1245/0
2017-04-08 12:13:57,925 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2017-04-08 12:13:58,128 - call returned (1, '/etc/hadoop/2.5.0.0-1245/0 exist already', '')
2017-04-08 12:13:58,130 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
2017-04-08 12:13:58,438 - checked_call returned (0, '')
2017-04-08 12:13:58,443 - Ensuring that hadoop has the correct symlink structure
2017-04-08 12:13:58,446 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-04-08 12:13:59,626 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.0.0-1245
2017-04-08 12:13:59,627 - Checking if need to create versioned conf dir /etc/hadoop/2.5.0.0-1245/0
2017-04-08 12:13:59,637 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2017-04-08 12:13:59,926 - call returned (1, '/etc/hadoop/2.5.0.0-1245/0 exist already', '')
2017-04-08 12:13:59,927 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
2017-04-08 12:14:00,047 - checked_call returned (0, '')
2017-04-08 12:14:00,051 - Ensuring that hadoop has the correct symlink structure
2017-04-08 12:14:00,052 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-04-08 12:14:00,064 - Group['livy'] {}
2017-04-08 12:14:00,073 - Group['spark'] {}
2017-04-08 12:14:00,074 - Group['ranger'] {}
2017-04-08 12:14:00,075 - Group['zeppelin'] {}
2017-04-08 12:14:00,077 - Group['hadoop'] {}
2017-04-08 12:14:00,078 - Group['users'] {}
2017-04-08 12:14:00,079 - Group['knox'] {}
2017-04-08 12:14:00,081 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,094 - User['storm'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,097 - User['zookeeper'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,111 - User['infra-solr'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,115 - User['oozie'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-04-08 12:14:00,124 - User['atlas'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,131 - User['ams'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,135 - User['falcon'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-04-08 12:14:00,145 - User['ranger'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['ranger']}
2017-04-08 12:14:00,152 - User['tez'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-04-08 12:14:00,165 - User['zeppelin'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,169 - User['livy'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,182 - User['spark'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,185 - User['ambari-qa'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['users']}
2017-04-08 12:14:00,201 - User['flume'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,204 - User['kafka'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,214 - User['hdfs'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,221 - User['sqoop'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,224 - User['yarn'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,235 - User['mapred'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,238 - User['hbase'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,241 - User['knox'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,252 - User['hcat'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop']}
2017-04-08 12:14:00,256 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2017-04-08 12:14:00,271 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] {'not_if': '(test $(id -u ambari-qa) -gt 1000) || (false)'}
2017-04-08 12:14:00,328 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh ambari-qa /tmp/hadoop-ambari-qa,/tmp/hsperfdata_ambari-qa,/home/ambari-qa,/tmp/ambari-qa,/tmp/sqoop-ambari-qa'] due to not_if
2017-04-08 12:14:00,330 - Directory['/tmp/hbase-hbase'] {'owner': 'hbase', 'create_parents': True, 'mode': 0775, 'cd_access': 'a'}
2017-04-08 12:14:00,338 - File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}
2017-04-08 12:14:00,347 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] {'not_if': '(test $(id -u hbase) -gt 1000) || (false)'}
2017-04-08 12:14:00,409 - Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh hbase /home/hbase,/tmp/hbase,/usr/bin/hbase,/var/log/hbase,/tmp/hbase-hbase'] due to not_if
2017-04-08 12:14:00,412 - Group['hdfs'] {}
2017-04-08 12:14:00,417 - User['hdfs'] {'fetch_nonlocal_groups': True, 'groups': ['hadoop', 'hdfs']}
2017-04-08 12:14:00,420 - FS Type: 
2017-04-08 12:14:00,421 - Directory['/etc/hadoop'] {'mode': 0755}
2017-04-08 12:14:00,581 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh'] {'content': InlineTemplate(...), 'owner': 'hdfs', 'group': 'hadoop'}
2017-04-08 12:14:00,585 - Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'hdfs', 'group': 'hadoop', 'mode': 01777}
2017-04-08 12:14:00,705 - Execute[('setenforce', '0')] {'not_if': '(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo': True, 'only_if': 'test -f /selinux/enforce'}
2017-04-08 12:14:00,761 - Skipping Execute[('setenforce', '0')] due to not_if
2017-04-08 12:14:00,763 - Directory['/var/log/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'hadoop', 'mode': 0775, 'cd_access': 'a'}
2017-04-08 12:14:00,779 - Directory['/var/run/hadoop'] {'owner': 'root', 'create_parents': True, 'group': 'root', 'cd_access': 'a'}
2017-04-08 12:14:00,781 - Directory['/tmp/hadoop-hdfs'] {'owner': 'hdfs', 'create_parents': True, 'cd_access': 'a'}
2017-04-08 12:14:00,825 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties'] {'content': Template('commons-logging.properties.j2'), 'owner': 'hdfs'}
2017-04-08 12:14:00,837 - File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'), 'owner': 'hdfs'}
2017-04-08 12:14:00,845 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties'] {'content': ..., 'owner': 'hdfs', 'group': 'hadoop', 'mode': 0644}
2017-04-08 12:14:00,972 - File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'), 'owner': 'hdfs', 'group': 'hadoop'}
2017-04-08 12:14:00,977 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties'] {'content': StaticFile('task-log4j.properties'), 'mode': 0755}
2017-04-08 12:14:00,982 - File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'hdfs', 'group': 'hadoop'}
2017-04-08 12:14:01,028 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner': 'hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf', 'group': 'hadoop'}
2017-04-08 12:14:01,087 - File['/etc/hadoop/conf/topology_script.py'] {'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode': 0755}
2017-04-08 12:14:02,653 - Stack Feature Version Info: stack_version=2.5, version=2.5.0.0-1245, current_cluster_version=2.5.0.0-1245 -> 2.5.0.0-1245
2017-04-08 12:14:02,657 - Stack version to use is 2.5.0.0
2017-04-08 12:14:02,666 - Detected stack with version 2.5.0.0-1245, will use knox_data_dir = /usr/hdp/2.5.0.0-1245/knox/data
2017-04-08 12:14:02,686 - The hadoop conf dir /usr/hdp/current/hadoop-client/conf exists, will call conf-select on it for version 2.5.0.0-1245
2017-04-08 12:14:02,689 - Checking if need to create versioned conf dir /etc/hadoop/2.5.0.0-1245/0
2017-04-08 12:14:02,693 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2017-04-08 12:14:02,948 - call returned (1, '/etc/hadoop/2.5.0.0-1245/0 exist already', '')
2017-04-08 12:14:02,949 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
2017-04-08 12:14:03,257 - checked_call returned (0, '')
2017-04-08 12:14:03,263 - Ensuring that hadoop has the correct symlink structure
2017-04-08 12:14:03,264 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2017-04-08 12:14:03,306 - Directory['/usr/hdp/current/knox-server/data/'] {'group': 'knox', 'cd_access': 'a', 'create_parents': True, 'mode': 0755, 'owner': 'knox', 'recursive_ownership': True}
2017-04-08 12:14:04,531 - Directory['/var/log/knox'] {'group': 'knox', 'cd_access': 'a', 'create_parents': True, 'mode': 0755, 'owner': 'knox', 'recursive_ownership': True}
2017-04-08 12:14:04,558 - Directory['/var/run/knox'] {'group': 'knox', 'cd_access': 'a', 'create_parents': True, 'mode': 0755, 'owner': 'knox', 'recursive_ownership': True}
2017-04-08 12:14:04,567 - Directory['/usr/hdp/current/knox-server/conf'] {'group': 'knox', 'cd_access': 'a', 'create_parents': True, 'mode': 0755, 'owner': 'knox', 'recursive_ownership': True}
2017-04-08 12:14:04,612 - Directory['/usr/hdp/current/knox-server/conf/topologies'] {'group': 'knox', 'cd_access': 'a', 'create_parents': True, 'recursive_ownership': True, 'owner': 'knox', 'mode': 0755}
2017-04-08 12:14:04,615 - XmlConfig['gateway-site.xml'] {'owner': 'knox', 'group': 'knox', 'conf_dir': '/usr/hdp/current/knox-server/conf', 'configuration_attributes': {}, 'configurations': ...}
2017-04-08 12:14:04,742 - Generating config: /usr/hdp/current/knox-server/conf/gateway-site.xml
2017-04-08 12:14:04,744 - File['/usr/hdp/current/knox-server/conf/gateway-site.xml'] {'owner': 'knox', 'content': InlineTemplate(...), 'group': 'knox', 'mode': None, 'encoding': 'UTF-8'}
2017-04-08 12:14:04,823 - File['/usr/hdp/current/knox-server/conf/gateway-log4j.properties'] {'content': ..., 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2017-04-08 12:14:04,865 - File['/usr/hdp/current/knox-server/conf/topologies/default.xml'] {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2017-04-08 12:14:04,869 - Writing File['/usr/hdp/current/knox-server/conf/topologies/default.xml'] because it doesn't exist
2017-04-08 12:14:04,871 - Changing owner for /usr/hdp/current/knox-server/conf/topologies/default.xml from 0 to knox
2017-04-08 12:14:04,872 - Changing group for /usr/hdp/current/knox-server/conf/topologies/default.xml from 0 to knox
2017-04-08 12:14:04,889 - File['/usr/hdp/current/knox-server/conf/topologies/admin.xml'] {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2017-04-08 12:14:04,895 - Writing File['/usr/hdp/current/knox-server/conf/topologies/admin.xml'] because it doesn't exist
2017-04-08 12:14:04,896 - Changing owner for /usr/hdp/current/knox-server/conf/topologies/admin.xml from 0 to knox
2017-04-08 12:14:04,897 - Changing group for /usr/hdp/current/knox-server/conf/topologies/admin.xml from 0 to knox
2017-04-08 12:14:04,908 - File['/usr/hdp/current/knox-server/conf/topologies/knoxsso.xml'] {'content': InlineTemplate(...), 'owner': 'knox', 'group': 'knox'}
2017-04-08 12:14:04,910 - Writing File['/usr/hdp/current/knox-server/conf/topologies/knoxsso.xml'] because it doesn't exist
2017-04-08 12:14:04,911 - Changing owner for /usr/hdp/current/knox-server/conf/topologies/knoxsso.xml from 0 to knox
2017-04-08 12:14:04,912 - Changing group for /usr/hdp/current/knox-server/conf/topologies/knoxsso.xml from 0 to knox
2017-04-08 12:14:04,914 - Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]'] {'environment': {'JAVA_HOME': '/usr/lib/jvm/java'}, 'not_if': "ambari-sudo.sh su knox -l -s /bin/bash -c 'test -f /usr/hdp/current/knox-server/data/security/master'", 'user': 'knox'}
2017-04-08 12:14:05,093 - Skipping Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-master --master [PROTECTED]'] due to not_if
2017-04-08 12:14:05,095 - Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-cert --hostname sandbox.hortonworks.com'] {'environment': {'JAVA_HOME': '/usr/lib/jvm/java'}, 'not_if': "ambari-sudo.sh su knox -l -s /bin/bash -c 'test -f /usr/hdp/current/knox-server/data/security/keystores/gateway.jks'", 'user': 'knox'}
2017-04-08 12:14:05,293 - Skipping Execute['/usr/hdp/current/knox-server/bin/knoxcli.sh create-cert --hostname sandbox.hortonworks.com'] due to not_if
2017-04-08 12:14:05,300 - File['/usr/hdp/current/knox-server/conf/ldap-log4j.properties'] {'content': ..., 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2017-04-08 12:14:05,303 - File['/usr/hdp/current/knox-server/conf/users.ldif'] {'content': ..., 'owner': 'knox', 'group': 'knox', 'mode': 0644}
2017-04-08 12:14:05,308 - Knox: Setup ranger: command retry not enabled thus skipping if ranger admin is down !
2017-04-08 12:14:05,312 - HdfsResource['/ranger/audit'] {'security_enabled': False, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': [EMPTY], 'default_fs': 'hdfs://sandbox.hortonworks.com:8020', 'user': 'hdfs', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore', 'hdfs_site': ..., 'kinit_path_local': 'kinit', 'principal_name': [EMPTY], 'recursive_chmod': True, 'owner': 'hdfs', 'group': 'hdfs', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf', 'type': 'directory', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse', u'/apps/falcon', u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0755}
2017-04-08 12:14:05,332 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L -w '"'"'%{http_code}'"'"' -X GET '"'"'http://sandbox.hortonworks.com:50070/webhdfs/v1/ranger/audit?op=GETFILESTATUS&user.name=hdfs'"'"' 1>/tmp/tmprY3HjG 2>/tmp/tmpK6n4W3''] {'logoutput': None, 'quiet': False}
2017-04-08 12:14:05,560 - call returned (7, '')

Command failed after 1 tries

Re: Can't start Knox Gateway when restarting sandbox

Guru

Hello @Lucky_Luke,

From the error, it looks like the Knox service is trying to access the Ranger audit directory on HDFS (/ranger/audit) during startup and failing there. This could be because either Ranger or HDFS is not available when Knox is being started. Since the error says "couldn't connect to host", I'm inclined to think HDFS (the NameNode's WebHDFS endpoint on port 50070) is not responding.
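For reference, the "(7)" in the curl error is curl's exit code 7, which specifically means the TCP connection itself could not be established, i.e. nothing was listening at sandbox.hortonworks.com:50070 when the script ran. A quick local illustration (port 1 on localhost is just a stand-in for an unreachable endpoint, not anything from your sandbox):

```shell
# curl exits with code 7 ("couldn't connect to host") when the TCP connection
# fails outright; localhost port 1 serves as a stand-in unreachable endpoint.
curl -sS --max-time 5 http://127.0.0.1:1/ >/dev/null 2>&1
echo "curl exit code: $?"   # 7 = couldn't connect, same code as in the Ambari error
```

Exit code 7 is different from an HTTP error: it means the request never reached a server at all, which is why the `%{http_code}` in your output is 000.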

Can you please make sure that both Ranger and HDFS are up and running before starting Knox?
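One way to verify HDFS is actually reachable before starting Knox is to hit the same WebHDFS endpoint the failing Ambari step calls. A minimal sketch (the `check_webhdfs` helper name is mine; the hostname and port 50070 are the sandbox defaults taken from the log above):

```shell
# Sketch: probe the NameNode's WebHDFS endpoint (the same kind of call the
# failing Ambari step makes) and report whether it answers with HTTP 200.
check_webhdfs() {
  local host_port="$1"
  local code
  # GETFILESTATUS on / is a cheap read-only call; 200 means the NameNode is up.
  code=$(curl -sS -o /dev/null -w '%{http_code}' --max-time 5 \
    "http://${host_port}/webhdfs/v1/?op=GETFILESTATUS&user.name=hdfs" 2>/dev/null)
  [ "$code" = "200" ]
}

if check_webhdfs "sandbox.hortonworks.com:50070"; then
  echo "WebHDFS is up - safe to start Knox"
else
  echo "WebHDFS is NOT reachable - start/check HDFS first"
fi
```

If the probe fails, start HDFS (and Ranger) from Ambari first and wait until the NameNode is out of safe mode, then retry the Knox start.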

For what it's worth, you can also try disabling Knox's Ranger audit for the time being: go to Ambari > Knox > Advanced > ranger-knox-audit... and disable both audit to HDFS and audit to Solr. Save and restart Knox once, then try restarting your sandbox. This will help isolate the issue.

Hope this helps!