Created 09-06-2016 05:44 PM
Installing HDP-2.5.0 Clients using Ambari 2.4.0.1 fails with Warnings/Errors regarding 'Too many levels of symbolic links' when trying to populate /usr/hdp/current/[appName]/conf with configs.
This error is thrown when a directory is symlinked to a second directory that is in turn symlinked back to the first. For example, /usr/hdp/current/[appName]/conf is symlinked to /etc/[appName]/conf, but /etc/[appName]/conf is also symlinked back to /usr/hdp/current/[appName]/conf.
# hadoop-clients example:
[b84cb1ae teal:hadoop-clients ~] # ll /usr/hdp/current/hadoop-client/conf
lrwxrwxrwx. 1 root root 16 Sep 6 05:31 /usr/hdp/current/hadoop-client/conf -> /etc/hadoop/conf
[b84cb1ae teal:hadoop-clients ~] # ll /etc/hadoop/conf
lrwxrwxrwx. 1 root root 35 Sep 6 05:34 /etc/hadoop/conf -> /usr/hdp/current/hadoop-client/conf
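A loop like the one in the listing above can be detected by following the links one hop at a time and stopping when a path repeats. The following is a minimal sketch, not part of Ambari: the function name find_symlink_loop and the stand-in directory names are made up for illustration, and it only follows the final path component, which is enough for a conf -> conf loop.

```python
import os
import tempfile


def find_symlink_loop(path):
    """Follow `path` one symlink hop at a time.

    Returns the chain of paths visited (ending where it first repeats) if the
    links form a loop, or None if the chain ends at something that is not a
    symlink. Only the final path component is followed.
    """
    seen = []
    current = os.path.abspath(path)
    while current not in seen:
        if not os.path.islink(current):
            return None
        seen.append(current)
        target = os.readlink(current)
        # A relative target is resolved against the link's own directory.
        current = os.path.normpath(os.path.join(os.path.dirname(current), target))
    return seen + [current]


def demo():
    # Stand-in directories in a temp dir; these are not the real HDP paths.
    with tempfile.TemporaryDirectory() as tmp:
        current_conf = os.path.join(tmp, "current_conf")
        etc_conf = os.path.join(tmp, "etc_conf")
        os.symlink(etc_conf, current_conf)   # like /usr/hdp/current/.../conf -> /etc/.../conf
        os.symlink(current_conf, etc_conf)   # like /etc/.../conf -> /usr/hdp/current/.../conf
        return [os.path.basename(p) for p in find_symlink_loop(current_conf)]


if __name__ == "__main__":
    print(" -> ".join(demo()))
```

On an affected host the same function could be pointed at /usr/hdp/current/hadoop-client/conf; a non-None result would confirm the circular reference.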
# error thrown in /var/log/ambari-agent/ambari-agent.log:
INFO 2016-09-06 05:43:17,682 ActionQueue.py:104 - Adding STATUS_COMMAND for component HDFS_CLIENT of service HDFS of cluster hart to the queue.
INFO 2016-09-06 05:43:17,690 ActionQueue.py:104 - Adding STATUS_COMMAND for component YARN_CLIENT of service YARN of cluster hart to the queue.
INFO 2016-09-06 05:43:17,699 ActionQueue.py:104 - Adding STATUS_COMMAND for component MAPREDUCE2_CLIENT of service MAPREDUCE2 of cluster hart to the queue.
INFO 2016-09-06 05:43:17,707 ActionQueue.py:104 - Adding STATUS_COMMAND for component TEZ_CLIENT of service TEZ of cluster hart to the queue.
INFO 2016-09-06 05:43:17,715 ActionQueue.py:104 - Adding STATUS_COMMAND for component HCAT of service HIVE of cluster hart to the queue.
INFO 2016-09-06 05:43:17,723 ActionQueue.py:104 - Adding STATUS_COMMAND for component HIVE_CLIENT of service HIVE of cluster hart to the queue.
INFO 2016-09-06 05:43:17,732 ActionQueue.py:104 - Adding STATUS_COMMAND for component PIG of service PIG of cluster hart to the queue.
INFO 2016-09-06 05:43:17,740 ActionQueue.py:104 - Adding STATUS_COMMAND for component SQOOP of service SQOOP of cluster hart to the queue.
INFO 2016-09-06 05:43:17,748 ActionQueue.py:104 - Adding STATUS_COMMAND for component ZOOKEEPER_CLIENT of service ZOOKEEPER of cluster hart to the queue.
INFO 2016-09-06 05:43:17,756 ActionQueue.py:104 - Adding STATUS_COMMAND for component SPARK_CLIENT of service SPARK of cluster hart to the queue.
INFO 2016-09-06 05:43:17,766 ActionQueue.py:104 - Adding STATUS_COMMAND for component SLIDER of service SLIDER of cluster hart to the queue.
INFO 2016-09-06 05:43:17,774 ActionQueue.py:104 - Adding STATUS_COMMAND for component METRICS_MONITOR of service AMBARI_METRICS of cluster hart to the queue.
INFO 2016-09-06 05:43:17,782 ActionQueue.py:104 - Adding STATUS_COMMAND for component HST_AGENT of service SMARTSENSE of cluster hart to the queue.
INFO 2016-09-06 05:43:19,232 PythonReflectiveExecutor.py:65 - Reflective command failed with exception:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/ambari_agent/PythonReflectiveExecutor.py", line 57, in run_file
    imp.load_source('__main__', script)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_client.py", line 123, in <module>
    HdfsClient().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/hdfs_client.py", line 84, in security_status
    {'core-site.xml': FILE_TYPE_XML})
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/security_commons.py", line 129, in get_params_from_filesystem
    configuration = ET.parse(conf_dir + os.sep + config_file)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1182, in parse
    tree.parse(source, parser)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 647, in parse
    source = open(source, "rb")
IOError: [Errno 40] Too many levels of symbolic links: u'/usr/hdp/current/hadoop-client/conf/core-site.xml'
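The final IOError in the traceback is the operating system's ELOOP error, which is errno 40 on Linux. It can be reproduced outside Ambari with two symlinks that point at each other. This is a small sketch under assumptions: the directory names are stand-ins in a temp dir, the helper name open_through_loop is invented here, and it uses Python 3, where IOError is an alias of OSError (the traceback above comes from Python 2).

```python
import errno
import os
import tempfile


def open_through_loop():
    """Open a file beneath a circular symlink and return the errno raised."""
    with tempfile.TemporaryDirectory() as tmp:
        current_conf = os.path.join(tmp, "current_conf")
        etc_conf = os.path.join(tmp, "etc_conf")
        os.symlink(etc_conf, current_conf)
        os.symlink(current_conf, etc_conf)
        try:
            # Same kind of call that fails at the bottom of the traceback.
            with open(os.path.join(current_conf, "core-site.xml"), "rb"):
                pass
        except OSError as exc:
            return exc.errno
    return None


if __name__ == "__main__":
    code = open_through_loop()
    print(code == errno.ELOOP, os.strerror(code))
```

The kernel gives up while resolving the path, so the error appears no matter which file under the conf directory is requested.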
This happens on brand-new HDP-2.5.0 clusters and is repeatable. Failures occur on nodes that run only clients, with no additional applications. I can go through and resolve these issues manually, but with the number of applications I am installing and the number of client-only nodes I am creating, manual intervention is not practical.
Is this a known issue? Possibly a directory creation race condition?
Created 09-06-2016 06:03 PM
The obvious issue is the circular symlink references. Have you created symlinks prior to running the installer?
Created 09-06-2016 06:07 PM
This seems like a bug, perhaps caused by client-only hosts.
/etc/<component>/conf -> /usr/hdp/current/hadoop-client/conf is correct.
What should have happened is that conf-select should have changed /usr/hdp/current/hadoop-client/conf to point to something like /usr/hdp/2.5.0.0-1234/hadoop/conf/0.
I'm guessing that the conf-select step failed. If you could post the entire output from your client install command, that can help us determine why it failed.
Created 09-06-2016 06:14 PM
@emaxwell - Nope, I haven't tampered with any of the HDP directories prior to, or post, installation.
@Jonathan Hurley - Will do, I destroyed the cluster so will need to respin it back up. Takes a bit. Will post back in a few hours. Also is this the best venue for these logs or should I email/post them elsewhere?
Created 09-06-2016 07:09 PM
The message boards here are just fine. You can either copy/paste them in a code block or compress them and upload them directly.
What I'm looking for is something like this as part of the hadoop client install on a host with the problem:
2016-08-31 15:50:29,421 - Checking if need to create versioned conf dir /etc/hadoop/2.4.2.0-236/0
2016-08-31 15:50:29,422 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'dry-run-create', '--package', 'hadoop', '--stack-version', '2.4.2.0-236', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2016-08-31 15:50:29,439 - call returned (0, '/etc/hadoop/2.4.2.0-236/0', '')
2016-08-31 15:50:29,439 - Package hadoop will have new conf directories: /etc/hadoop/2.4.2.0-236/0
2016-08-31 15:50:29,439 - Checking if need to create versioned conf dir /etc/hadoop/2.4.2.0-236/0
2016-08-31 15:50:29,440 - call[('ambari-python-wrap', '/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', '2.4.2.0-236', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2016-08-31 15:50:29,457 - call returned (0, '/etc/hadoop/2.4.2.0-236/0', '')
...
2016-08-31 15:50:29,492 - checked_call[('ambari-python-wrap', '/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', '2.4.2.0-236', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
2016-08-31 15:50:29,509 - checked_call returned (0, '/usr/hdp/2.4.2.0-236/hadoop/conf -> /etc/hadoop/2.4.2.0-236/0')
2016-08-31 15:50:29,510 - Ensuring that hadoop has the correct symlink structure
Created 09-07-2016 05:30 PM
@Jonathan Hurley Sorry for the late reply but based on your lead I was able to figure out the root cause of my issue.
When I pulled the logs you mentioned I found the following:
2016-09-06 20:44:23,410 - Backing up /etc/hadoop/conf to /etc/hadoop/conf.backup if destination doesn't exist already.
2016-09-06 20:44:23,411 - Execute[('cp', '-R', '-p', '/etc/hadoop/conf', '/etc/hadoop/conf.backup')] {'not_if': 'test -e /etc/hadoop/conf.backup', 'sudo': True}
2016-09-06 20:44:23,436 - Checking if need to create versioned conf dir /etc/hadoop/2.5.0.0-1245/0
2016-09-06 20:44:23,438 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'dry-run-create', '--package', 'hadoop', '--stack-version', u'2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2016-09-06 20:44:23,466 - call returned (1, '', "Sorry, user ambari is not allowed to execute '/bin/ambari-python-wrap /usr/bin/conf-select dry-run-create --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' as root on myserver.mydomain.com.")
2016-09-06 20:44:23,466 - Package hadoop will have new conf directories:
2016-09-06 20:44:23,468 - Checking if need to create versioned conf dir /etc/hadoop/2.5.0.0-1245/0
2016-09-06 20:44:23,470 - call[('ambari-python-wrap', u'/usr/bin/conf-select', 'create-conf-dir', '--package', 'hadoop', '--stack-version', u'2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False, 'stderr': -1}
2016-09-06 20:44:23,496 - call returned (1, '', "Sorry, user ambari is not allowed to execute '/bin/ambari-python-wrap /usr/bin/conf-select create-conf-dir --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' as root on myserver.mydomain.com.")
2016-09-06 20:44:23,496 - checked_call[('ambari-python-wrap', u'/usr/bin/conf-select', 'set-conf-dir', '--package', 'hadoop', '--stack-version', u'2.5.0.0-1245', '--conf-version', '0')] {'logoutput': False, 'sudo': True, 'quiet': False}
2016-09-06 20:44:23,524 - Could not select the directory for package hadoop.
Error: Execution of 'ambari-python-wrap /usr/bin/conf-select set-conf-dir --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' returned 1. Sorry, user ambari is not allowed to execute '/bin/ambari-python-wrap /usr/bin/conf-select set-conf-dir --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' as root on myserver.mydomain.com.
2016-09-06 20:44:23,524 - Directory['/etc/hadoop/conf'] {'action': ['delete']}
2016-09-06 20:44:23,563 - Removing directory Directory['/etc/hadoop/conf'] and all its content
2016-09-06 20:44:23,593 - Link['/etc/hadoop/conf'] {'to': '/usr/hdp/current/hadoop-client/conf'}
2016-09-06 20:44:23,862 - Warning: linking to nonexistent location /usr/hdp/current/hadoop-client/conf
2016-09-06 20:44:23,862 - Creating symbolic Link['/etc/hadoop/conf'] to /usr/hdp/current/hadoop-client/conf
In particular, one line that stood out:
2016-09-06 20:44:23,466 - call returned (1, '', "Sorry, user ambari is not allowed to execute '/bin/ambari-python-wrap /usr/bin/conf-select dry-run-create --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' as root on myserver.mydomain.com.")
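Because the installer carries on after these failures, a line like the one above is easy to miss in a long log. One way to surface it is to scan the install output for "call returned" entries with a non-zero exit code. This is a rough sketch: the regular expression is inferred only from the log lines quoted in this thread, the function name failed_calls is invented, and it assumes one log entry per line.

```python
import re

# Matches "<timestamp> - call returned (<code>, ..." and the checked_call variant.
CALL_RESULT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?:checked_)?call returned \((?P<code>\d+), "
)


def failed_calls(log_text):
    """Return (timestamp, exit_code, line) for every call that returned non-zero."""
    failures = []
    for raw in log_text.splitlines():
        line = raw.strip()
        match = CALL_RESULT.match(line)
        if match and int(match.group("code")) != 0:
            failures.append((match.group("ts"), int(match.group("code")), line))
    return failures


# Two lines copied from the logs in this thread: one failure, one success.
SAMPLE = """
2016-09-06 20:44:23,466 - call returned (1, '', "Sorry, user ambari is not allowed to execute '/bin/ambari-python-wrap /usr/bin/conf-select dry-run-create --package hadoop --stack-version 2.5.0.0-1245 --conf-version 0' as root on myserver.mydomain.com.")
2016-08-31 15:50:29,439 - call returned (0, '/etc/hadoop/2.4.2.0-236/0', '')
"""

if __name__ == "__main__":
    for ts, code, line in failed_calls(SAMPLE):
        print(ts, "exit code", code)
```

Note that a failing checked_call is logged differently ("Error: Execution of ... returned 1"), so this pattern would not catch the set-conf-dir failure on its own.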
In my instance of HDP, I chose to run Ambari-Agent under a non-root user account called 'ambari' with sudoers permissions. I based my sudoers file on the documentation found here:
What I hadn't included in the list of approved commands was '/bin/ambari-python-wrap'. After adding that and recreating the cluster, all of my issues were resolved. So if anyone is having similar issues in HDP-2.5.0 and is running Ambari-Agent under a non-root user account, make sure '/bin/ambari-python-wrap' is listed under the commands section of your sudoers file.
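A quick pre-install check along these lines could catch the missing entry before a cluster is built. The sketch below is a plain text search and does not evaluate sudoers rules the way sudo does. The function name command_listed is invented, and the sample sudoers fragment is hypothetical: it is not the list from the Ambari documentation.

```python
import re


def command_listed(sudoers_text, command="/bin/ambari-python-wrap"):
    """Return True if `command` appears as a token on a non-comment sudoers line."""
    for raw in sudoers_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Commands in a sudoers list are separated by commas and whitespace.
        if command in re.split(r"[,\s]+", line):
            return True
    return False


# Hypothetical fragment for illustration only.
SAMPLE_SUDOERS = """
# ambari agent commands
Cmnd_Alias AMBARI_CMDS = /bin/mkdir, /bin/chown, /bin/ambari-python-wrap *
ambari ALL=(ALL) NOPASSWD: AMBARI_CMDS
"""

if __name__ == "__main__":
    print(command_listed(SAMPLE_SUDOERS))
```

For an authoritative answer on a real host, running sudo -l -U ambari as root lists what sudo itself will allow for that user.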
Created 09-07-2016 07:30 PM
Very nice! That's exactly what I was looking for and the cause was spot-on. Perhaps Ambari shouldn't fail silently anymore. conf-select used to have a ton of issues which is why we ignored errors invoking it.