Created 11-24-2017 12:58 PM
I was trying to upgrade my HDP cluster from 2.3.4 to 2.6, but I had to roll it back because the services didn't come up after 98% of the upgrade. Please note we are using SLES 11 Linux.
# rpm -qa |grep -i ambari-server
ambari-server-2.6.0.0-267
# rpm -qa |grep -i python-2.6
python-2.6.9-0.33.1
Now I am getting the error below when starting any service after the rollback to 2.3.4.
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 35, in <module>
BeforeAnyHook().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 367, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 26, in hook
import params
File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/params.py", line 237, in <module>
user_to_groups_dict[tez_user] = [proxyuser_group]
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'tez-env' was not found in configurations dictionary!
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-16592.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-16592.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-START/scripts/hook.py', 'START', '/var/lib/ambari-agent/data/command-16592.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-START', '/var/lib/ambari-agent/data/structured-out-16592.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Any help is highly appreciated, as it is urgent to fix this.
Created 11-24-2017 01:16 PM
The startup operation is failing because the Ambari server is somehow unable to find the "tez-env" configuration in its DB. This can happen due to a DB inconsistency.
Fail: Configuration parameter 'tez-env' was not found in configurations dictionary!
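To see which config types the agent was actually handed, one option is to inspect the command file named in the error. This is only a minimal sketch: it assumes the command JSON (for example /var/lib/ambari-agent/data/command-16592.json) keeps its config types under a top-level "configurations" key, which is what the "configurations dictionary" wording in the traceback suggests. The helper names are made up for illustration.

```python
import json


def missing_config_types(command, wanted):
    """Return the config types in `wanted` that the command JSON lacks."""
    # Assumption: config types live under a top-level "configurations" key.
    present = command.get("configurations", {})
    return [name for name in wanted if name not in present]


def load_command(path):
    """Load an agent command file, e.g. the command-16592.json from the error."""
    with open(path) as handle:
        return json.load(handle)


# Small inline sample standing in for a real command file.
sample = {"configurations": {"cluster-env": {}, "hadoop-env": {}}}
print(missing_config_types(sample, ["tez-env", "cluster-env"]))
```

On the affected host you could pass the result of load_command(...) instead of the sample; if "tez-env" is reported as missing, the config never reached the agent.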
Since "tez-env" is a very simple file that we usually do not modify much, it is easy to populate it with some default settings first and then change them later based on your requirements.
So please try the following:
Step1). Create a temporary file "/tmp/tez-env_payload.json" on your ambari-server host with the following content:
{ "properties": { "content": "\n## Tez specific configuration\nexport TEZ_CONF_DIR={{config_dir}}\n\n# Set HADOOP_HOME to point to a specific hadoop install directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# The java implementation to use.\nexport JAVA_HOME={{java64_home}}", "tez_user": "tez" } }
(OR) for an older HDP release you can use the following content for this file:
{ "properties" : { "content" : "\n## Tez specific configuration\nexport TEZ_CONF_DIR={{config_dir}}\n\n# Set HADOOP_HOME to point to a specific hadoop install directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# The java implementation to use.\nexport JAVA_HOME={{java64_home}}", "enable_heap_dump" : "false", "heap_dump_location" : "/tmp", "tez_user" : "tez" } }
Step2). Now run the following "configs.py" command to push the above default config to Ambari:
# /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --port=8080 --action=set --host=localhost --cluster=BlueprintCluster --config-type=tez-env --file=/tmp/tez-env_payload.json
Please make sure to replace the following:
localhost => replace with your Ambari server hostname
BlueprintCluster => replace with your cluster name
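Before pushing the file, it may be worth checking that it parses as JSON and has the expected shape, since a missing brace is easy to introduce when copying from a forum post. The sketch below assumes that the script reads a top-level "properties" object from the file; the function name is hypothetical.

```python
import json


def payload_problems(text):
    """Return a list of problems with a payload; an empty list means none found."""
    try:
        doc = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(doc, dict):
        return ["top level is not a JSON object"]
    # Assumption: the payload must carry a top-level "properties" object.
    if not isinstance(doc.get("properties"), dict):
        return ['no "properties" object at the top level']
    return []


good = '{ "properties": { "tez_user": "tez" } }'
bad = '"properties": { "tez_user": "tez" } }'  # opening brace lost in copy/paste
print(payload_problems(good))
print(payload_problems(bad))
```

To check the real file, read /tmp/tez-env_payload.json and pass its text to payload_problems.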
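After the command returns, you can check whether Ambari now lists "tez-env" among the cluster's desired configs. A rough sketch, assuming the Ambari REST API is reachable on the same host and port used above and that the response nests the types under Clusters/desired_configs; the host, cluster name and admin/admin credentials are the placeholders from the command, and the helper names are illustrative only.

```python
import base64
import json

try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def desired_configs_url(host, port, cluster):
    """Build the URL that asks Ambari for the cluster's desired configs."""
    return "http://%s:%s/api/v1/clusters/%s?fields=Clusters/desired_configs" % (
        host, port, cluster)


def has_config_type(response, config_type):
    """True if the parsed response lists `config_type` as a desired config."""
    desired = response.get("Clusters", {}).get("desired_configs", {})
    return config_type in desired


def fetch(url, user, password):
    """GET the URL with HTTP basic auth and return the parsed JSON body."""
    raw = ("%s:%s" % (user, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = Request(url, headers={"Authorization": "Basic " + token})
    return json.loads(urlopen(request).read().decode("utf-8"))


# Inline sample response so the sketch runs without a live server.
sample = {"Clusters": {"desired_configs": {"tez-env": {"tag": "version1"}}}}
print(desired_configs_url("localhost", 8080, "BlueprintCluster"))
print(has_config_type(sample, "tez-env"))
```

Against a live server, something like has_config_type(fetch(desired_configs_url("localhost", 8080, "BlueprintCluster"), "admin", "admin"), "tez-env") should return True once the config has been stored.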
Created 11-28-2017 12:49 PM
Please see the error below. I am not able to proceed with the upgrade, and the downgrade has not completed either. I am stuck and cannot proceed with anything now, please help.
A previous upgrade did not complete. Reason: There is an existing downgrade from 2.6.0.3-8 which has not completed. This downgrade must be completed before a new upgrade or downgrade can begin. Failed on: <cluster_name>
Created 11-24-2017 02:36 PM
I tried as per your inputs, but it didn't work, even though the command's exit status (echo $?) was 0.
I created the file and ran configs.py on the ambari-server host itself, but it is still giving "tez-env was not found".
Could you also please suggest whether anything related to Python needs an update on SLES 11? We are using python-2.6.9-0.33.1. I am asking because a few errors happened earlier as well, for which I had to comment out some lines of get_lzo_packages.py to make that script work in Ambari; otherwise it was throwing an error.