HDP 2.3.4 services not starting up after rollback of failed upgrade to 2.6.0

Explorer

I was trying to upgrade my HDP cluster from 2.3.4 to 2.6, but I had to roll it back because the services didn't come up after 98% of the upgrade. Please note we are using SLES 11 Linux.

# rpm -qa |grep -i ambari-server

ambari-server-2.6.0.0-267

# rpm -qa |grep -i python-2.6

python-2.6.9-0.33.1

Now I am getting the below error for all services startup after the rollback to 2.3.4.

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 35, in <module>
    BeforeAnyHook().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 367, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py", line 26, in hook
    import params
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/params.py", line 237, in <module>
    user_to_groups_dict[tez_user] = [proxyuser_group]
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/config_dictionary.py", line 73, in __getattr__
    raise Fail("Configuration parameter '" + self.name + "' was not found in configurations dictionary!")
resource_management.core.exceptions.Fail: Configuration parameter 'tez-env' was not found in configurations dictionary!
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-16592.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-16592.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-START/scripts/hook.py', 'START', '/var/lib/ambari-agent/data/command-16592.json', '/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-START', '/var/lib/ambari-agent/data/structured-out-16592.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1', '']

Any help is highly appreciated, as it is urgent to fix this.

3 REPLIES

Super Mentor

@Mridul Mishra


The startup operation is failing because the Ambari server cannot find the "tez-env" configuration in its DB. This can happen due to a DB inconsistency.

Fail: Configuration parameter 'tez-env' was not found in configurations dictionary! 
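To confirm which config types are actually missing, one option is to inspect the command file the agent passed to the hook. The following is a minimal Python 3 sketch, assuming the command JSON keeps its configs under a top-level "configurations" key (which is what the error message suggests); the function names are made up for illustration.

```python
import json


def load_command(path):
    """Load an agent command file such as command-16592.json."""
    with open(path) as handle:
        return json.load(handle)


def missing_config_types(command, required):
    """Return the required config types absent from the command's configurations."""
    configurations = command.get("configurations", {})
    return [name for name in required if name not in configurations]


# In-memory example; on an agent host, load_command() would supply the dict.
sample_command = {"configurations": {"cluster-env": {}, "hadoop-env": {}}}
print(missing_config_types(sample_command, ["cluster-env", "tez-env"]))
# prints ['tez-env']
```

On an agent host this could be pointed at /var/lib/ambari-agent/data/command-16592.json, the file named in the traceback.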



However, "tez-env" is a very simple config type that is rarely modified, so it is easy to populate it with default settings now and adjust them later as required.

So please try the following:

Step1). Create a temporary file "/tmp/tez-env_payload.json" on your Ambari server host with the following content:

{
  "properties": {
    "content": "\n## Tez specific configuration\nexport TEZ_CONF_DIR={{config_dir}}\n\n# Set HADOOP_HOME to point to a specific hadoop install directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# The java implementation to use.\nexport JAVA_HOME={{java64_home}}",
    "tez_user": "tez"
  }
}

(OR) for an older HDP release, you can use the following content for this file:

{
  "properties": {
    "content": "\n## Tez specific configuration\nexport TEZ_CONF_DIR={{config_dir}}\n\n# Set HADOOP_HOME to point to a specific hadoop install directory\nexport HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}}\n\n# The java implementation to use.\nexport JAVA_HOME={{java64_home}}",
    "enable_heap_dump": "false",
    "heap_dump_location": "/tmp",
    "tez_user": "tez"
  }
}
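Before pushing the file, it may be worth checking that it parses as a single JSON object with a top-level "properties" key, since a missing outer brace is easy to overlook. This is a rough Python 3 sketch under that assumption; check_payload is a hypothetical helper, not part of Ambari.

```python
import json


def check_payload(text):
    """Return a list of problems found in a payload string (empty if none)."""
    try:
        payload = json.loads(text)
    except ValueError as error:
        return ["not valid JSON: %s" % error]
    if not isinstance(payload, dict):
        return ["top level must be a JSON object"]
    if not isinstance(payload.get("properties"), dict):
        return ['missing "properties" object']
    return []


good = '{"properties": {"tez_user": "tez"}}'
bad = '"properties": {"tez_user": "tez"}'  # outer braces missing

print(check_payload(good))  # prints []
print(check_payload(bad))   # one "not valid JSON" problem
```

To check the real file, read it first, for example check_payload(open("/tmp/tez-env_payload.json").read()).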


Step2). Now run the following "configs.py" command to push the above default config to Ambari.

# /var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --port=8080 --action=set --host=localhost --cluster=BlueprintCluster --config-type=tez-env --file=/tmp/tez-env_payload.json


Please make sure to replace the following:

localhost => your Ambari server hostname

BlueprintCluster => your cluster name
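After the push, one way to check whether Ambari now lists "tez-env" among the cluster's desired configs is to query the REST API. The sketch below is Python 3 and is only an illustration: it assumes plain HTTP on the port used above, basic authentication, and a response shaped like {"Clusters": {"desired_configs": {...}}}; the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def desired_configs_url(host, port, cluster):
    """Build the URL that asks Ambari for a cluster's desired configs."""
    return "http://%s:%d/api/v1/clusters/%s?fields=Clusters/desired_configs" % (
        host, port, cluster)


def fetch_desired_configs(host, port, cluster, user, password):
    """Return the desired_configs mapping (config type -> tag info)."""
    request = urllib.request.Request(desired_configs_url(host, port, cluster))
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("Clusters", {}).get("desired_configs", {})


def has_config_type(desired_configs, config_type):
    """True if the given config type appears in the desired configs."""
    return config_type in desired_configs
```

For example, has_config_type(fetch_desired_configs("localhost", 8080, "BlueprintCluster", "admin", "admin"), "tez-env") should be true once the config has been stored, with the same host and cluster substitutions as above.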


Explorer
@Jay Kumar SenSharma

Please see the error below. I am not able to proceed with the upgrade, and the cluster is also not completely downgraded. I am stuck and cannot proceed with anything now; please help.

A previous upgrade did not complete. Reason: There is an existing downgrade from 2.6.0.3-8 which has not completed. This downgrade must be completed before a new upgrade or downgrade can begin. Failed on: <cluster_name>

Explorer

@Jay Kumar SenSharma

I tried your suggestion, but it didn't work, even though the command's exit status (echo $?) was 0.

I created the file and ran configs.py on the Ambari server host itself, but it still reports "tez-env was not found".

Could you also please suggest whether anything related to Python needs to be updated on SLES 11? We are using python-2.6.9-0.33.1. I am asking because a few errors occurred earlier as well, for which I had to comment out some lines of get_lzo_packages.py to make that script work in Ambari; otherwise it threw an error.