Ambari error when restarting flume agent

Explorer

I get an error when restarting the Flume agent on one of two HDP nodes:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.0.6/hooks/before-RESTART/scripts/hook.py", line 20, in <module>
    from resource_management import *
  File "/usr/lib/python2.6/site-packages/resource_management/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/__init__.py", line 25, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/default.py", line 24, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 31, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/__init__.py", line 21, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 133, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 115, in __init__
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 112, in initialize_data 

Exception: Couldn't load 'os_family.json' file

The first HDP node restarts OK. Both nodes are freshly installed and similar. What's wrong?

1 ACCEPTED SOLUTION

Master Mentor

@Kirill Elsukov

I hate to say this, but you have to uninstall ambari-agent and reinstall it. I have seen this behavior a few times, and after a lot of troubleshooting I fixed it with the following steps:

ambari-agent stop

yum remove ambari-agent

yum install ambari-agent

ambari-agent start

9 REPLIES

Explorer

Yep, that did help. So in my case the final result is:

  • remove all ambari* from /usr/lib/python2.6/site-packages
  • yum remove ambari-agent
  • yum install ambari-agent
  • ps -ef | grep ambari
  • kill -9 <process_id> for each remaining ambari_agent process
  • ambari-agent start
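
The steps above can be collected into a small script. This is only a sketch, not a tested procedure: `run`, `DRY_RUN`, and `reinstall_agent` are hypothetical names introduced for illustration (they are not part of Ambari), and the `pkill` pattern assumes the leftover agent processes have `ambari_agent` in their command line. By default it only prints the commands, so the list can be reviewed before anything is removed.

```shell
#!/bin/sh
# Sketch of the cleanup steps listed above.
# DRY_RUN and run() are hypothetical helpers, not part of Ambari.
DRY_RUN="${DRY_RUN:-1}"   # default: print the commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_agent() {
    site_packages="${1:-/usr/lib/python2.6/site-packages}"

    # Remove all ambari* entries (directories or symlinks) from site-packages.
    for entry in "$site_packages"/ambari*; do
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        run rm -rf "$entry"
    done

    # Reinstall the agent package.
    run yum remove ambari-agent
    run yum install ambari-agent

    # Kill leftover agent processes (assumed to match "ambari_agent").
    run pkill -9 -f ambari_agent

    # Start the freshly installed agent.
    run ambari-agent start
}
```

Calling `reinstall_agent` with the default `DRY_RUN=1` prints each command prefixed with `+`. Setting `DRY_RUN=0` executes them, which needs root. Because `pkill -f` matches full command lines, the script itself should not be saved under a name containing `ambari_agent`.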

Thank you very much!

Guru

I had the same issue, and these 4 steps resolved it each time. (For me, the Flume agent failed to stop only when I restarted the Flume service after a config change. If I stopped the Flume agent first, then made the config change, and then restarted the Flume service, I did not get this issue.)

Guru

NOTE: I can replicate this issue on the sandbox but not on another cluster I installed. Could anyone confirm whether you are also getting this issue, and whether it is on the sandbox or not?

Explorer

I reinstalled ambari-client, but it didn't help. It didn't even allow me to stop the Flume service:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/FLUME/1.4.0.2.0/package/scripts/flume_handler.py", line 20, in <module>
    import flume_upgrade
  File "/var/lib/ambari-agent/cache/common-services/FLUME/1.4.0.2.0/package/scripts/flume_upgrade.py", line 24, in <module>
    from resource_management.core.logger import Logger
  File "/usr/lib/python2.6/site-packages/resource_management/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/__init__.py", line 25, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/default.py", line 24, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/__init__.py", line 23, in <module>
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 31, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/__init__.py", line 21, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 133, in <module>
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 115, in __init__
  File "/usr/lib/python2.6/site-packages/ambari_commons/os_check.py", line 112, in initialize_data 

Exception: Couldn't load 'os_family.json' file

Master Mentor
@Kirill Elsukov

Try this

cd /usr/lib/python2.6/site-packages/

ln -s /usr/lib/ambari-agent/lib/ambari_commons ambari_commons

ln -s /usr/lib/ambari-agent/lib/resource_management resource_management

ln -s /usr/lib/ambari-agent/lib/ambari_jinja2 ambari_jinja2

ambari-agent restart
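
Before restarting, it may be worth checking that each of the three directories actually resolves and that `os_family.json` can be found under `ambari_commons`, since a missing file there would be consistent with the exception in the traceback. The following is a minimal sketch: `check_ambari_libs` is a hypothetical helper name, and because the exact location of `os_family.json` inside the package is not stated in this thread, the function simply searches for it.

```shell
#!/bin/sh
# Hypothetical helper: verify that the three library directories resolve
# and that os_family.json exists somewhere under ambari_commons.
check_ambari_libs() {
    site_packages="${1:-/usr/lib/python2.6/site-packages}"
    status=0

    for name in ambari_commons resource_management ambari_jinja2; do
        # -d follows symlinks, so a dangling link fails this test.
        if [ ! -d "$site_packages/$name" ]; then
            echo "MISSING: $site_packages/$name"
            status=1
        fi
    done

    # -L makes find follow the ambari_commons symlink itself.
    if [ -z "$(find -L "$site_packages/ambari_commons" -name os_family.json 2>/dev/null)" ]; then
        echo "MISSING: os_family.json under $site_packages/ambari_commons"
        status=1
    fi

    return "$status"
}
```

`check_ambari_libs` returns 0 when everything is found and prints one `MISSING:` line per problem otherwise. A different site-packages directory can be passed as the first argument.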

Explorer

That didn't help; I still get the errors.

I've found a solution in another similar issue:

  • remove all ambari* from /usr/lib/python2.6/site-packages, then
  • yum remove ambari-agent
  • yum install ambari-agent

That did help, but now I can't start ambari-agent because of the following error:

"Failed to start ping port listener of: [Errno 98] Address already in use"

Master Mentor
@Kirill Elsukov

That's what I shared in my initial reply:

"I hate to say this, but you have to uninstall ambari-agent and reinstall it. I have seen this behavior a few times, and after a lot of troubleshooting I fixed it with the following steps:

ambari-agent stop

yum remove ambari-agent

yum install ambari-agent

ambari-agent start"

For the "Address already in use" error, first try restarting the agent:

ambari-agent stop

ambari-agent start

If that does not work, find the process that is still holding the agent's ping port (8670) and kill it:

[root@phdns02 ~]# netstat -anp | grep 8670

tcp 0 0 0.0.0.0:8670 0.0.0.0:* LISTEN 446256/python2

[root@phdns02 ~]# ps -ef | grep 446256

root 175471 170130 0 06:28 pts/1 00:00:00 grep 446256

root 446256 446248 2 Feb14 ? 03:10:42 /usr/bin/python2 /usr/lib/python2.6/site-packages/ambari_agent/main.py start --expected-hostname=phdns02.cloud.hortonworks.com

kill -9 <pid>

Here <pid> is the process ID reported by netstat (446256 in the example above).
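
The lookup above can also be scripted. This is a minimal sketch, assuming Linux `netstat -anp` output in the column layout shown above (local address in column 4, state in column 6, `PID/program` in column 7); `listener_pid` is a hypothetical helper name, and 8670 is the port from this example.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -anp` output on stdin and print the PID
# of every process listening on the given TCP port (default 8670).
listener_pid() {
    port="${1:-8670}"
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, parts, "/")   # column 7 looks like "446256/python2"
            print parts[1]
        }'
}

# Usage (as root, so that netstat can show PIDs):
#   pid=$(netstat -anp | listener_pid 8670)
#   [ -n "$pid" ] && kill -9 $pid
```

Matching on `:8670` at the end of the local address avoids the false hits that a plain `grep 8670` can produce, such as a different port like 18670 or a PID that happens to contain those digits.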

Explorer

It looks like Ambari Metrics crashed after I reinstalled ambari-agent. I receive an error when trying to start it:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 58, in <module>
    AmsMonitor().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 37, in start
    self.configure(env) # for security
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/metrics_monitor.py", line 32, in configure
    import params
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/params.py", line 26, in <module>
    import status_params
  File "/var/lib/ambari-agent/cache/common-services/AMBARI_METRICS/0.1.0/package/scripts/status_params.py", line 37, in <module>
    kinit_path_local = functions.get_kinit_path(default('/configurations/kerberos-env/executable_search_paths', None))
NameError: name 'functions' is not defined