Cloudera Manager Agent is not able to communicate with the HDFS DataNode, Impala Daemon, HBase, and YARN Web Server roles

Contributor

I recently upgraded from CM 7.6.7 to CM 7.11.3 and from CDP 7.1.7 SP2 to CDP 7.1.7 SP3.

The HDFS DataNode, Impala Daemon, YARN Resource Manager, and HBase Region Server roles are showing an unhealthy Web Server status in Cloudera Manager, as shown below.

web-server-error.png

 

After checking one of the agent logs, I found the following errors.

 

 

[18/Nov/2024 09:09:09 +0100] 2414 GM IMPALAD throttling_logger ERROR    Error fetching metrics at 'https://host.domain.com:25000/jsonmetrics?json'
Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 224, in _collect_and_parse_and_return
    opened_url = urlopen_with_retry_on_authentication_errors(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 339, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 244, in _open_url
    return self._urlopen_callout(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 129, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 1244, in http_error_401
    retry = self.http_error_auth_reqed('www-authenticate',
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 1124, in http_error_auth_reqed
    return self.retry_http_digest_auth(req, authreq)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 1138, in retry_http_digest_auth
    resp = self.parent.open(req, timeout=req.timeout)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 388, in http_error_default
    raise e
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 382, in http_error_default
    return old(self, req, fp, code, msg, hdrs)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error
[18/Nov/2024 09:09:09 +0100] 2414 MonitorDaemon-Reporter firehoses    INFO     Creating a connection to the SERVICEMONITOR.
[18/Nov/2024 09:09:09 +0100] 2414 MonitorDaemon-Reporter firehoses    INFO     Creating a connection to the HOSTMONITOR.
[18/Nov/2024 09:09:55 +0100] 2414 MonitorDaemon-Scheduler daemon       WARNING  Monitor slow to respond in readiness check: 45s GenericMonitor HDFS-DATANODE for hdfs-DATANODE-f8021b8043faaa9d9d23bf9965e6ee07
[18/Nov/2024 09:09:55 +0100] 2414 MonitorDaemon-Scheduler daemon       INFO     Monitor expired: ('GenericMonitor HDFS-DATANODE for hdfs-DATANODE-f8021b8043faaa9d9d23bf9965e6ee07',)
[18/Nov/2024 09:09:55 +0100] 2414 GM NODEMANAGER throttling_logger ERROR    Error fetching metrics at 'https://host.domain.com:61006/jmx'
Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 157, in retry_http_kerberos_auth
    neg_hdr = self.generate_request_header(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 111, in generate_request_header
    result = k.authGSSClientStep(self.context, neg_value)
kerberos.GSSError: (('Unspecified GSS failure.  Minor code may provide more information', 851968), ('Cryptosystem internal error', -1765328206))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 224, in _collect_and_parse_and_return
    opened_url = urlopen_with_retry_on_authentication_errors(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 339, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 244, in _open_url
    return self._urlopen_callout(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 129, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 228, in http_error_401
    retry = self.http_error_auth_reqed(host, req, headers)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 149, in http_error_auth_reqed
    return self.retry_http_kerberos_auth(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 174, in retry_http_kerberos_auth
    log.critical("GSSAPI Error: %s/%s" % (e[0][0], e[1][0]))
TypeError: 'GSSError' object is not subscriptable
[18/Nov/2024 09:09:55 +0100] 2414 GM DATANODE throttling_logger ERROR    Error fetching metrics at 'https://host.domain.com:9865/jmx'
Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 157, in retry_http_kerberos_auth
    neg_hdr = self.generate_request_header(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 111, in generate_request_header
    result = k.authGSSClientStep(self.context, neg_value)
kerberos.GSSError: (('Unspecified GSS failure.  Minor code may provide more information', 851968), ('Cryptosystem internal error', -1765328206))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 224, in _collect_and_parse_and_return
    opened_url = urlopen_with_retry_on_authentication_errors(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 339, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 244, in _open_url
    return self._urlopen_callout(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 129, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 228, in http_error_401
    retry = self.http_error_auth_reqed(host, req, headers)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 149, in http_error_auth_reqed
    return self.retry_http_kerberos_auth(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 174, in retry_http_kerberos_auth
    log.critical("GSSAPI Error: %s/%s" % (e[0][0], e[1][0]))
TypeError: 'GSSError' object is not subscriptable
[18/Nov/2024 09:09:55 +0100] 2414 GM REGIONSERVER throttling_logger ERROR    Error fetching metrics at 'https://host.domain.com:61005/jmx'
Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 157, in retry_http_kerberos_auth
    neg_hdr = self.generate_request_header(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 111, in generate_request_header
    result = k.authGSSClientStep(self.context, neg_value)
kerberos.GSSError: (('Unspecified GSS failure.  Minor code may provide more information', 851968), ('Cryptosystem internal error', -1765328206))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 224, in _collect_and_parse_and_return
    opened_url = urlopen_with_retry_on_authentication_errors(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 339, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 244, in _open_url
    return self._urlopen_callout(
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 129, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/data/anaconda/miniconda_3.8/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 228, in http_error_401
    retry = self.http_error_auth_reqed(host, req, headers)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 149, in http_error_auth_reqed
    return self.retry_http_kerberos_auth(req, headers, neg_value)
  File "/data/cloudera/cm-agent/lib/python3.8/site-packages/urllib_kerberos/__init__.py", line 174, in retry_http_kerberos_auth
    log.critical("GSSAPI Error: %s/%s" % (e[0][0], e[1][0]))
TypeError: 'GSSError' object is not subscriptable
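
A note on the second exception in the tracebacks above: the `TypeError: 'GSSError' object is not subscriptable` is raised by the logging line `e[0][0]`, which indexes the exception object directly. Python 3 exception objects cannot be indexed, so the payload has to be read from `e.args` instead. In other words, the TypeError is a secondary failure that sits on top of the underlying GSSError ("Cryptosystem internal error"). The sketch below illustrates this with a stand-in `GSSError` class (an assumption, so it runs without the kerberos module) and a hypothetical helper `describe_gss_error`; it assumes the two-tuple payload layout seen in the log.

```python
class GSSError(Exception):
    """Stand-in for kerberos.GSSError (assumption: lets this run without the kerberos module)."""


def describe_gss_error(exc):
    # Python 3 exceptions are not subscriptable; the payload lives in exc.args.
    # Assumed layout, taken from the log: ((major_msg, major_code), (minor_msg, minor_code)).
    major, minor = exc.args
    return "GSSAPI Error: %s/%s" % (major[0], minor[0])


err = GSSError(
    ("Unspecified GSS failure.  Minor code may provide more information", 851968),
    ("Cryptosystem internal error", -1765328206),
)

try:
    err[0][0]  # what the logging line in the traceback attempts
except TypeError as type_err:
    print("indexing the exception failed:", type_err)

# Reading from .args works and surfaces the underlying GSS failure.
print(describe_gss_error(err))
```

When reading the agent log, the line to focus on is therefore the `kerberos.GSSError` one, not the TypeError that follows it.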

 

 

I have tried everything I could, but with no luck.

First, the hosts are heartbeating.
Second, /etc/krb5.conf appears to be the same as on another, working host (the Hue server host in this case). The Web Server Status issue is the same across HDFS, HBase, YARN, and Impala.
Third, I had tried a manual kinit before, but it still threw the same error.
After running a manual kinit (kinit -k -t hdfs.keytab hdfs/host.my-default-realm.com) from the latest DataNode process directory, I ran klist -e and got the following.

[root@host 1546506889-hdfs-DATANODE]# klist -e
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: HTTP/host@EXAMPLE-REALM.com

Valid starting Expires Service principal
18/11/24 16:59:35 19/11/24 02:59:34 krbtgt/host@EXAMPLE-REALM.COM
renew until 25/11/24 16:59:34, Etype (skey, tkt): arcfour-hmac, aes256-cts-hmac-sha1-96



Below are the Kerberos encryption types configured in the Cloudera Manager console.

sayebogbon_0-1731932582515.png

 

Below is part of the host /etc/krb5.conf content.

[libdefaults]
 renew_lifetime = 604800
 ticket_lifetime = 36000
 udp_preference_limit = 1
 permitted_enctypes = rc4-hmac aes256-cts aes128-cts
 default_tgs_enctypes = rc4-hmac aes256-cts aes128-cts
 default_tkt_enctypes = rc4-hmac aes256-cts aes128-cts
 default_realm = my-default-realm.com
 default_etypes = arcfour-hmac-md5
 default_etypes_des = des-cbc-crc
 allow_weak_crypto = true

 forwardable = true
 default_keytab_name = /etc/opt/quest/vas/host.keytab
[libvas]
 site-name-override = iNET-LDAP
 use-dns-srv = true
 use-tcp-only = true

 auth-helper-timeout = 60
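
As a side check on a configuration like the one above, the enctype settings in [libdefaults] can be listed and compared across hosts with a few lines of Python. This is only a sketch: the function names and the `LEGACY_ENCTYPES` set are assumptions made for illustration (RC4 and single-DES enctypes are deprecated by RFC 8429 and RFC 6649), and nothing in this thread establishes that the enctype list caused the GSS error.

```python
# Illustrative set of legacy enctype names (an assumption of this sketch, not a complete list).
LEGACY_ENCTYPES = {"rc4-hmac", "arcfour-hmac", "arcfour-hmac-md5", "des-cbc-crc", "des-cbc-md5"}


def libdefaults_enctypes(text):
    """Return {setting: [enctypes]} for every *enctypes key in [libdefaults]."""
    section = None
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "libdefaults" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            if key.endswith("enctypes"):
                # Values may be separated by spaces or commas.
                found[key] = value.replace(",", " ").split()
    return found


def legacy_only(found):
    """Keep only the settings that list at least one legacy enctype."""
    flagged = {}
    for key, enctypes in found.items():
        hits = [e for e in enctypes if e.lower() in LEGACY_ENCTYPES]
        if hits:
            flagged[key] = hits
    return flagged


KRB5_EXCERPT = """\
[libdefaults]
 permitted_enctypes = rc4-hmac aes256-cts aes128-cts
 default_tgs_enctypes = rc4-hmac aes256-cts aes128-cts
 default_tkt_enctypes = rc4-hmac aes256-cts aes128-cts
 allow_weak_crypto = true
[libvas]
 use-tcp-only = true
"""

for setting, hits in legacy_only(libdefaults_enctypes(KRB5_EXCERPT)).items():
    print(setting, "lists legacy enctypes:", ", ".join(hits))
```

Running the same function against /etc/krb5.conf on a failing host and on a working host gives a quick way to confirm that the two files really agree on these settings.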


Finally, the OS upgrade has not been performed yet. We're still on RHEL/OL7.

I know you're busy, but any support would be much appreciated.

Thanks,

Stephen

1 ACCEPTED SOLUTION

Contributor

I think the problem was partly due to our Python 3.8 installation, which we had done via Anaconda.

Cloudera recommended that we use yum to install rh-python38 on our RHEL/OL7 hosts, as I mentioned in the previous message. The documentation is here: Installing Python 3.8 standard package on RHEL 7 | CDP Private Cloud. That installation resolved most of the Web Server issues.

The Web Server issue for Impala was related not only to the Python installation but also to the Web Server username and password.
Below are the actions performed to resolve the Impala Web Server issue after enabling hadoop_secure_web_ui.

WORK PERFORMED:

  • Removed the following configurations from the CM UI:
      • Impala > Configuration > Catalog Server > Web Server Username
      • Impala > Configuration > Catalog Server > Server Web Server User Password
      • Impala > Configuration > Impala Daemon > Web Server Username
      • Impala > Configuration > Impala Daemon > Web Server User Password
      • Impala > Configuration > Statestore > Web Server Username
      • Impala > Configuration > Statestore > Web Server User Password

  • Enabled "Enable Kerberos Authentication for HTTP Web-Consoles" under CM UI > Impala > Configuration

  • Restarted the Impala service.

Also, regarding Impala, this Cloudera documentation was quite helpful: Configuring Impala Web UI | CDP Public Cloud

The issue is now resolved after following the instructions in the above documentation.


6 REPLIES 6

Expert Contributor

Hello @sayebogbon ,

Based on the error in the log you shared:

opened_url = urlopen_with_retry_on_authentication_errors

And the klist output showing this:

Valid starting     Expires
10/11/24 23:43:47  11/11/24 09:43:47 

It looks like you need to regenerate the Kerberos credentials for this host.

To do so, please stop all services on this host.

Then go to CM > Administration > Security > Kerberos credentials.

In the search bar, type the hostname and select all the principals that appear, then click the regenerate selected button.

If there are no problems, new credentials should be generated.

Restart your services and let us know if that helps.

Contributor

Apologies, that was the wrong ticket. I should have changed it, and I have updated it now.
Previously, I had regenerated both the keytabs and the Kerberos credentials many times, but with no luck.

Also, after manually obtaining a Kerberos ticket with kinit -k -t /var/run/cloudera-scm-agent/process/1546506889-hdfs-DATANODE/hdfs.keytab HTTP/host@EXAMPLE-REALM.COM, I was able to run curl against the DataNode web URL (https://fqdn:9865) and got a 200 OK response. However, it seems that Cloudera isn't able to detect the credential for some reason.
See the response below.

 

 

[root@host 1546506889-hdfs-DATANODE]# curl -v -k --negotiate -u : https://host.com:9865
* About to connect() to host.com port 9865 (#0)
*   Trying xx.xx.xxx.xx...
* Connected to host.com (xx.xx.xxx.xx) port 9865 (#0)
* Initializing NSS with certpath: sql:/etc/pki/nssdb
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=host.com,OU=Technology,O=xxxx plc,L=xl,ST=xl,C=GB
*       start date: Nov 14 14:41:16 2024 GMT
*       expire date: Nov 09 14:41:16 2025 GMT
*       common name: host.com
*       issuer: CN=host.com,OU=Technology,O=xxxx plc,L=xl,ST=xl,C=GB
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: host.com:9865
> Accept: */*
>
< HTTP/1.1 401 Authentication required
< Connection: close
< Pragma: no-cache
< Strict_Transport_Security: max-age=0; includeSubDomains
< X-Content-Type-Options: nosniff
< X-FRAME-OPTIONS: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Pragma: no-cache
< Strict_Transport_Security: max-age=0; includeSubDomains
< X-Content-Type-Options: nosniff
< X-FRAME-OPTIONS: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< WWW-Authenticate: Negotiate
< Set-Cookie: hadoop.auth=; Path=/; HttpOnly
< Cache-Control: must-revalidate,no-cache,no-store
< Content-Type: text/html;charset=iso-8859-1
< Content-Length: 447
<
* Closing connection 0
* Issue another request to this URL: 'https://host.com:9865/'
* About to connect() to host.com port 9865 (#1)
*   Trying xx.xx.xxx.xx...
* Connected to host.com (xx.xx.xxx.xx) port 9865 (#1)
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* skipping SSL peer certificate verification
* SSL connection using TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
* Server certificate:
*       subject: CN=host.com,OU=Technology,O=xxxx plc,L=xl,ST=xl,C=GB
*       start date: Nov 14 14:41:16 2024 GMT
*       expire date: Nov 09 14:41:16 2025 GMT
*       common name: host.com
*       issuer: CN=host.com,OU=Technology,O=xxxx plc,L=xl,ST=xl,C=GB
* Server auth using GSS-Negotiate with user ''
> GET / HTTP/1.1
> Authorization: Negotiate xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxg==
> User-Agent: curl/7.29.0
> Host: host.com:9865
> Accept: */*
>
< HTTP/1.1 200 OK
< Connection: close
< Date: Mon, 18 Nov 2024 17:07:11 GMT
< Cache-Control: no-cache
< Expires: Mon, 18 Nov 2024 17:07:11 GMT
< Date: Mon, 18 Nov 2024 17:07:11 GMT
< Pragma: no-cache
< Content-Type: text/html
< Strict_Transport_Security: max-age=0; includeSubDomains
< X-Content-Type-Options: nosniff
< X-FRAME-OPTIONS: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< Expires: Mon, 18 Nov 2024 17:07:11 GMT
< Date: Mon, 18 Nov 2024 17:07:11 GMT
< Pragma: no-cache
< Strict_Transport_Security: max-age=0; includeSubDomains
< X-Content-Type-Options: nosniff
< X-FRAME-OPTIONS: SAMEORIGIN
< X-XSS-Protection: 1; mode=block
< WWW-Authenticate: Negotiate xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx=
< Set-Cookie: hadoop.auth="u=HTTP&p=HTTP/host.com.COM&t=kerberos&e=17xxxxxx7&s=CaYM+xxxxxxxxxxfBXleJ0K/ObFbrjALqy/R//g="; Path=/; HttpOnly
< Last-Modified: Fri, 30 Aug 2024 16:14:30 GMT
< Accept-Ranges: bytes
< Content-Length: 1085
<
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta http-equiv="REFRESH" content="0;url=datanode.html" />
  <title>Hadoop Administration</title>
</head>
* Closing connection 1
</html>[root@host 1546506889-hdfs-DATANODE]#
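
The curl output above shows the usual two-step exchange: an unauthenticated request gets a 401 with a `WWW-Authenticate: Negotiate` challenge, and the retry with a Kerberos token gets a 200. To compare which scheme each role's web server asks for (the Impala traceback earlier in this thread goes through urllib's digest handler, while the DataNode here asks for Negotiate), a small probe like the one below may help. It is a sketch using only the standard library; `auth_schemes` and `probe` are hypothetical helper names, and unlike `curl -k` the probe does not skip certificate verification.

```python
import urllib.error
import urllib.request


def auth_schemes(www_authenticate_values):
    """Return the lowercase scheme names from WWW-Authenticate header values."""
    schemes = []
    for value in www_authenticate_values:
        parts = value.strip().split(None, 1)
        if not parts:
            continue
        token = parts[0].lower()
        if token not in schemes:
            schemes.append(token)
    return schemes


def probe(url, timeout=10):
    """Send one unauthenticated GET and report (status, challenge schemes).

    TLS or connection problems raise urllib.error.URLError and are not handled here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, []
    except urllib.error.HTTPError as err:
        values = err.headers.get_all("WWW-Authenticate") or []
        return err.code, auth_schemes(values)


# Header values as they might appear on a Negotiate-protected and a digest-protected endpoint.
print(auth_schemes(["Negotiate"]))
print(auth_schemes(['Digest realm="example", nonce="abc123"']))
```

Calling `probe()` with a role's metrics URL from the agent log, run on the affected host, shows which challenge the agent is actually being asked to answer.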

 

 



Contributor

We contacted Cloudera Support, and they recommended installing the standard Python 3.8 package for OL7. So, we followed this documentation: 

The Web Server issue for HDFS, YARN, and HBase disappeared. However, the Web Server issue and HTTP error for Impala persist.

[08/Dec/2024 07:29:15 +0000] 28735 ImpalaDaemonQueryMonitoring throttling_logger ERROR    Error fetching metrics at 'https://host-exle.com:25000/jsonmetrics?json'
Traceback (most recent call last):
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 224, in _collect_and_parse_and_return
    opened_url = urlopen_with_retry_on_authentication_errors(
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 339, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/monitor/generic/metric_collectors.py", line 244, in _open_url
    return self._urlopen_callout(
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/util/url.py", line 129, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 563, in error
    result = self._call_chain(*args)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 1244, in http_error_401
    retry = self.http_error_auth_reqed('www-authenticate',
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 1124, in http_error_auth_reqed
    return self.retry_http_digest_auth(req, authreq)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 1138, in retry_http_digest_auth
    resp = self.parent.open(req, timeout=req.timeout)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 531, in open
    response = meth(req, response)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 640, in http_response
    response = self.parent.error(
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 569, in error
    return self._call_chain(*args)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 388, in http_error_default
    raise e
  File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 382, in http_error_default
    return old(self, req, fp, code, msg, hdrs)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 500: Internal Server Error

 

Master Collaborator

Hi @sayebogbon 

Could you restart the CM agent on the hosts where the Impala daemon is in bad health, restart the Service Monitor from CM, and then check again?

Regards,

Chethan YM

 

Contributor

Hi @ChethanYM ,

Thanks for your input. We have managed to resolve the web server issue by disabling hadoop_secure_web_ui.

The only problem now is that when we check the agent status by running systemctl status cloudera-scm-agent, it reports urllib.error.HTTPError: HTTP Error 401: Unauthorized, as you can see below. Cloudera Support recommended that I remove /opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py from rh-python38 so that the agent would be forced to use the request.py from its own Python package. However, after I removed it, I was unable to start the agent again. I reported this to them, and they held a session with me in which I uninstalled and reinstalled the agent, but nothing has worked so far.

I had installed rh-python38 on our RHEL/OL7 system by following this documentation: Installing Python 3.8 standard package on RHEL 7 | CDP Private Cloud. This is the Python version that the agent is running on.

Note: the HTTP error is not reported in /var/log/cloudera-scm-agent/cloudera-scm-agent.log. It is only reported when I check the status of the agent. Also, only a few hosts (datanode, yarn, hdfs, and some other hosts) have the issue.

 

 

[root@host-exle ~]# systemctl status cloudera-scm-agent
● cloudera-scm-agent.service - Cloudera Manager Agent Service
Loaded: loaded (/usr/lib/systemd/system/cloudera-scm-agent.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2024-12-09 19:30:21 GMT; 13h ago
Main PID: 18725 (cmagent)
CGroup: /system.slice/cloudera-scm-agent.service
└─18725 /usr/bin/python3.8 /opt/cloudera/cm-agent/bin/cm agent

Dec 09 19:30:33 host-exle cm[18725]: return self._call_chain(*args)
Dec 09 19:30:33 host-exle cm[18725]: File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 502, in _call_chain
Dec 09 19:30:33 host-exle cm[18725]: result = func(*args)
Dec 09 19:30:33 host-exle cm[18725]: File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 388, in http_error_default
Dec 09 19:30:33 host-exle cm[18725]: raise e
Dec 09 19:30:33 host-exle cm[18725]: File "/opt/cloudera/cm-agent/lib/python3.8/site-packages/cmf/https.py", line 382, in http_error_default
Dec 09 19:30:33 host-exle cm[18725]: return old(self, req, fp, code, msg, hdrs)
Dec 09 19:30:33 host-exle cm[18725]: File "/opt/rh/rh-python38/root/usr/lib64/python3.8/urllib/request.py", line 649, in http_error_default
Dec 09 19:30:33 host-exle cm[18725]: raise HTTPError(req.full_url, code, msg, hdrs, fp)
Dec 09 19:30:33 host-exle cm[18725]: urllib.error.HTTPError: HTTP Error 401: Unauthorized
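
The traceback above shows `urllib/request.py` being loaded from the rh-python38 tree, while the service line shows the agent started with /usr/bin/python3.8. Since `urllib` is part of the standard library, it is loaded from whichever interpreter runs the agent. A quick way to confirm which interpreter and which copy of `request.py` are in play is the sketch below; `interpreter_report` is a hypothetical helper name, and it should be run with the same interpreter path that the agent uses.

```python
import sys
import urllib.request


def interpreter_report():
    """Collect the interpreter path, its version, and where urllib.request was loaded from."""
    return {
        "executable": sys.executable,
        "version": "%d.%d.%d" % sys.version_info[:3],
        "urllib_request": urllib.request.__file__,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

If the reported `urllib_request` path is inside the interpreter's own standard library, deleting that file removes a module the interpreter itself needs.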

 

 

 
