Member since
07-27-2021
3
Posts
1
Kudos Received
0
Solutions
06-19-2024
01:37 AM
1 Kudo
We are experiencing the same issue on CDP 7.1.7 when calling a Spark job from Oozie.
07-30-2021
05:49 AM
I was also thinking that the network might be the problem rather than disk space, since the host's health is unknown all the time. The server logs don't show anything unusual. The agent logs show that it is heartbeating on the host:

[root@cloudera ~]# netstat -an | grep -e 9000 -e 9001
tcp 0 0 10.0.2.15:9000 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:9001 0.0.0.0:* LISTEN

Could it be a problem that port 9001 is open on localhost rather than on the cloudera host (10.0.2.15)? I don't know where the setting for port 9001 is in the agent's config file. When I try to install the host manually, this is the health inspector log:

Inspect Hosts for Correctness
Validations
Inspector ran on all 1 hosts.
Individual hosts resolved their own hostnames correctly.
No errors were found while looking for conflicting init scripts.
The following errors were found while checking /etc/hosts...
In /etc/hosts on cloudera, the hostname cloudera is mapped to cloudera, whereas it should be mapped to 10.0.2.15.
All hosts resolved localhost to 127.0.0.1.
All hosts checked resolved each other's hostnames correctly and in a timely manner.
Host clocks are approximately in sync (within ten minutes).
Host time zones are consistent across the cluster.
The user hdfs is missing on the following hosts:
cloudera
The user mapred is missing on the following hosts:
The user zookeeper is missing on the following hosts:
The user oozie is missing on the following hosts:
The user hbase is missing on the following hosts:
The user hue is missing on the following hosts:
The user sqoop is missing on the following hosts:
The user impala is missing on the following hosts:
The user sentry is missing on the following hosts:
The group hdfs is missing on the following hosts:
The group mapred is missing on the following hosts:
The group zookeeper is missing on the following hosts:
The group oozie is missing on the following hosts:
The group hbase is missing on the following hosts:
The group hue is missing on the following hosts:
The group hadoop is missing on the following hosts:
The group hive is missing on the following hosts:
The group sqoop is missing on the following hosts:
The group impala is missing on the following hosts:
The group sentry is missing on the following hosts:
No conflicts detected between packages and parcels.
No kernel versions that are known to be bad are running.
No problems were found with /proc/sys/vm/swappiness on any of the hosts.
Transparent Huge Page Compaction is enabled and can cause significant performance problems. Run "echo never > /sys/kernel/mm/transparent_hugepage/defrag" and "echo never > /sys/kernel/mm/transparent_hugepage/enabled" to disable this, and then add the same command to an init script such as /etc/rc.local so it will be set on system reboot. The following hosts are affected:
cloudera
Hue Python version dependency is satisfied.
Hue Psycopg2 version for PostgreSQL is satisfied for both CDH 5 and CDH 6.
1 hosts are reporting with NONE version
All checked hosts in each cluster are running the same version of components.
All managed hosts have consistent versions of Java.
All checked Cloudera Management Daemons versions are consistent with the server.
All checked Cloudera Management Agents versions are consistent with the server.

Version Summary
Hosts that do not belong to any cluster
All Hosts
cloudera
Component Version Hosts Release Version
Supervisord 3.4.0 cloudera Unavailable Not applicable
Cloudera Manager Agent 7.1.4 cloudera 6363010.el7 Not applicable
Cloudera Manager Management Daemons 7.1.4 cloudera 6363010.el7 Not applicable
Crunch (CDH 5 only) Unavailable cloudera Unavailable Not installed or path is incorrect
flume Unavailable cloudera Unavailable Not installed or path is incorrect
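Regarding the question above about where the agent's ports are configured: the agent reads /etc/cloudera-scm-agent/config.ini. The following is only a sketch of the keys that control the agent's listener and reported address, with values assumed for this single-node VM (not verified against it):

```ini
# /etc/cloudera-scm-agent/config.ini (sketch; values are assumptions
# for this single-node VM)
[General]
# Cloudera Manager server the agent heartbeats to
server_host=cloudera
server_port=7182
# Port the agent's status server listens on (the 9000 seen in netstat)
listening_port=9000
# Hostname/IP the agent reports to the server; when left unset the agent
# autodetects them, which can fail if /etc/hosts resolution is broken
listening_hostname=cloudera
listening_ip=10.0.2.15
```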
07-28-2021
03:05 AM
Hi @carrossoni, I have been having the same errors as @AkhilTech, but even before that, during the script execution the local parcels are not on the VM:

default: -- Install CSDs
default: mv: cannot stat ‘/root/*.jar’: No such file or directory
default: mv: cannot stat ‘/home/centos/*.jar’: No such file or directory
default: chown: cannot access ‘/opt/cloudera/csd/*’: No such file or directory
default: chmod: cannot access ‘/opt/cloudera/csd/*’: No such file or directory
default: -- Install local parcels
default: mv: cannot stat ‘/root/*.parcel’: No such file or directory
default: mv: cannot stat ‘/root/*.parcel.sha’: No such file or directory
default: mv: cannot stat ‘/home/centos/*.parcel’: No such file or directory
default: mv: cannot stat ‘/home/centos/*.parcel.sha’: No such file or directory
default: chown: cannot access ‘/opt/cloudera/parcel-repo/*’: No such file or directory

After that, it just waits forever for CM to boot. After I restarted the Cloudera agent over SSH, this is the output I get:

default: -- Now CM is started and the next step is to automate using the CM API
default: ./centosvmCDP.sh: line 178: [: v42: unary operator expected
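The "unary operator expected" message on line 178 of the script is the classic symptom of an unquoted, empty variable inside a `[ ... ]` test: the test collapses to `[ = v42 ]`, which has no left operand. A minimal sketch of the failure mode and the fix (CM_VERSION is a hypothetical variable name, not necessarily what the script uses):

```shell
#!/bin/sh
# Reproduces the "[: v42: unary operator expected" class of error.
CM_VERSION=""
# Broken form:  [ $CM_VERSION = v42 ]
#   With CM_VERSION empty this expands to `[ = v42 ]` and fails.
# Fixed form: quote the variable so the test always sees two operands.
if [ "$CM_VERSION" = "v42" ]; then
  echo "version matches"
else
  echo "version does not match"
fi
```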
default: DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
default: Requirement already up-to-date: pip in /usr/lib/python2.7/site-packages (20.3.4)
default: Requirement already up-to-date: cm_client in /usr/lib/python2.7/site-packages (41.0.1)
default: Requirement already satisfied, skipping upgrade: urllib3>=1.15 in /usr/lib/python2.7/site-packages (from cm_client) (1.26.6)
default: Requirement already satisfied, skipping upgrade: certifi in /usr/lib/python2.7/site-packages (from cm_client) (2021.5.30)
default: Requirement already satisfied, skipping upgrade: six>=1.10 in /usr/lib/python2.7/site-packages (from cm_client) (1.16.0)
default: Requirement already satisfied, skipping upgrade: python-dateutil in /usr/lib/python2.7/site-packages (from cm_client) (2.8.2)
default: {'active': True,
default: 'can_retry': False,
default: 'children': {'items': []},
default: 'cluster_ref': None,
default: 'end_time': None,
default: 'host_ref': None,
default: 'id': 13.0,
default: 'name': 'GlobalHostInstall',
default: 'parent': None,
default: 'result_data_url': None,
default: 'result_message': None,
default: 'role_ref': None,
default: 'service_ref': None,
default: 'start_time': '2021-07-28T09:50:10.176Z',
default: 'success': None}
default: {'active': False,
default: 'can_retry': True,
default: 'children': {'items': []},
default: 'cluster_ref': None,
default: 'end_time': '2021-07-28T09:50:16.181Z',
default: 'host_ref': None,
default: 'id': 13.0,
default: 'name': 'GlobalHostInstall',
default: 'parent': None,
default: 'result_data_url': 'http://cloudera:7180/cmf/command/13/download',
default: 'result_message': 'Failed to complete installation.',
default: 'role_ref': None,
default: 'service_ref': None,
default: 'start_time': '2021-07-28T09:50:10.176Z',
default: 'success': False}
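The two snapshots above show the same GlobalHostInstall command, first while running (active=True) and then after failing with can_retry=True. A small sketch of how the three boolean fields combine into an outcome (the helper name is hypothetical, not part of cm_client):

```shell
#!/bin/sh
# Hypothetical helper: classify an ApiCommand result from the
# active / success / can_retry fields shown in the output above.
command_outcome() {
  if [ "$1" = "true" ]; then
    echo "running"            # first snapshot: active=True
  elif [ "$2" = "true" ]; then
    echo "succeeded"
  elif [ "$3" = "true" ]; then
    echo "retryable failure"  # second snapshot: can_retry=True
  else
    echo "failed"
  fi
}

command_outcome true  ""    ""    # first snapshot
command_outcome false false true  # second snapshot
```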
default: Traceback (most recent call last):
default: File "/root/CDPDCTrial/scripts/create_cluster.py", line 76, in <module>
default: mgmt_api.auto_assign_roles() # needed?
default: File "/usr/lib/python2.7/site-packages/cm_client/apis/mgmt_service_resource_api.py", line 65, in auto_assign_roles
default: (data) = self.auto_assign_roles_with_http_info(**kwargs)
default: File "/usr/lib/python2.7/site-packages/cm_client/apis/mgmt_service_resource_api.py", line 131, in auto_assign_roles_with_http_info
default: collection_formats=collection_formats)
default: File "/usr/lib/python2.7/site-packages/cm_client/api_client.py", line 326, in call_api
default: _return_http_data_only, collection_formats, _preload_content, _request_timeout)
default: File "/usr/lib/python2.7/site-packages/cm_client/api_client.py", line 153, in __call_api
default: _request_timeout=_request_timeout)
default: File "/usr/lib/python2.7/site-packages/cm_client/api_client.py", line 379, in request
default: body=body)
default: File "/usr/lib/python2.7/site-packages/cm_client/rest.py", line 273, in PUT
default: body=body)
default: File "/usr/lib/python2.7/site-packages/cm_client/rest.py", line 219, in request
default: raise ApiException(http_resp=r)
default: cm_client.rest.ApiException: (400)
default: Reason: Bad Request
default: HTTP response headers: HTTPHeaderDict({'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff', 'Transfer-Encoding': 'chunked', 'Set-Cookie': 'SESSION=ceeaa4a0-3cc5-4c44-b792-73cda7fbe71b;Path=/;HttpOnly', 'Expires': 'Thu, 01 Jan 1970 00:00:00 GMT', 'Pragma': 'no-cache', 'Cache-Control': 'no-cache, no-store, max-age=0, must-revalidate', 'Date': 'Wed, 28 Jul 2021 09:50:17 GMT', 'X-Frame-Options': 'DENY', 'Content-Type': 'application/json;charset=utf-8'})
default: HTTP response body: {
default: "message" : "Deployment should contain hosts."
default: }
default:
default: usermod: group 'hadoop' does not exist
default: sudo: unknown user: hdfs
default: sudo: unable to initialize policy plugin
default: sudo: unknown user: hdfs
default: sudo: unable to initialize policy plugin
default: sudo: unknown user: hdfs
default: sudo: unable to initialize policy plugin
default: sudo: unknown user: hdfs
default: sudo: unable to initialize policy plugin
default: sudo: unknown user: hdfs
default: sudo: unable to initialize policy plugin

This is the latest agent log output:

[28/Jul/2021 09:52:03 +0000] 1625 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[28/Jul/2021 09:52:05 +0000] 1625 MainThread supervisor INFO Trying to connect to supervisor (Attempt 1)
[28/Jul/2021 09:52:05 +0000] 1625 MainThread supervisor INFO Supervisor version: 3.4.0, pid: 795
[28/Jul/2021 09:52:05 +0000] 1625 MainThread supervisor INFO Successfully connected to supervisor
[28/Jul/2021 09:52:05 +0000] 1625 MainThread agent INFO Supervisor version: 3.4.0, pid: 795
[28/Jul/2021 09:52:05 +0000] 1625 MainThread agent INFO Connecting to previous supervisor: agent-795-1627465844.
[28/Jul/2021 09:52:05 +0000] 1625 MainThread supervisor INFO Triggering supervisord update.
[28/Jul/2021 09:52:05 +0000] 1625 MainThread _cplogging INFO [28/Jul/2021:09:52:05] ENGINE Bus STARTING
[28/Jul/2021 09:52:05 +0000] 1625 MainThread _cplogging INFO [28/Jul/2021:09:52:05] ENGINE Started monitor thread '_TimeoutMonitor'.
[28/Jul/2021 09:52:06 +0000] 1625 MainThread _cplogging INFO [28/Jul/2021:09:52:06] ENGINE Serving on http://127.0.0.1:9001
[28/Jul/2021 09:52:06 +0000] 1625 MainThread _cplogging INFO [28/Jul/2021:09:52:06] ENGINE Bus STARTED
[28/Jul/2021 09:52:06 +0000] 1625 MainThread status_server INFO Status server url is http://cloudera:9000/
[28/Jul/2021 09:52:07 +0000] 1625 MainThread daemon INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x7fdcc96d0f90>,)
[28/Jul/2021 09:52:07 +0000] 1625 MonitorDaemon-Scheduler daemon INFO Monitor ready to report: ('HostMonitor',)
[28/Jul/2021 09:52:07 +0000] 1625 MainThread agent INFO Setting default socket timeout to 45
[28/Jul/2021 09:52:07 +0000] 1625 MainThread agent INFO Failed to read available parcel file: [Errno 2] No such file or directory: '/var/lib/cloudera-scm-agent/active_parcels.json'
[28/Jul/2021 09:52:07 +0000] 1625 MainThread agent INFO Loading last saved hb response to complete initialization: /var/lib/cloudera-scm-agent/response.avro
[28/Jul/2021 09:52:08 +0000] 1625 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.06 min:0.06 mean:0.06 max:0.06 LIFE_MAX:0.06

/etc/hosts:

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
# ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
# 127.0.1.1 cloudera cloudera
# cloudera cloudera
cloudera cloudera

'host cloudera' returns: Host cloudera not found: 3(NXDOMAIN)

I have tried a lot of combinations of different addresses and listening_ips in the config.ini. Nothing results in good health. I can find the host via CM, but the parcels won't download correctly; the download goes over 100%, so I canceled it at 120%.

PS: pip 8.1.2 had issues installing cm_client (or any other package), so I upgraded it using the get-pip.py script for Python 2.7. MariaDB failed for all mirrors, so I downloaded the .rpm package with wget and installed it using yum localinstall.

EDIT: After uncommenting the cloudera address in /etc/hosts, the script finally runs almost to the end, but gets stuck downloading the parcel. The size was 7.2GB at first, but it kept downloading until 7.4GB, and now it is just stuck there.

EDIT2: After the download finished, the parcel is now stuck at 0% while distributing. Over SSH I can confirm that the parcel is at /opt/cloudera/parcel-repo. Meanwhile, the host is of unknown health, and the Cloudera Management Services can't be started (error communicating with server).
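Given the inspector message earlier ("the hostname cloudera is mapped to cloudera, whereas it should be mapped to 10.0.2.15") and the NXDOMAIN result, a minimal /etc/hosts that would satisfy those checks might look like the following. This is a sketch: 10.0.2.15 is taken from the netstat output above and should be adjusted to the VM's actual address.

```
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.2.15   cloudera
```

After editing, `host cloudera` (or `getent hosts cloudera`) should return 10.0.2.15 rather than NXDOMAIN, and the agent would need a restart to pick up the change.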