Created on 03-28-2017 06:46 AM - edited 03-28-2017 07:55 AM
I installed Cloudera using the Path B installation on 4 machines (VMs, CentOS 7): 1 master and 3 slaves. After installation I got a clock synchronization error on every slave, which I resolved by running:
systemctl start ntpd
After a few minutes I get an error on the master node and I can't display the Cloudera page (master:7180), although the cloudera-scm-server status is running.
I noticed afterwards that the hard drive of the master node is full. When I run df -h
I get:
[root@master ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   34G   34G   20K 100% /
devtmpfs                 4.1G     0  4.1G   0% /dev
tmpfs                    4.1G     0  4.1G   0% /dev/shm
tmpfs                    4.1G  8.7M  4.1G   1% /run
tmpfs                    4.1G     0  4.1G   0% /sys/fs/cgroup
/dev/sda1                497M  212M  286M  43% /boot
/dev/mapper/centos-home   17G   36M   17G   1% /home
tmpfs                    833M     0  833M   0% /run/user/0
I thought that maybe the ntpd log is behind all that.
If the / filesystem is full (Use% = 100%), the master can't display anything.
Any help to resolve this, and to avoid filling up the master node's hard disk, would be appreciated.
This is the third time I'm trying to install Cloudera and every time I have the same problem.
Created 03-28-2017 07:10 AM
Can you run the following commands as root and identify which folder is consuming the most space? Once you have a result, run the same command again on that folder and keep digging until you reach the right subfolder.
$ du -sh /
$ du -sh /*
Note: This is a disk space issue; I don't find anything related to memory in your description, so the topic and description are confusing.
Created 03-28-2017 07:46 AM
When I run du -sh /
I get:
du: cannot access ‘/proc/4982/task/4982/fd/4’: No such file or directory
du: cannot access ‘/proc/4982/task/4982/fdinfo/4’: No such file or directory
du: cannot access ‘/proc/4982/fd/4’: No such file or directory
du: cannot access ‘/proc/4982/fdinfo/4’: No such file or directory
34G /
Created 03-28-2017 02:36 PM
Try running 'du -h / --max-depth=3|grep G' to figure out which path is using that space. Then drill down from there.
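The drill-down suggested in this thread can be sketched as a small helper. This is only a sketch: the function name top_dirs and its defaults are made up for illustration. It uses -x so that du stays on one filesystem, which keeps /proc, /home and /boot out of the numbers (and avoids the harmless "cannot access /proc/..." messages), and it sorts numerically instead of grepping for "G".

```shell
#!/bin/sh
# top_dirs DIR [COUNT]  (hypothetical helper, not part of any Cloudera tool)
# Print the COUNT largest entries at most one level below DIR, sizes in KiB.
#   -x    stay on the filesystem DIR lives on (skips /proc, /home, /boot, ...)
#   -k    report sizes in KiB so they can be sorted numerically
#   -d 1  same as --max-depth=1 in GNU du
top_dirs() {
    dir=${1:-/}
    count=${2:-10}
    du -x -k -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Usage (as root): start at / and repeat on the largest subdirectory reported.
#   top_dirs / 10
#   top_dirs /var 10
```

The first line of the output is always the total for DIR itself; the lines after it are its largest children.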
Created on 03-29-2017 12:57 AM - edited 03-29-2017 02:56 AM
I found the files using that space:
-rw-------. 1 cloudera-scm cloudera-scm 359M Mar 27 14:40 mgmt_mgmt-NAVIGATOR-9a89af62abe8393b48c78926720ffe2c_pid28766.hprof
There are 40 files like this one.
And:
-rw-------. 1 cloudera-scm cloudera-scm 761M Mar 27 15:10 mgmt_mgmt-NAVIGATORMETASERVER-9a89af62abe8393b48c78926720ffe2c_pid11739.hprof
There are 12 files like this one.
How can I resolve this?
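Files ending in .hprof are JVM heap dumps. To see how many there are and how much space they take in total, something like the following sketch could be used. The function name hprof_usage is made up for illustration, and the directory argument is whatever directory the du drill-down pointed to (the thread does not name it).

```shell
#!/bin/sh
# hprof_usage DIR  (hypothetical helper)
# Count the *.hprof files under DIR and add up their size in KiB.
#   -xdev  do not descend into other filesystems
hprof_usage() {
    dir=${1:-.}
    find "$dir" -xdev -type f -name '*.hprof' -exec du -k {} + 2>/dev/null |
        awk '{ n++; kb += $1 } END { printf "%d files, %d KiB\n", n, kb }'
}

# Usage, with DIR being the directory that holds the dumps:
#   hprof_usage /path/to/dump/dir
```

The helper only reports; it deletes nothing.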
Created 03-29-2017 03:06 AM
You can opt for stopping Navigator, as Navigator writes a huge amount of logs and your cluster can run without this service as well.
Created 03-29-2017 04:39 AM
Created 03-29-2017 06:05 AM
Created on 03-29-2017 06:28 AM - edited 03-29-2017 06:47 AM
Hi Jim,
How do I change the Navigator configuration to allocate enough memory to the JVM?
Created 03-29-2017 07:26 AM
Hello @ghandrisaleh,
This is a bit off topic, but you can configure the Navigator Metadata Server heap in Cloudera Manager via "Java Heap Size of Navigator Metadata Server in Bytes".
For the Navigator Audit Server, the setting is:
"Java Heap Size of Auditing Server in Bytes"
-Ben
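Both settings are named "in Bytes", so if you end up entering a raw number rather than picking a unit, a quick conversion helps avoid being off by a factor of 1024. A minimal sketch follows; the helper name gib_to_bytes and the 2 GiB figure are illustrative only, not values recommended in this thread.

```shell
#!/bin/sh
# gib_to_bytes N  (hypothetical helper)
# Print N GiB expressed in bytes (1 GiB = 1024 * 1024 * 1024 bytes).
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Example with an assumed 2 GiB heap:
gib_to_bytes 2   # prints 2147483648
```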
Created on 03-29-2017 07:40 AM - edited 03-29-2017 07:45 AM
After configuring the Navigator Metadata Server heap, I tried to restart the Cloudera Management Service but I can't.
I get:
Cannot restart service when Host Monitor (master) is in STOPPING state
In the Host Monitor log file:
mars 29, 14:05:55.133 ERROR com.cloudera.cmon.firehose.Main Could not fetch descriptor after 5 tries, exiting.
Also, the number of mgmt_mgmt-NAVIGATORMETASERVER* files has increased.
Created 03-29-2017 08:30 AM
Please start a new thread for the problem with restarting Host Monitor, as it is not related to the existing one.
Restarting the Host Monitor is not required for restarting Navigator. In Cloudera Manager, click Clusters --> Cloudera Management Service.
Then click the "Instances" subtab,
check the boxes beside the Navigator server you want to restart, and choose "Restart" from the Actions for Selected drop-down button.
This lets you restart only the roles you want.
Created 03-29-2017 08:45 AM
I do not know why the number of mgmt_mgmt-NAVIGATORMETASERVER* files keeps increasing, for example:
mgmt_mgmt-NAVIGATORMETASERVER-9a89af62abe8393b48c78926720ffe2c_pid19656.hprof
This happens even though I have increased the Java heap size.
Created 03-29-2017 08:47 AM
Created 03-29-2017 09:11 AM
Created 03-29-2017 09:17 AM
Now Cloudera works fine, but in the Host Monitor log file I get an error:
Could not fetch descriptor after 5 tries, exiting.
I can't restart this service, and when I try to restart the Cloudera Management Service I get:
Cannot restart service when Host Monitor (master) is in STOPPING state.
Created on 03-29-2017 05:02 AM - edited 03-29-2017 05:11 AM
Hi Jim,
Thanks for sharing the knowledge, I just missed this point.
However, in my past experience we had lots of logs and ran short of space because of this same issue, and we resolved it by stopping Navigator.
For my own curiosity, I would like to know what the impact on the cluster will be if I stop Navigator.
Created 03-29-2017 05:19 AM
Hi,
Cloudera Navigator provides auditing and data management. Removing it will not stop you from running jobs on your cluster, but you will not have fine-grained auditing, metadata tagging, etc.
Regards,
Jim