Member since
01-18-2021
6
Posts
0
Kudos Received
0
Solutions
10-14-2022
03:26 AM
Hello @SDL,

This is an old thread and I assume your team has moved on, but I wish to update this post for future reference.

It was observed that such overnight restarts were resetting the timer of the default cleanup period (24 hours) set via [1] in the solrconfig.xml of the respective Solr collection (sample from the Ranger_Audits collection). As a result, the cleanup was postponed every day and documents piled up beyond their expiration. If a customer restarts the service nightly, it's advisable to lower the cleanup period from 24 hours to a smaller value (for example, 20 or 22 hours).

Regards, Smarak

[1]
<processor class="solr.processor.DocExpirationUpdateProcessorFactory">
  <int name="autoDeletePeriodSeconds">86400</int>
  <str name="ttlFieldName">_ttl_</str>
  <str name="expirationFieldName">_expire_at_</str>
</processor>
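To illustrate the reasoning in the post above, here is a minimal Python sketch. It is a simplified model, not Solr's actual scheduler: it assumes the cleanup timer restarts from zero at every service restart (as observed in the post) and that a restart falling on the same hour as a due cleanup wins. The function names `hours_to_seconds` and `count_cleanups` are hypothetical helpers for this example only.

```python
def hours_to_seconds(hours: int) -> int:
    """Convert hours to the unit used by autoDeletePeriodSeconds."""
    return hours * 3600


def count_cleanups(period_hours: int, restart_interval_hours: int, total_hours: int) -> int:
    """Count cleanups over total_hours, stepping one hour at a time.

    Assumption: every restart resets the cleanup timer to zero.
    """
    cleanups = 0
    timer = 0  # hours elapsed since the timer last (re)started
    for hour in range(1, total_hours + 1):
        timer += 1
        if hour % restart_interval_hours == 0:
            timer = 0  # nightly restart: any pending cleanup is lost
            continue
        if timer >= period_hours:
            cleanups += 1
            timer = 0
    return cleanups


if __name__ == "__main__":
    week = 7 * 24
    for period in (24, 22, 20):
        runs = count_cleanups(period, 24, week)
        print(f"period={period}h ({hours_to_seconds(period)}s): {runs} cleanups in a week")
```

Under this model, a 24-hour period combined with a restart every 24 hours never fires, while a 20- or 22-hour period fires once per day. A 20-hour period corresponds to an autoDeletePeriodSeconds value of 72000.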
03-30-2021
12:22 PM
1 Kudo
Hi @Naush007,

When you say "remote server", do you mean a different host in the same network but in a different cluster? -copyToLocal only copies data to the local filesystem of the host where the command is run. It will not copy to a different host. To do this, you can use the DistCp (Distributed Copy) tool to copy the data you want. Refer to the links below for more information and usage:

https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html
https://docs.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_admin_distcp_data_cluster_migrate.html

If this is not what you are looking for, please let us know exactly what needs to be achieved here. Thank you. If this helps, don't forget to mark it as the accepted solution.
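As a rough illustration of the DistCp suggestion above, the following Python sketch only assembles the command line for a cluster-to-cluster copy. The NameNode host names, port, and paths are placeholders invented for this example, and `build_distcp_command` is a hypothetical helper; check the DistCp documentation linked above for the options your version supports.

```python
import shlex
from typing import List


def build_distcp_command(src_uri: str, dst_uri: str, update: bool = False) -> List[str]:
    """Build the argument list for a `hadoop distcp` invocation."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the target
        cmd.append("-update")
    cmd += [src_uri, dst_uri]
    return cmd


# Placeholder URIs: replace with the real source and target clusters.
example_cmd = build_distcp_command(
    "hdfs://nn1.example.com:8020/data/src",
    "hdfs://nn2.example.com:8020/data/dst",
    update=True,
)

if __name__ == "__main__":
    # Print the command for review; it could be run with subprocess.run(example_cmd).
    print(shlex.join(example_cmd))
```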
03-30-2021
02:13 AM
1 Kudo
Hello,

This is related to HDFS cache management, as described in the documentation: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html

"In this architecture, the NameNode is responsible for coordinating all the DataNode off-heap caches in the cluster. The NameNode periodically receives a cache report from each DataNode which describes all the blocks cached on a given DN. The NameNode manages DataNode caches by piggybacking cache and uncache commands on the DataNode heartbeat."

If the metric is going up, one possibility is that your NameNode is too busy to handle the requests.
03-29-2021
07:47 AM
You can update the configuration (remove the single disk that you don't need) and restart one DataNode at a time. Ensure that there are no under-replicated blocks (the NameNode UI will show that) between DataNode restarts. I found this to be the safest approach. SDL
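As a scripted alternative to checking the NameNode UI between restarts, the sketch below parses the text of `hdfs dfsadmin -report`. It assumes the report contains a line of the form "Under replicated blocks: N"; verify that wording against the output of your Hadoop version. The helper names `under_replicated_blocks` and `fetch_report` are hypothetical.

```python
import re
import subprocess
from typing import Optional


def under_replicated_blocks(report_text: str) -> Optional[int]:
    """Return the under-replicated block count, or None if the line is absent."""
    match = re.search(r"Under replicated blocks:\s*(\d+)", report_text)
    return int(match.group(1)) if match else None


def fetch_report() -> str:
    """Run `hdfs dfsadmin -report` and return its output (needs a configured HDFS client)."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def safe_to_restart_next(report_text: str) -> bool:
    """Proceed to the next DataNode only when the count is known and zero."""
    return under_replicated_blocks(report_text) == 0
```

A rolling procedure would call `safe_to_restart_next(fetch_report())` after each DataNode restart and wait until it returns True before moving on.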