About SDL

smdas · ‎10-14-2022

Hello @SDL, this is an old thread and I assume your team has moved on, but I wish to update this post for future reference. It was observed that such overnight restarts were resetting the 24-hour cleanup timer set via [1] in solrconfig.xml of the respective Solr collection (sample from the Ranger_Audits collection). This postponed the cleanup on a daily basis and caused documents to pile up beyond their expiration. If customers are restarting the service nightly, it's advisable to lower the cleanup period from 24 hours to a smaller value (for example, 20 or 22 hours). Regards, Smarak

[1]
<processor class="solr.processor.DocExpirationUpdateProcessorFactory">
  <int name="autoDeletePeriodSeconds">86400</int>
  <str name="ttlFieldName">_ttl_</str>
  <str name="expirationFieldName">_expire_at_</str>
</processor>
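To illustrate the arithmetic behind the advice above, here is a minimal sketch. It assumes a simplified model in which the cleanup timer starts when the service starts and the first cleanup fires one full period later; the function name and the model are illustrative, not taken from Solr itself.

```python
def cleanups_before_restart(uptime_seconds: int, period_seconds: int) -> int:
    """Count cleanup runs that fire strictly before the next restart.

    Simplified model (an assumption): the timer starts at service start-up
    and the first cleanup fires one full period later.
    """
    if uptime_seconds <= 0 or period_seconds <= 0:
        raise ValueError("uptime and period must be positive")
    # A cleanup scheduled exactly at restart time never gets to run.
    return (uptime_seconds - 1) // period_seconds


NIGHTLY_UPTIME = 24 * 3600  # service restarted once every 24 hours

for hours in (24, 22, 20):
    runs = cleanups_before_restart(NIGHTLY_UPTIME, hours * 3600)
    print(f"autoDeletePeriodSeconds={hours * 3600}: {runs} cleanup(s) per uptime window")
```

Under this model, a period of 86400 seconds never completes between nightly restarts, while 79200 (22 hours) or 72000 (20 hours) completes once per window.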

abagal · ‎03-30-2021

Hi @Naush007, when you say remote server, do you mean a different host in the same network but in a different cluster? -copyToLocal only copies from HDFS to the local filesystem of the host on which you run the command. It will not copy to a different host. To do this you can use the DistCp (distributed copy) tool to copy the data you want. Refer to the links below for more information and how to use it. https://hadoop.apache.org/docs/current/hadoop-distcp/DistCp.html https://docs.cloudera.com/documentation/enterprise/5-5-x/topics/cdh_admin_distcp_data_cluster_migrate.html - If this is not what you are looking for, please let us know what exactly needs to be achieved here. Thank you. If this helps, don't forget to click on accepted solution.
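As a rough sketch of the basic DistCp invocation described in the linked documentation, the snippet below assembles the command line for a cluster-to-cluster copy. The host names, port, and paths are placeholders (assumptions), and the helper name is hypothetical; the snippet only prints the command and does not run it.

```python
import shlex


def build_distcp_command(source_uri: str, target_uri: str) -> list[str]:
    """Return the argument list for a basic `hadoop distcp <source> <target>` run."""
    return ["hadoop", "distcp", source_uri, target_uri]


# Placeholder NameNode addresses and paths; replace with your own clusters.
cmd = build_distcp_command(
    "hdfs://nn1.example.com:8020/data/source",
    "hdfs://nn2.example.com:8020/data/target",
)
print(shlex.join(cmd))
```

The printed line can be run from a host that has the Hadoop client configured and can reach both clusters.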

Daming Xue · ‎03-30-2021

Hello, this is related to HDFS cache management, as described in the documentation: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html In this architecture, the NameNode is responsible for coordinating all the DataNode off-heap caches in the cluster. The NameNode periodically receives a cache report from each DataNode which describes all the blocks cached on a given DN. The NameNode manages DataNode caches by piggybacking cache and uncache commands on the DataNode heartbeat. If the metric is going up, one possibility is that your NameNode is too busy to handle the requests.

SDL · ‎03-29-2021

You can update the configuration (remove the single disk that you don't need) and restart one DataNode at a time. Ensure that there are no under-replicated blocks (the NameNode UI will show that) between DataNode restarts. I found this to be the safest approach. SDL
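The one-node-at-a-time procedure above can be sketched as a small loop. This is only an outline under stated assumptions: `restart` and `under_replicated_blocks` are hypothetical callables that you would supply yourself (for example, wrapping your cluster manager and whatever source reports the under-replicated block count); neither is a real Hadoop API.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    datanodes: Iterable[str],
    restart: Callable[[str], None],              # hypothetical: restarts one DataNode
    under_replicated_blocks: Callable[[], int],  # hypothetical: current cluster-wide count
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Restart DataNodes one at a time, waiting for zero under-replicated blocks in between."""
    restarted: List[str] = []
    for node in datanodes:
        restart(node)
        restarted.append(node)
        # Do not move on to the next node until the cluster has fully re-replicated.
        for _ in range(max_polls):
            if under_replicated_blocks() == 0:
                break
            sleep(poll_seconds)
        else:
            raise TimeoutError(f"under-replicated blocks did not clear after restarting {node}")
    return restarted
```

Passing the two callables in keeps the loop independent of how restarts and block counts are obtained in a given environment, and the timeout stops the procedure rather than continuing while blocks are still under-replicated.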

Last Visited	‎03-30-2021 12:01 PM

Member Since	‎01-18-2021 07:12 AM
Posts	6

Cloudera Community

Re: infra-solr autoDeletePeriodSeconds

Re: How to copy files from HDFS to remote server

Re: what is meaning of datanode numBlocksFailedToU...

Re: How to remove a data disk from all data nodes ...