Member since
07-30-2020
219
Posts
46
Kudos Received
60
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 4887 | 11-20-2024 11:11 PM |
| | 2926 | 09-26-2024 05:30 AM |
| | 2459 | 10-26-2023 08:08 AM |
| | 4193 | 09-13-2023 06:56 AM |
| | 4503 | 08-25-2023 06:04 AM |
10-28-2022
01:33 AM
2 Kudos
Hi @mike_bronson7 , The rule of thumb is to allocate about 1 GB of NameNode heap per 1 million blocks. Since you have already doubled the heap, the next step is to look at tuning the garbage collector: https://community.cloudera.com/t5/Community-Articles/NameNode-Garbage-Collection-Configuration-Best-Practices-and/ta-p/245276 You can try tuning the number of GC threads to see if that helps. That being said, these pauses are not large enough to cause any real trouble, they are not repeating very frequently (within minutes), and occasional pauses can be expected on a busy cluster. You can also consider raising the threshold for the JVM pause alert to, say, 5-10 seconds, so that only pauses that need intervention are reported.
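As a sketch, the GC thread count and collector flags are set via the NameNode's JVM options in the hadoop-env template (Ambari: HDFS > Configs > Advanced hadoop-env). The heap size and flag values below are illustrative examples only, not recommendations for any particular cluster:

```shell
# Illustrative NameNode GC settings for a Java 8 / CMS setup.
# -Xms/-Xmx and all thread counts are example values -- size them for your block count.
export HADOOP_NAMENODE_OPTS="-Xms16g -Xmx16g \
  -XX:+UseConcMarkSweepGC \
  -XX:ParallelGCThreads=8 \
  -XX:+CMSParallelRemarkEnabled \
  -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+UseCMSInitiatingOccupancyOnly \
  ${HADOOP_NAMENODE_OPTS}"
```

After a restart, the effective flags can be confirmed with `jps -v` on the NameNode host.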
10-28-2022
01:23 AM
Hi @TheFixer , During reads, if region locality is poor, HBase has to fetch blocks from other slave nodes. Because the blocks are not local to the RegionServer, they must be read over the network, which adds latency even when the disks themselves are fast enough. You may want to run a major compaction on that table to see if read performance improves, since compaction rewrites the HFiles locally and restores locality. Further, when reads are performed for the first time on a recently written table, the blocks are not yet cached in memory; subsequent reads should perform comparatively better because the blocks will be cached in the BlockCache.
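As a sketch of the compaction step above (the table name `my_table` is a placeholder, and these commands assume an HBase client/gateway node):

```shell
# Trigger a major compaction to rewrite HFiles locally and restore region locality.
echo "major_compact 'my_table'" | hbase shell -n

# Compaction runs asynchronously; watch its progress and the per-RegionServer
# data-locality metric in the HBase Master web UI before re-testing reads.
```

Major compaction is I/O-heavy, so it is usually best scheduled during a low-traffic window.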
10-21-2022
04:14 AM
Hi @hanumanth , Network bandwidth needs to be limited on the OS side using tools such as traffic control (tc), which can be applied to the NIC that carries the IP address used by the CDH roles. More information can be found in the Red Hat docs: https://access.redhat.com/solutions/69133 https://access.redhat.com/solutions/1324033
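A minimal sketch using tc's token bucket filter, assuming the CDH roles' NIC is `eth0` and an illustrative 500 Mbit/s cap (both are placeholders; this requires root):

```shell
# Cap egress bandwidth on the interface used by the CDH roles.
tc qdisc add dev eth0 root tbf rate 500mbit burst 256kbit latency 400ms

# Verify the queueing discipline that is now in effect:
tc qdisc show dev eth0

# Remove the limit again when done testing:
tc qdisc del dev eth0 root
```

Note that tc shapes outgoing traffic only; limiting ingress needs policing or an ifb device, as described in the Red Hat articles above.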
10-18-2022
09:41 AM
Hi @fengsh Yes, the remove script indeed has a few bugs in its curl usage, which was picked up from AMBARI-18435, but this curl call is not officially supported and as such we don't have an official doc for it. The other option is to remove the older stack with the "yum remove <version>" command, specifying the exact version you want to remove. Before doing so, you can verify the list of packages that would be removed (please be thorough!): # yum list installed | grep -P '<version>' Additionally, please verify that the symlinks in the "/usr/hdp/current/" folder point to the correct locations. -- Was your question answered? Please take some time to click on “Accept as Solution” below this post. If you find a reply useful, say thanks by clicking on the thumbs up button.
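The check-then-remove flow might look like the following sketch, where `2.6.5.0-292` stands in for whatever old HDP version you are retiring (a hypothetical example; substitute your own version string and review every transaction yum proposes):

```shell
# 1. Review exactly which installed packages match the old stack version:
yum list installed | grep -P '2\.6\.5\.0-292'

# 2. Remove them (HDP package names encode the version with underscores);
#    yum shows the full removal list and asks for confirmation first:
yum remove '*2_6_5_0_292*'

# 3. Confirm the active-stack symlinks still point at the version you kept:
ls -l /usr/hdp/current/
```

Running the grep and the symlink check both before and after the removal makes it easy to spot anything that was removed unexpectedly.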
10-11-2022
02:12 AM
1 Kudo
Hi @fengsh , You can check the previously solved posts below to see if they help. https://community.cloudera.com/t5/Support-Questions/How-to-remove-an-old-HDP-version/m-p/116161 https://community.cloudera.com/t5/Support-Questions/Is-there-any-risk-to-delete-old-HDP-directories/m-p/96183 https://community.cloudera.com/t5/Community-Articles/Remove-Old-Stack-Versions-script-doesnt-work-in-ambari-2-7/ta-p/249303
10-05-2022
05:04 AM
1 Kudo
Hi, Those parameters are not exposed by Ambari and are false by default. If set, they would go into Custom spark-defaults. As they are disabled by default, I would suggest not enabling them.
09-28-2022
01:57 AM
Hi, Inside Spark, you can check spark.history.ui.acls.enable and spark.acls.enable. Both should be false by default. https://spark.apache.org/docs/2.4.3/security.html#authentication-and-authorization
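If you want these defaults recorded explicitly (purely for auditing; the values shown are already Spark's defaults), a spark-defaults.conf fragment would look like:

```properties
# spark-defaults.conf -- ACL toggles, shown at their default values
spark.acls.enable              false
spark.history.ui.acls.enable   false
```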
09-19-2022
02:10 AM
Hi @Anlarin , It is always recommended to have homogeneous disk storage across DataNodes. Within a DataNode, if the volumes are heterogeneous and block replicas are written to the disks in a round-robin fashion, the smaller disks fill up faster than the larger ones. Also, if the client is local to Node 2, the first replica of each block is placed on that node, so it is expected to fill faster. With the "Available Space" policy, the DataNode takes into account how much free space each volume has when deciding where to place a new replica. To distribute writes evenly as a percentage of capacity across the drives, change the volume choosing policy (dfs.datanode.fsdataset.volume.choosing.policy) to Available Space. If using Cloudera Manager:
1. Navigate to HDFS > Configuration > DataNode
2. Change "DataNode Volume Choosing Policy" from Round Robin to Available Space
3. Click Save Changes
4. Restart the DataNodes
Note that this property only balances the volumes within a single DataNode; it does not balance data across DataNodes. https://docs.cloudera.com/documentation/enterprise/latest/topics/admin_dn_storage_balancing.html - Was your question answered? Please take some time to click on “Accept as Solution” below this post. If you find a reply useful, say thanks by clicking on the thumbs up button.
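Outside Cloudera Manager, the same setting is an hdfs-site.xml property on the DataNodes (a configuration sketch; restart the DataNodes after applying):

```xml
<!-- hdfs-site.xml on each DataNode (or the DataNode safety valve in Cloudera Manager) -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
```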
09-15-2022
12:22 AM
Hi @isoardi , Seeing sockets in the TIME_WAIT state is normal and by design while a socket is being closed. Unless we see tens of thousands of sockets in TIME_WAIT, which would exhaust the ephemeral ports on the host, these are fine. It is the CLOSE_WAIT sockets we need to check, as they indicate the application has not called close() on the socket. You can refer to the Red Hat documentation below for more info on this and for ways to close TIME_WAIT sockets by reusing them. https://access.redhat.com/solutions/24154
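A quick way to see how many sockets are in each of these two states is to count them straight from /proc/net/tcp, where the state field is hex (06 = TIME_WAIT, 08 = CLOSE_WAIT); this sketch covers IPv4 only:

```shell
# Count TIME_WAIT and CLOSE_WAIT IPv4 sockets from the kernel's TCP table.
# Field 4 of /proc/net/tcp is the connection state in hex.
awk 'NR > 1 { count[$4]++ }
     END   { printf "TIME_WAIT=%d CLOSE_WAIT=%d\n", count["06"], count["08"] }' /proc/net/tcp
```

A persistently growing CLOSE_WAIT count points at the application leaking sockets, whereas a TIME_WAIT count in the low thousands on a busy host is usually harmless.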
09-15-2022
12:11 AM
Hi @abdebja , You can refer to the instructions in the Cloudera article below to mitigate this issue. https://my.cloudera.com/knowledge/tmp-folder-filling-up-frequently-with-hprof-dump-files?id=340673 - Was your question answered? Please take some time to click on “Accept as Solution” below this post. If you find a reply useful, say thanks by clicking on the thumbs up button.