Member since: 01-19-2017
Posts: 3681
Kudos Received: 633
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1640 | 06-04-2025 11:36 PM |
| | 2089 | 03-23-2025 05:23 AM |
| | 997 | 03-17-2025 10:18 AM |
| | 3776 | 03-05-2025 01:34 PM |
| | 2601 | 03-03-2025 01:09 PM |
08-28-2020
12:06 PM
@mahfooz That property can only be modified in the hive-site.xml cluster configuration file. You will then need to restart the services with stale Hive configuration, and it becomes a cluster-wide change rather than a runtime change. HTH
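For reference, a cluster-wide override in hive-site.xml is a property block like the sketch below. The property name and value here are placeholders, since the original question's property is not shown:

```xml
<!-- hive-site.xml: placeholder property, not the one from the question -->
<property>
  <name>hive.example.property</name>
  <value>new-value</value>
</property>
```

After editing, restart the affected Hive services so the stale configuration is picked up.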
08-14-2020
12:38 AM
1 Kudo
@mike_bronson7 Let me try to answer all three of your questions in one shot.

[snapshot]
ZooKeeper has two types of log files: snapshots and transaction logs. As changes are made to the znodes (i.e. additions or deletions of znodes), they are appended to a transaction log. Occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem; this snapshot supersedes all previous logs. To put you in context, it's like the edit logs and the fsimage in the NameNode architecture: all changes made in HDFS are logged in the edit logs, and when a checkpoint kicks in, the Secondary NameNode merges the edit logs with the old fsimage to incorporate the changes made since the last checkpoint. So the ZooKeeper snapshot is analogous to the fsimage, as it contains the current state of the znode entries and ACLs.

Snapshot policy
In the command shared earlier, the snapshot count parameter is -n <count>. If you really want more headroom you can increase it to 5 or 7, but I think 3 suffices, so I use the autopurge feature and keep only 3 snapshots and 3 transaction logs. When enabled, the ZooKeeper auto-purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in the dataDir and dataLogDir respectively, and deletes the rest. The default is 3, and the minimum value is 3.

Corrupt snapshots
ZooKeeper might be unable to read its database and fail to come up because of file corruption in the transaction logs of the ZooKeeper server; you will see an IOException while loading the ZooKeeper database. In such a case, make sure all the other servers in your ensemble are up and working. Use the four-letter "stat" command on the client port to see if they are in good health. After you have verified that all the other servers of the ensemble are up, you can go ahead and clean the database of the corrupt server.

Solution
Delete all the files in dataDir/version-2 and dataLogDir/version-2/, then restart the server.
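As a sketch, the health check and cleanup described above might look like this. The host names and data directories are assumptions; check your zoo.cfg for the real dataDir and dataLogDir:

```shell
# Four-letter "stat" command against the other ensemble members
# (default client port 2181 assumed) to confirm they are healthy
echo stat | nc zk2.example.com 2181
echo stat | nc zk3.example.com 2181

# Only after confirming the rest of the ensemble is up, wipe the
# corrupt server's database and restart it (paths are examples)
rm -f /hadoop/zookeeper/version-2/*
rm -f /hadoop/zookeeper/datalog/version-2/*
```

The wiped server will re-sync its state from the ensemble leader when it rejoins.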
Hope that helps
08-13-2020
01:36 PM
1 Kudo
@mike_bronson7 A ZooKeeper server will not remove old snapshots and log files when using the default configuration (auto-purge disabled); this is the responsibility of the operator, because every environment is different and the requirements for managing these files may differ from install to install.

The PurgeTxnLog utility implements a simple retention policy that administrators can use. In the example below, the last <count> snapshots and their corresponding logs are retained and the others are deleted. The value of <count> should typically be greater than 3, although that is not required; this provides 3 backups in the unlikely event a recent log has become corrupted. This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.

java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

Automatic purging of the snapshots and corresponding transaction logs was introduced in version 3.4.0 and can be enabled via the configuration parameters autopurge.snapRetainCount and autopurge.purgeInterval.

Hope that helps!
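For reference, enabling the automatic alternative mentioned above is a two-line zoo.cfg change (the 24-hour interval is just an example):

```
# keep only the 3 most recent snapshots (3 is both the default and the minimum)
autopurge.snapRetainCount=3
# run the purge task every 24 hours; 0 (the default) disables auto-purge
autopurge.purgeInterval=24
```

Restart the ZooKeeper servers after the change for it to take effect.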
08-11-2020
05:32 AM
@ashish_inamdar Have you enabled the Ranger Hive plugin? If so, ensure the Atlas user has a Ranger policy granting it the correct database and table permissions, because once the Ranger Hive plugin has been enabled you MUST use Ranger for authorization. Hope that helps
07-26-2020
11:18 AM
1 Kudo
@mike_bronson7 log.retention.bytes is a size-based retention policy for logs, i.e. the maximum size the topic's log is allowed to grow to. Segments are pruned from the log as long as the remaining segments don't drop below log.retention.bytes.

You can also specify retention parameters at the topic level.

To specify a retention time period per topic, use the following command:

kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.ms=[DesiredRetentionTimePeriod]

To specify a retention log size per topic, use the following command:

kafka-configs.sh --zookeeper [ZooKeeperConnectionString] --alter --entity-type topics --entity-name [TopicName] --add-config retention.bytes=[DesiredRetentionLogSize]

That should resolve your problem. Happy hadooping!
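To confirm the overrides took effect, kafka-configs.sh can also describe a topic's configuration. The topic name and ZooKeeper connection string below are placeholders:

```shell
# List the per-topic overrides currently set on "my-topic"
kafka-configs.sh --zookeeper zk1.example.com:2181 --describe \
  --entity-type topics --entity-name my-topic
```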
07-23-2020
02:16 AM
@focal_fossa Great to hear, happy hadooping! To help others, please mark the answer that resolved your problem as the best answer, so that anyone searching for a similar solution can use it to resolve similar issues.
07-22-2020
09:37 AM
1 Kudo
@focal_fossa To increase HDFS capacity, give dfs.datanode.data.dir more mount points or directories; the new disk needs to be formatted and mounted prior to adding the mount point in Ambari. In HDP using Ambari, add the new mount point to the comma-separated list of directories in the dfs.datanode.data.dir property, found in hdfs-site.xml (or in the advanced section, depending on the Ambari version). The more new disks you provide through the comma-separated list, the more capacity you will have. Preferably, every machine should have the same disk and mount point structure.

You will then need to run the HDFS balancer, which re-balances data across the DataNodes by moving blocks from over-utilized to under-utilized nodes.

Running the balancer without parameters:

sudo -u hdfs hdfs balancer

This runs with the default threshold of 10%, meaning the balancer will ensure that disk usage on each DataNode differs from the overall cluster usage by no more than 10%. You can use a different threshold:

sudo -u hdfs hdfs balancer -threshold 5

This specifies that each DataNode's disk usage must be (or will be adjusted to be) within 5% of the cluster's overall usage. This process can take a long time depending on the amount of data in your cluster.

Hope that helps
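As a sketch, after mounting the new disk, updating dfs.datanode.data.dir, and restarting the DataNodes, you might confirm the added capacity before rebalancing (assumes the hdfs superuser):

```shell
# Cluster-wide and per-DataNode capacity/usage summary; the new mount
# should be reflected in "Configured Capacity"
sudo -u hdfs hdfs dfsadmin -report

# Then rebalance with a 5% threshold, as above
sudo -u hdfs hdfs balancer -threshold 5
```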
07-21-2020
07:55 AM
@Stephbat Those are internal to Cloudera, and that confirms the myth that migrations/upgrades are never smooth; we still need humans 🙂 Please make those changes and let me know if your DataNodes fire up correctly.
07-21-2020
06:50 AM
@Stephbat Please can you check these two values: dfs.datanode.max.locked.memory and ulimit. The dfs.datanode.max.locked.memory property determines the maximum amount of memory a DataNode will use for caching. The "locked-in-memory size" corresponds to the ulimit (ulimit -l) of the DataNode user, which needs to be increased to match this parameter. Your current dfs.datanode.max.locked.memory is 2 GB, while the RLIMIT_MEMLOCK is 16 MB.

If you get the error "Cannot start datanode because the configured max locked memory size… is more than the datanode's available RLIMIT_MEMLOCK ulimit," it means the operating system is imposing a lower limit on the amount of memory you can lock than what you have configured. To fix this, you must adjust the ulimit -l value that the DataNode runs with. Usually, this value is configured in /etc/security/limits.conf, but it will vary depending on what operating system and distribution you are using, so please adjust the values accordingly. Remember that you will need memory for other things as well, such as the DataNode and application JVM heaps and the operating system page cache.

Once adjusted, the DataNode should start like a charm 🙂 Hope that helps
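A minimal sketch of the check and the limits.conf change, assuming the DataNode runs as the hdfs user (2097152 KB equals the 2 GB configured in dfs.datanode.max.locked.memory):

```shell
# Show the current locked-memory limit for this shell's user, in KB
# (or "unlimited")
ulimit -l

# Persist a higher limit by adding lines like these to
# /etc/security/limits.conf (as root), then restart the DataNode
# from a fresh login session:
#   hdfs  soft  memlock  2097152
#   hdfs  hard  memlock  2097152
```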
07-21-2020
01:56 AM
@focal_fossa Can you share which method you used to extend your VM disk? What is the VM disk file extension, vmdk or vdi? Note that VirtualBox does not allow resizing vmdk images. Does your disk show "Dynamically allocated storage" in the Virtual Media Manager? Please revert
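Since VirtualBox cannot resize VMDK images in place, one common workaround is to clone the disk to VDI first and grow that. The file names and the 100 GB target below are examples, not taken from your setup:

```shell
# Convert the VMDK to a resizable VDI, then grow it (size in MB)
VBoxManage clonemedium disk sandbox.vmdk sandbox.vdi --format VDI
VBoxManage modifymedium disk sandbox.vdi --resize 102400
# Afterwards, attach the new VDI to the VM in place of the VMDK
```

The guest OS will still need its partition and filesystem extended to see the new space.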