Member since: 01-19-2017
Posts: 3676
Kudos Received: 632
Solutions: 372
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 491 | 06-04-2025 11:36 PM |
| | 1036 | 03-23-2025 05:23 AM |
| | 538 | 03-17-2025 10:18 AM |
| | 2027 | 03-05-2025 01:34 PM |
| | 1265 | 03-03-2025 01:09 PM |
08-14-2020
12:38 AM
1 Kudo
@mike_bronson7 Let me try to answer all three of your questions in one shot.

[Snapshot] ZooKeeper keeps two types of files: snapshots and transaction logs. As changes are made to the znodes, i.e. znodes are added or deleted, those changes are appended to a transaction log; occasionally, when a log grows large, a snapshot of the current state of all znodes is written to the filesystem. This snapshot supersedes all previous logs. To put it in context, it is like the edit logs and the fsimage in the NameNode architecture: every change made in HDFS is recorded in the edit logs, and when a checkpoint kicks in, the Secondary NameNode merges the edit logs with the old fsimage to incorporate the changes made since the last checkpoint. So the ZooKeeper snapshot is the analogue of the fsimage, as it contains the current state of the znode entries and ACLs.

[Snapshot policy] In the command shared earlier, the snapshot count parameter is -n <count>. If you want extra headroom you can increase it to 5 or 7, but I think 3 suffices, so I use the autopurge feature and keep only 3 snapshots and 3 transaction logs. When enabled, ZooKeeper's autopurge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in dataDir and dataLogDir respectively, and deletes the rest. It defaults to 3, which is also the minimum value.

[Corrupt snapshots] ZooKeeper might not be able to read its database and fail to come up because of file corruption in the transaction logs of the ZooKeeper server; you will see an IOException on loading the ZooKeeper database. In such a case, make sure all the other servers in your ensemble are up and working, and use the four-letter command "stat" on the client port to check that they are in good health. After you have verified that all the other servers of the ensemble are up, you can go ahead and clean the database of the corrupt server.

[Solution] Delete all the files in dataDir/version-2 and dataLogDir/version-2, then restart the server.

Hope that helps
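For the health check and the cleanup, two short sketches. First, the "stat" four-letter word sent to the client port (assumes the default port 2181 and a hypothetical hostname):

```
# Ask an ensemble member for its status; a healthy node answers with
# "Mode: leader" or "Mode: follower" plus connection statistics.
echo stat | nc zk1.example.com 2181
```

Second, the cleanup on the corrupt server, assuming dataDir=/var/lib/zookeeper and dataLogDir=/var/lib/zookeeper/datalog (adjust both paths to your zoo.cfg):

```
# Run on the corrupt server ONLY, after verifying the rest of the ensemble is healthy.
rm -f /var/lib/zookeeper/version-2/*          # snapshots (dataDir)
rm -f /var/lib/zookeeper/datalog/version-2/*  # transaction logs (dataLogDir)
```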
08-13-2020
01:36 PM
1 Kudo
@mike_bronson7 A ZooKeeper server will not remove old snapshots and log files by default; this is the responsibility of the operator, because every environment is different and the requirements for managing these files may differ from install to install. The PurgeTxnLog utility implements a simple retention policy that administrators can use. In the example below, the last <count> snapshots and their corresponding logs are retained and the others are deleted. The value of <count> should typically be greater than 3, although that is not required; this provides 3 backups in the unlikely event that a recent log has become corrupted. This can be run as a cron job on the ZooKeeper server machines to clean up the logs daily.

java -cp zookeeper.jar:lib/slf4j-api-1.6.1.jar:lib/slf4j-log4j12-1.6.1.jar:lib/log4j-1.2.15.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>

Automatic purging of the snapshots and corresponding transaction logs was introduced in version 3.4.0 and can be enabled via the configuration parameters autopurge.snapRetainCount and autopurge.purgeInterval. Hope that helps!
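If you go the built-in route, a minimal zoo.cfg sketch (values are illustrative; purgeInterval is in hours, and 0 leaves purging disabled):

```
# zoo.cfg — built-in purge settings (ZooKeeper >= 3.4.0)
autopurge.snapRetainCount=3   # keep the 3 most recent snapshots and their txn logs
autopurge.purgeInterval=24    # run the purge task every 24 hours (0 = disabled)
```

And if you prefer the PurgeTxnLog cron job, a hypothetical crontab entry (the jar, lib, and data paths are placeholders for your install):

```
# Purge nightly at 02:00, keeping the 3 most recent snapshots and txn logs.
0 2 * * * java -cp /opt/zookeeper/zookeeper.jar:/opt/zookeeper/lib/*:/opt/zookeeper/conf org.apache.zookeeper.server.PurgeTxnLog /path/to/dataDir /path/to/snapDir -n 3
```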
08-13-2020
01:09 PM
@BBFayz Below are the links with all the passwords you could be interested in. The default root password is root/hadoop, but you will be asked to change it on the first logon.

- Sandbox passwords
- Learning the ropes
- Setup Static IP on RHEL

Hope that helps
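For the first logon itself, a hedged example, assuming the HDP sandbox's usual SSH setup of port 2222 and the default hostname (both may differ in your deployment):

```
# Log in as root with the default password "hadoop"; the sandbox then
# forces you to set a new password.
ssh -p 2222 root@sandbox-hdp.hortonworks.com
```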
08-11-2020
02:33 PM
1 Kudo
@kvinod Can you share the steps you have completed so far and attach the specific errors you are encountering?
08-11-2020
05:32 AM
@ashish_inamdar Have you enabled the Ranger Hive plugin? If so, ensure the Atlas user has a Ranger policy that grants it the correct database and table permissions, because once the Ranger Hive plugin has been enabled you MUST use Ranger for authorization. Hope that helps
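A quick way to verify is to run a harmless query as the atlas user; the connect string and host below are hypothetical, so adjust them to your HiveServer2:

```
# If Ranger denies access, the error names the Ranger authorization module,
# which confirms a missing policy rather than a connectivity problem.
beeline -u "jdbc:hive2://hs2-host.example.com:10000/default" -n atlas -e "SHOW DATABASES;"
```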
07-30-2020
03:19 PM
@Seaport I would think there is a typo: the dash [-] after -put and before the HDFS path.

hdfs dfs -cat /user/testuser/stage1.tar.gz | gzip -d | hdfs dfs -put - /user/testuser/test3

Try this after removing the dash:

hdfs dfs -cat /user/testuser/stage1.tar.gz | gzip -d | hdfs dfs -put /user/testuser/test3

Hope that helps
07-30-2020
01:41 PM
@Seaport It shouldn't surprise you that Hadoop doesn't perform well with small files. With that in mind, the best solution would be to zip all your small files locally and then copy the zipped file to HDFS using copyFromLocal; its one restriction is that the source of the files can only be a local file system. I assume the local Linux box is the edge node and has the HDFS client installed; if not, you will have to copy myzipped.gz to a node that does, usually the edge node, and perform the steps below.

$ hdfs dfs -copyFromLocal myzipped.gz /hadoop_path

Then decompress the gzipped file myzipped.gz in HDFS using:

$ hdfs dfs -cat /hadoop_path/myzipped.gz | gzip -d | hdfs dfs -put - /hadoop_path2

Hope that helps
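A minimal end-to-end sketch with hypothetical file names and paths; note that gzip -d yields a single decompressed stream, so for many small files you would typically tar them first, and the result in HDFS would then be a tar archive rather than the individual files:

```
gzip -c big_local_file > myzipped.gz              # compress locally
hdfs dfs -copyFromLocal myzipped.gz /hadoop_path  # stage the .gz in HDFS
# "-put -" tells hdfs to read the bytes to write from stdin:
hdfs dfs -cat /hadoop_path/myzipped.gz | gzip -d | hdfs dfs -put - /hadoop_path2/big_file
```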
07-29-2020
03:25 PM
1 Kudo
@Stephbat Bizarre, all the symlinks should point to the newer version 3.1.4.0-315. You should recreate the symlinks to point to the new version, then re-run the steps.
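One way to repoint them on an HDP node is the hdp-select utility; a sketch, assuming it is available on the host (verify the exact version string under /usr/hdp first):

```
hdp-select versions              # list the stack versions installed on this node
hdp-select set all 3.1.4.0-315   # repoint the /usr/hdp/current symlinks to the new build
```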
07-29-2020
06:45 AM
1 Kudo
@Stephbat Did you follow steps 1 through 3 of the workaround? I see you have a similar error in the file /var/lib/ambari-agent/cache/custom_actions/scripts/remove_previous_stacks.py.

1) Go to each ambari-agent node and edit the file remove_previous_stacks.py:

# vi /var/lib/ambari-agent/cache/custom_actions/scripts/remove_previous_stacks.py

2) Go to line 77 and change it from:

all_installed_packages = self.pkg_provider.all_installed_packages()

to:

all_installed_packages = self.pkg_provider.installed_packages()

(or use the sed one-liner sketched after this list)

3) Retry the operation via curl again, substituting only the Ambari_host below, e.g.:

curl 'http://Ambari_host:8080/api/v1/clusters/<cluster_name>/requests' -u admin:admin -H "X-Requested-By: ambari" -X POST -d'{"RequestInfo":{"context":"remove_previous_stacks", "action" : "remove_previous_stacks", "parameters" : {"version":"3.1.0.0-78"}}, "Requests/resource_filters": [{"hosts":"Ambari_host"}]}'

And please revert.
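If you would rather not hand-edit every agent, a sed sketch of the same change as step 2 (it assumes the method call appears exactly once in the file; keep a backup):

```
cp /var/lib/ambari-agent/cache/custom_actions/scripts/remove_previous_stacks.py{,.bak}
sed -i 's/all_installed_packages()/installed_packages()/' /var/lib/ambari-agent/cache/custom_actions/scripts/remove_previous_stacks.py
```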