Member since: 09-29-2015
Posts: 123
Kudos Received: 216
Solutions: 47
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 9104 | 06-23-2016 06:29 PM |
|  | 3135 | 06-22-2016 09:16 PM |
|  | 6229 | 06-17-2016 06:07 PM |
|  | 2867 | 06-16-2016 08:27 PM |
|  | 6776 | 06-15-2016 06:44 PM |
12-28-2015
08:28 PM
1 Kudo
An instance of FileSystem represents a Hadoop-compatible file system. It provides APIs that map to operations on that underlying file system. A FileStatus represents metadata for one particular file or directory in a file system. You can call the method FileSystem#getFileStatus to obtain a FileStatus instance for any file or directory stored in that file system.
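To illustrate, here is a minimal sketch of calling that API; the NameNode URI and path are placeholders for your own cluster and file:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileStatusExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode URI and path; substitute your own cluster and file.
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
    FileStatus status = fs.getFileStatus(new Path("/user/chris/file1"));
    System.out.println("path             = " + status.getPath());
    System.out.println("length           = " + status.getLen());
    System.out.println("isDirectory      = " + status.isDirectory());
    System.out.println("replication      = " + status.getReplication());
    System.out.println("blockSize        = " + status.getBlockSize());
    System.out.println("modificationTime = " + status.getModificationTime());
  }
}
```

Directories work the same way; FileStatus#isDirectory distinguishes the two cases.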
12-16-2015
10:37 PM
4 Kudos
A corrupted block means that HDFS cannot find a valid replica containing that block's data. Since the replication factor is typically 3, and since the default replica placement logic spreads those replicas across different machines and racks, it's very unlikely for a typical file to lose every replica of a block and have it reported as corrupt. Some users may choose a replication factor of 1 for non-critical files that can be recreated from some other system of record. This is an optimization that saves storage at the cost of reduced fault tolerance. If the replication factor is only 1, then each block in that file is a single point of failure: loss of the node hosting that one replica will cause the block to be reported as corrupt. HDFS can also detect corruption of an individual replica caused by bit rot due to physical media failure. In that case, the NameNode schedules re-replication work to restore the desired number of replicas by copying from another DataNode with a known good replica.
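As a rough illustration of that trade-off, the replication factor of an existing file can be lowered through the FileSystem API. The path below is a hypothetical non-critical file, so treat this as a sketch rather than a recommendation:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SetReplicationExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
    // Hypothetical non-critical file that can be regenerated from another system of record.
    Path scratch = new Path("/tmp/scratch/derived-data.csv");
    // Replication factor 1 saves storage, but every block becomes a single point of failure.
    boolean scheduled = fs.setReplication(scratch, (short) 1);
    System.out.println("replication change scheduled: " + scheduled);
  }
}
```

The equivalent shell command is "hdfs dfs -setrep 1 <path>".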
12-16-2015
07:06 PM
In HDFS, the NameNode metadata consists of fsimage files (checkpoints of the entire file system state) and edit logs (a sequence of transactions to be applied on top of the base file system state captured in the most recent checkpoint).

There are various consistency checks performed by the NameNode when it reads these metadata files, and the error message indicates that one of them has failed. Specifically, the NameNode separately tracks the last known transaction ID from the edit logs in a file named seen_txid. If the transaction ID recorded in this file is not available in the edit logs when the NameNode loads metadata at startup, then it aborts.

It's difficult to say exactly how this could have happened in your environment without a deep review of configuration, logs and operations procedures. One potential explanation is that the NameNode metadata was restored from a backup that contained the most recent fsimage (the checkpoint) but did not include the edit logs (the subsequent transactions).

You might be interested in these additional resources, which give further explanation of the NameNode metadata and suggestions on a possible backup plan: http://hortonworks.com/blog/hdfs-metadata-director... https://community.hortonworks.com/questions/4694/p...
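If it helps to check this by hand, seen_txid is just a small text file in the NameNode metadata directory, alongside the fsimage and edit log files. Here is a rough diagnostic sketch; the metadata path is a placeholder that should be replaced with your own <dfs.namenode.name.dir>/current directory:

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class SeenTxidCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder: point this at <dfs.namenode.name.dir>/current on the NameNode host.
    File current = new File("/hadoop/hdfs/namenode/current");
    String seenTxid = new String(
        Files.readAllBytes(Paths.get(current.getPath(), "seen_txid")),
        StandardCharsets.UTF_8).trim();
    System.out.println("seen_txid = " + seenTxid);
    // List the checkpoint and edit log segment files so the transaction ID above
    // can be compared against the ranges they cover.
    File[] files = current.listFiles();
    if (files != null) {
      for (File f : files) {
        if (f.getName().startsWith("fsimage") || f.getName().startsWith("edits")) {
          System.out.println(f.getName());
        }
      }
    }
  }
}
```

The transaction ID printed from seen_txid should fall within the ranges covered by the edits_<start>-<end> (or edits_inprogress_<start>) segments listed; if it doesn't, the NameNode aborts at startup with an error like the one you saw.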
12-16-2015
06:32 PM
2 Kudos
Downloading an entire directory would be a recursive operation that walks the entire sub-tree, downloading each file it encounters in that sub-tree. The WebHDFS REST API alone doesn't implement any such recursive operations. (The recursive=true option for DELETE is a different case, because it's just telling the NameNode to prune the whole sub-tree. There isn't any need to traverse the sub-tree and return results to the caller along the way.) Recursion is something that would have to be implemented on the client side by listing the contents of a directory, and then handling the children returned for that directory. Depending on what you need to do, it might be sufficient to use the "hdfs dfs -copyToLocal" CLI command using a path with the "webhdfs" URI scheme and a wildcard. Here is an example:
> hdfs dfs -ls webhdfs://localhost:50070/file*
-rw-r--r-- 3 chris supergroup 6 2015-12-15 10:13 webhdfs://localhost:50070/file1
-rw-r--r-- 3 chris supergroup 6 2015-12-15 10:13 webhdfs://localhost:50070/file2
> hdfs dfs -copyToLocal webhdfs://localhost:50070/file*
> ls -lrt file*
-rw-r--r--+ 1 chris staff 6B Dec 16 10:23 file2
-rw-r--r--+ 1 chris staff 6B Dec 16 10:23 file1
In this example, the "hdfs dfs -copyToLocal" command made a WebHDFS HTTP call to the NameNode to list the contents of "/". It then filtered the returned results by the glob pattern "file*". Based on those filtered results, it then sent a series of additional HTTP calls to the NameNode and DataNodes to get the contents of file1 and file2 and write them locally. This isn't a recursive solution though. Wildcard glob matching is only sufficient for matching a static pattern and walking to a specific depth in the tree. It can't fully discover and walk the whole sub-tree. That would require custom application code.
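If you do need a full recursive download, here is a rough sketch of the kind of client-side code involved, using the Hadoop FileSystem API against a webhdfs URI. The host, port, and paths are placeholders, and a real tool would want error handling and progress reporting:

```java
import java.io.File;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WebHdfsRecursiveDownload {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode HTTP address; adjust the host and port for your cluster.
    FileSystem fs = FileSystem.get(URI.create("webhdfs://localhost:50070"), conf);
    download(fs, new Path("/data"), new File("/tmp/download"));
  }

  private static void download(FileSystem fs, Path src, File dst) throws Exception {
    // Each listStatus call is one LISTSTATUS HTTP request to the NameNode.
    for (FileStatus child : fs.listStatus(src)) {
      File localChild = new File(dst, child.getPath().getName());
      if (child.isDirectory()) {
        localChild.mkdirs();
        download(fs, child.getPath(), localChild);  // walk into the sub-directory
      } else {
        // Each file copy turns into OPEN requests served by the DataNodes.
        fs.copyToLocalFile(child.getPath(), new Path(localChild.getAbsolutePath()));
      }
    }
  }
}
```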
12-16-2015
06:18 PM
There is a group mapping provider called CompositeGroupsMapping, which is capable of combining the groups returned from multiple other group mapping providers. The user's effective group memberships are then the union of all groups returned from the underlying group mapping providers. You could potentially set up CompositeGroupsMapping to combine results from AD and the local user database. Unfortunately, I don't believe there is any step-by-step documentation available that discusses CompositeGroupsMapping. Instead, you'd need to review Apache JIRA HADOOP-8943 and the attached patch to see how it works. There are also comments in core-default.xml that show example usage. https://github.com/apache/hadoop/blob/release-2.7.1/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml#L100-L190
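One way to sanity-check whichever group mapping configuration you end up with is to ask Hadoop directly what groups it resolves for a user. Here is a small sketch using the Groups API; the username is a placeholder:

```java
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.Groups;

public class GroupLookup {
  public static void main(String[] args) throws Exception {
    // Loads core-site.xml from the classpath, including whatever
    // hadoop.security.group.mapping provider(s) you have configured.
    Configuration conf = new Configuration();
    Groups groups = Groups.getUserToGroupsMappingService(conf);
    // Placeholder username; with CompositeGroupsMapping in effect, the result
    // should be the union of the groups returned by each underlying provider.
    List<String> resolved = groups.getGroups("alice");
    System.out.println("groups for alice = " + resolved);
  }
}
```

The "hdfs groups <username>" command performs a similar check from the NameNode's point of view.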
12-15-2015
06:27 PM
1 Kudo
Hello @Emily Sharpe. There is currently no way to skip writing the CRC file when running the -getmerge command. I filed Apache JIRA HADOOP-12643 to propose an enhancement to the command that would allow skipping the write of the CRC file. In the meantime, the best option is probably to use a scripting workaround, such as the suggestion from @Neeraj Sabharwal.
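If a small program is an option instead of a shell script, here is a rough sketch of an alternative approach: do the merge through the FileSystem API with checksum writing disabled on the local side, so no .crc file is produced. The NameNode URI and paths are placeholders, and FileUtil.copyMerge is not an exact replacement for every -getmerge use case:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.LocalFileSystem;
import org.apache.hadoop.fs.Path;

public class MergeWithoutCrc {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode URI and paths.
    FileSystem hdfs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
    LocalFileSystem local = FileSystem.getLocal(conf);
    // Skip writing the .crc sidecar file on the local file system.
    local.setWriteChecksum(false);
    // Concatenate every file under /data/parts into a single local file.
    FileUtil.copyMerge(hdfs, new Path("/data/parts"),
        local, new Path("/tmp/merged.txt"),
        false /* don't delete the source */, conf, null /* no separator string */);
  }
}
```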
12-10-2015
06:17 PM
2 Kudos
If you are running an HA NameNode using Quorum Journal Manager, then running the SecondaryNameNode is not required. Actually, it would be incorrect to deploy a SecondaryNameNode alongside an HA NameNode pair.

Before implementation of HA with Quorum Journal Manager, the function of the SecondaryNameNode was to create a periodic checkpoint (a new fsimage file) of the NameNode metadata and upload it back to the NameNode. Without checkpointing, the NameNode's edit log would grow continuously. A very large edit log is problematic, because it slows down NameNode restarts: replaying a large edit log is much more time-consuming than loading a recent metadata checkpoint and applying a small edit log on top of it.

With an HA deployment, the standby NameNode in the pair takes over the responsibility of periodic checkpointing previously performed by the SecondaryNameNode. Therefore, it is unnecessary (and invalid) to run a SecondaryNameNode. If you choose not to deploy with HA for some reason, then the SecondaryNameNode is recommended so that you get periodic checkpoints.

There is more discussion of this in the Apache documentation on NameNode HA using Quorum Journal Manager, particularly the section on hardware resources. http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Hardware_resources
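For reference, checkpoint frequency (whether performed by the standby NameNode or a SecondaryNameNode) is governed by a pair of configuration properties. Here is a small sketch that prints what a cluster is using; the defaults shown are the usual out-of-the-box values, so verify them against your version's hdfs-default.xml:

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointSettings {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // dfs.* properties live in hdfs-site.xml, which is not loaded by default.
    conf.addResource("hdfs-site.xml");
    // Checkpoint after this many seconds, or this many transactions, whichever comes first.
    System.out.println("dfs.namenode.checkpoint.period = "
        + conf.get("dfs.namenode.checkpoint.period", "3600"));
    System.out.println("dfs.namenode.checkpoint.txns   = "
        + conf.get("dfs.namenode.checkpoint.txns", "1000000"));
  }
}
```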
11-24-2015
06:43 PM
6 Kudos
The proposal looks basically sound. Here are a few other factors to consider.

In addition to the files you mentioned, there is also a file named VERSION. This file is important, because it uniquely identifies the cluster and declares the version of the metadata format stored on disk. Without this file, it is impossible to restart the NameNode, so plan on including it in your backup strategy.

Deploy monitoring on both NameNodes to confirm that checkpoints are triggering regularly. This helps reduce the number of missing transactions in the event that you need to restore from a backup containing only fsimage files without the subsequent edit logs. It's good practice to monitor this anyway, because huge uncheckpointed edit logs can cause long delays after a NameNode restart while it replays those transactions. (One possible way to check checkpoint lag is sketched after the link below.)

For some additional background, here is a blog post I wrote a while ago explaining the HDFS metadata directories in more detail.
http://hortonworks.com/blog/hdfs-metadata-directories-explained/
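As one possible implementation of the checkpoint monitoring mentioned above, the NameNode's /jmx servlet exposes FSNamesystem metrics, including TransactionsSinceLastCheckpoint, that an alerting script can watch. The sketch below just prints matching lines; the NameNode HTTP address is a placeholder, and a real monitor would parse the JSON and alert on a threshold:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class CheckpointLagCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder NameNode HTTP address; adjust host/port (and scheme, if HTTPS is enabled).
    URL url = new URL(
        "http://namenode:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem");
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = in.readLine()) != null) {
        // Print only the lines containing the metrics of interest.
        if (line.contains("TransactionsSinceLastCheckpoint")
            || line.contains("LastCheckpointTime")) {
          System.out.println(line.trim());
        }
      }
    }
  }
}
```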
11-23-2015
09:49 PM
1 Kudo
@Neeraj Sabharwal, thank you. I updated the answer to show an example of overriding the property from the DistCp command line.
11-19-2015
10:19 PM
1 Kudo
There is no effective way to change block size "in place". The concept of block size is tightly tied to the on-disk layout of block files at DataNodes, so it's non-trivial to change this. As far as running a distributed job to do this, it's possible to use distcp with an override of the block size on the command line. (See the example below.) This does, however, cause a temporary doubling of the storage consumed.
> hadoop distcp -D dfs.blocksize=268435456 /input /output
> hdfs dfs -stat 'name=%n blocksize=%o' /input/hello
name=hello blocksize=134217728
> hdfs dfs -stat 'name=%n blocksize=%o' /output/hello
name=hello blocksize=268435456