About mliu

aps · ‎07-06-2021

@diplompils A false result from the recoverLease command does not necessarily mean the file is lost. Usually a file is not removed while a lease is held on it, unless it is explicitly deleted with the rm command. You can try the following:

    hdfs debug recoverLease -path <file> -retries 10

Or you may check https://issues.apache.org/jira/browse/HDFS-8576

mliu · ‎09-21-2016

You need to debug, test, and operate a Hadoop cluster, especially when you run dangerous dfsadmin commands, try customized packages built from modified Hadoop/Spark source code, or try aggressive configuration values. You have a laptop and you have a production Hadoop cluster. You don't dare to operate the production cluster blindly, which your manager appreciates. You want to try something on a Hadoop cluster where, even if you break it, no one blames you.

You have several choices (perhaps you're using one of them now):

- A pseudo-distributed Hadoop cluster on a single machine, on which it is nontrivial to run HA, to use per-node configurations, to pause and launch multiple nodes, or to test the HDFS balancer/mover, etc.
- Setting up a real cluster, which is complex and heavy to use, and which assumes you can afford a real cluster in the first place.
- Building an Ambari cluster using vbox/vmware virtual machines. Nice try, but if you run a 5-node cluster, you'll see your CPU overloaded and your memory eaten up.

How about using Docker containers instead of VirtualBox virtual machines? Caochong is a tool that does exactly this! Specifically, it outperforms its counterparts in that it is:

- Customizable: you can specify the cluster specs easily, e.g. how many nodes to launch, the Ambari version, the Hadoop version repository, and per-node Hadoop configurations. Meanwhile, you have the choice of the full Hadoop ecosystem stack: HDFS, YARN, Spark, HBase, Hive, Pig, Oozie... you name it!
- Lightweight: imagine your physical machine can run as many containers as you wish. I ran 10 without any problem (well, my laptop did slow down). Using Docker, you can also pause and start the containers (consider that you have to restart your laptop for an OS security update; you will need a snapshot, right?).
- Standard: the caochong tool employs Apache Ambari to set up a cluster. Ambari is a tool for provisioning, managing, and monitoring Apache Hadoop clusters.
- Automatic: you don't have to be an Ambari, Docker, or Hadoop expert to use it!

To use caochong, you only need to follow 9 steps. Only nine, indeed!

0. Download caochong, and install Docker.

1. [Optional] Choose the Ambari version in the from-ambari/Dockerfile file (default Ambari 2.2).

2. Run from-ambari/run.sh to set up an Ambari cluster and launch it.

    $ ./run.sh --help
    Usage: ./run.sh [--nodes=3] [--port=8080]

    --nodes  Specify the number of total nodes
    --port   Specify the port of your local machine to access Ambari Web UI (8080 - 8088)

3. Hit http://localhost:port from your browser on your local computer. The port is the parameter specified on the command line when running run.sh. By default, it is http://localhost:8080. NOTE: Ambari Server can take some time to fully come up and be ready to accept connections. Keep hitting the URL until you get the login page.

4. Log in to the Ambari webpage with the default username:password, which is admin:admin.

5. [Optional] Customize the repository Base URLs in the Select Stack step.

6. On the Install Options page, use the hostnames reported by run.sh as the Fully Qualified Domain Name (FQDN). For example:

    ------------------------------
    Using the following hostnames:
    85f9417e3d94
    9037ffd878dk
    b5077ffd9f7f
    ------------------------------

7. Upload from-ambari/id_rsa as your SSH Private Key to automatically register hosts when asked.

8. Follow the onscreen instructions to install Hadoop (YARN + MapReduce2, HDFS) and Spark.

9. [Optional] Log in to any of the nodes and you're all set to use an Ambari cluster!

    # login to your Ambari server node
    $ docker exec -it caochong-ambari-0 /bin/bash

To know more or to get updates, please star the Caochong project at GitHub.com.
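Step 3 says to keep hitting the URL until the login page appears. Below is a minimal sketch of automating that wait. It assumes a POSIX-style shell and that curl is installed on your laptop. The helper name retry_until and the attempt and delay values are made up for illustration; they are not part of caochong.

```shell
# retry_until <max_attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempts run out.
# Returns 0 on the first success, 1 if every attempt failed.
# (Hypothetical helper, not shipped with caochong.)
retry_until() {
  local max_attempts=$1
  local delay=$2
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Only pause when another attempt is still to come.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example call, assuming the default port 8080 from run.sh:
#   retry_until 60 5 curl --silent --fail --output /dev/null http://localhost:8080
```

With these example values the loop gives up after roughly five minutes; adjust them to how long Ambari Server takes to start on your machine.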

mliu · ‎08-10-2016

1. Recover the lease for the file

When you do "hdfs dfs -cat file1" from the command line, you get an exception saying that it "Cannot obtain block length for LocatedBlock". Usually this means the file is still in the being-written state, i.e., it has not been closed yet, and the reader cannot successfully identify its current length by communicating with the corresponding DataNodes. Suppose you're pretty sure the writer client is dead, killed, or has lost its connection to the servers. You're wondering what else you can do other than waiting.

    hdfs debug recoverLease -path <path-of-the-file> [-retries <retry-times>]

This command asks the NameNode to try to recover the lease for the file. Based on the NameNode log, you can then track down the specific DataNodes to understand the states of the replicas. The command may successfully close the file if there are still healthy replicas. Otherwise, you can get more internal details about the file/block state. Please refer to https://community.hortonworks.com/questions/37412/cannot-obtain-block-length-for-locatedblock.html for discussion, especially the answer by @Jing Zhao.

This is a lightweight operation, so the server should not crash if you run it. It is also idempotent, so the server should not crash if you run the command multiple times against the same path.

2. Trigger block report on DataNodes

You think a DataNode is not stable and you need to update, or you think there is a potential unknown bug in NameNode (NN) replica accounting and you need to work around it. As an operator, if you suspect such an issue, you might be tempted to restart a DN, or all of the DNs in a cluster, in order to trigger full block reports. It would be much lighter weight if you could just manually trigger a full block report (BR) instead of restarting the DN, which then has to scan all the DN data dirs, etc.

    hdfs dfsadmin -triggerBlockReport [-incremental] <datanode_host:ipc_port>

This command is here to help you. If "-incremental" is specified, it will be an incremental block report (IBR). Otherwise, it will be a full block report.

3. Verify block metadata

Say you have a replica, and you don't know whether it's corrupt.

    hdfs debug verify -meta <metadata-file> [-block <block-file>]

This command helps you verify a block's metadata. The argument "-meta <metadata-file>" is the absolute path of the metadata file on the local file system of the DataNode. The argument "-block <block-file>" is optional and specifies the absolute path of the block file on the local file system of the DataNode.

4. Dump NameNode's metadata

You want to dump the NN's primary data structures.

    hdfs dfsadmin -metasave filename

This command saves the NN's metadata to filename in the directory specified by the hadoop.log.dir property. If filename already exists, it is overwritten. The file will contain one line for each of the following:

- Datanodes heart beating with Namenode
- Blocks waiting to be replicated
- Blocks currently being replicated
- Blocks waiting to be deleted

5. Get specific Hadoop config

You want to know one specific config. Using the Ambari UI is smart, but it needs a web browser, which you don't always have when you SSH to the cluster from home. So you turn to the configuration files and search for the config key. Then you find something special in the configuration files. For example:

- Embedded file substitutions. XInclude in XML files is popular in the Hadoop world, you know.
- Property substitutions. Config A's value refers to config B's value, which in turn refers to config C's value.

You're dealing with an urgent issue and are tired of parsing the configuration files manually. You can do better. "Any scripting approach that tries to parse the XML files directly is unlikely to accurately match the implementation as its done inside Hadoop, so it's better to ask Hadoop itself." It's always been true.

    hdfs getconf -confKey <key>

This command shows you the actual, final value of any configuration property as Hadoop actually uses it. Interestingly, it can check configuration properties for YARN and MapReduce, not only HDFS. Tell your YARN friends about this. For more information, please refer to the Stack Overflow discussion, especially the answers by @Chris Nauroth.
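When you need several properties at once, the getconf call can be wrapped in a small loop. This is only a sketch under a few assumptions: the function name show_conf and the HDFS_CMD override variable are invented here for illustration, and it treats a non-zero exit status from the command as "lookup failed" without inspecting the error text.

```shell
# show_conf <key>...
# Prints "key=value" for each key by asking Hadoop itself through
# "hdfs getconf -confKey". HDFS_CMD is a hypothetical override that lets
# you point at a different hdfs binary (or a stub when testing).
show_conf() {
  local hdfs_cmd=${HDFS_CMD:-hdfs}
  local key
  local value
  for key in "$@"; do
    if value=$("$hdfs_cmd" getconf -confKey "$key" 2>/dev/null); then
      printf '%s=%s\n' "$key" "$value"
    else
      # The command exited non-zero, e.g. because the key is not set.
      printf '%s=<lookup failed>\n' "$key"
    fi
  done
}
```

For example, `show_conf dfs.replication yarn.resourcemanager.address mapreduce.framework.name` would print one line per key, covering HDFS, YARN, and MapReduce properties in one go.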

liuml07 · ‎05-13-2018

@Dinesh Chitlangia You are right. Homebrew has deprecated `homebrew/versions`, and the newly migrated `protobuf@2.5` installation does not change the default bin in PATH. As you suggested, you can link it into your PATH, or follow the instructions from `brew info protobuf@2.5`.

namaheshwari · ‎05-30-2016

Thanks @Jing Zhao for the answer.

Last Visited	‎03-23-2017 12:35 AM

Member Since	‎09-15-2015 02:08 AM
Posts	14
Kudos received	40

Cloudera Community

Re: Check opening files on HDFS

Re: HDP support of the Chinese language

Re: Cannot obtain block length for LocatedBlock

Setting up a Hadoop/Spark cluster with Docker on a...

5 Infrequently Known Commands To Debug Your HDFS I...

Re: Setting Hadoop development environment on Mac ...

Re: Possible reasons that cause slow name node (NN...