Member since 07-30-2019
Posts: 111
Kudos Received: 185
Solutions: 35
My Accepted Solutions
Views | Posted |
---|---|
2992 | 02-07-2018 07:12 PM |
2211 | 10-27-2017 06:16 PM |
2525 | 10-13-2017 10:30 PM |
4738 | 10-12-2017 10:09 PM |
1175 | 06-29-2017 10:19 PM |
12-01-2020
02:52 PM
The DataNodes should run the same software version as the NameNode.
08-01-2019
10:28 AM
"I'm assuming you mean just to store the metadata of the changed snapshot, which isn't significant given the actual size of data held (in reference to my example above)."
Correct. However, the metadata is tracked in NameNode memory, which is a precious resource. The overhead can be significant in a large cluster with many files and millions of deltas.
08-01-2019
09:36 AM
The snapshot will not occupy any storage space on disk or NameNode heap immediately after it is created. However, any subsequent changes inside the snapshottable directory need to be tracked as deltas, and that can result in both higher disk space and NameNode heap usage. E.g. if a file is deleted after taking a snapshot, its blocks cannot be reclaimed because the file is still accessible through the snapshot path. The hadoop fs -du shell command supports a -x option that calculates directory space usage excluding snapshots. The difference between the output with and without the -x option tells you how much disk space is being consumed by snapshots.
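As a rough sketch of the comparison described above, the two -du totals can be subtracted with a small shell helper. The function name snapshot_delta and the example path /data/warehouse are hypothetical, and the sketch assumes the first column of hadoop fs -du -s output is the byte count.

```shell
# Hypothetical helper: bytes attributable to snapshots.
# $1 = total from `hadoop fs -du -s <dir>`     (snapshots included)
# $2 = total from `hadoop fs -du -s -x <dir>`  (snapshots excluded)
snapshot_delta() {
  echo $(( $1 - $2 ))
}

# On a live cluster the inputs might be gathered like this (not run here;
# /data/warehouse is an assumed snapshottable directory):
#   with_snap=$(hadoop fs -du -s /data/warehouse | awk '{print $1}')
#   without_snap=$(hadoop fs -du -s -x /data/warehouse | awk '{print $1}')
#   snapshot_delta "$with_snap" "$without_snap"
```

A result of 0 would suggest that no blocks are currently being held only by snapshots under that directory.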
02-27-2018
10:20 PM
2 Kudos
Building Apache Tez with Apache Hadoop 2.8.0 or later fails due to client/server jar separation in Hadoop [1]. The build fails with the following error:
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /src/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[48,30] cannot find symbol
symbol: class DistributedFileSystem
location: package org.apache.hadoop.hdfs
[ERROR] /src/tez/tez-api/src/test/java/org/apache/tez/client/TestTezClientUtils.java:[680,50] cannot find symbol
symbol: class DistributedFileSystem
location: class org.apache.tez.client.TestTezClientUtils
[ERROR] /src/tez/ecosystem/tez/tez-api/src/test/java/org/apache/tez/common/TestTezCommonUtils.java:[62,42] cannot access org.apache.hadoop.hdfs.DistributedFileSystem
To get Tez to compile successfully, you will need to use the new hadoop28 profile introduced by TEZ-3690 [2]. E.g. here is how you compile Tez with Apache Hadoop 3.0.0:
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true -Phadoop28 -Dhadoop.version=3.0.0
References:
1. HDFS-6200: Create a separate jar for hdfs-client
2. TEZ-3690: Tez on hadoop 3 build failed due to hdfs client/server jar separation.
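If the same build script has to target several Hadoop versions, the profile choice can be derived from the version string. The following is only a sketch: the function names needs_hadoop28_profile and tez_mvn_args are hypothetical, and it assumes a sort that supports -V (GNU coreutils).

```shell
# Succeeds (exit 0) when version $1 is 2.8.0 or later.
# Relies on `sort -V` (version sort), which is an assumption about the host.
needs_hadoop28_profile() {
  [ "$(printf '%s\n' "2.8.0" "$1" | sort -V | head -n1)" = "2.8.0" ]
}

# Prints the Maven arguments for building Tez against Hadoop version $1,
# adding -Phadoop28 only when the version calls for it.
tez_mvn_args() {
  tez_args="clean package -DskipTests=true -Dmaven.javadoc.skip=true -Dhadoop.version=$1"
  if needs_hadoop28_profile "$1"; then
    tez_args="$tez_args -Phadoop28"
  fi
  echo "$tez_args"
}

# Example (not run here): mvn $(tez_mvn_args 3.0.0)
```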
02-07-2018
07:12 PM
1 Kudo
There is no such thing as a "passive" NameNode. Are you asking about the HA or non-HA configuration?
In an HA configuration there are two roles:
1. Active NameNode, which serves user requests.
2. Standby NameNode, which generates periodic checkpoints. It can also take over the role of the Active if the previously active NameNode dies or becomes unresponsive.
In a non-HA configuration there are also two roles:
1. Primary NameNode, which serves user requests.
2. Secondary NameNode, which generates periodic checkpoints. A Secondary NameNode can never become the primary.
The terminology is unfortunately confusing.
10-27-2017
06:16 PM
Try clearing up some snapshots. You probably have a ton of deleted files retained for snapshots.
10-27-2017
06:07 PM
Did you enable security using the Ambari Kerberos wizard? That usually takes care of these settings for you.
10-27-2017
04:58 PM
A few things to check for:
1. Are you starting the DataNode process as root?
2. Have you set HADOOP_SECURE_DN_USER and JSVC_HOME?
3. Since you are using a privileged port number (<1024), ensure you have not set dfs.data.transfer.protection.
The Apache Hadoop documentation for Secure DN setup is good: https://hadoop.apache.org/docs/r2.7.4/hadoop-project-dist/hadoop-common/SecureMode.html#Secure_DataNode
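The environment part of the checklist above can be verified with a short shell function before starting the DataNode. This is a minimal sketch: check_secure_dn_env is a hypothetical name, and it only covers the two environment variables, not the root check or the dfs.data.transfer.protection setting in hdfs-site.xml.

```shell
# Hypothetical pre-flight check for the secure DataNode environment.
# Returns non-zero and prints a message for each variable that is unset or empty.
check_secure_dn_env() {
  rc=0
  if [ -z "${HADOOP_SECURE_DN_USER:-}" ]; then
    echo "HADOOP_SECURE_DN_USER is not set"
    rc=1
  fi
  if [ -z "${JSVC_HOME:-}" ]; then
    echo "JSVC_HOME is not set"
    rc=1
  fi
  return $rc
}

# Example values as they might appear in hadoop-env.sh (both are assumptions):
#   export HADOOP_SECURE_DN_USER=hdfs
#   export JSVC_HOME=/usr/lib/bigtop-utils
```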
10-26-2017
07:12 PM
It is likely the process has not hit an allocation failure yet, so GC has not kicked in. This is perfectly normal. If you want the heap usage to be lower, you can reduce the heap allocation. Alternatively, you can trigger GC sooner by adding something like -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly to your heap options. However, it's probably best to just follow our suggested heap configuration and let the Java runtime do the rest: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_command-line-installation/content/configuring-namenode-heap-size.html
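For illustration, the two flags could be added in hadoop-env.sh roughly as follows. This sketch assumes the process in question is the NameNode, that the Hadoop 2.x variable HADOOP_NAMENODE_OPTS is in use, and that the JVM still ships the CMS collector (e.g. Java 8); the -XX:+UseConcMarkSweepGC flag is included because the occupancy settings only apply to CMS.

```shell
# Assumed hadoop-env.sh fragment: start CMS cycles once the old generation
# is about 70% full instead of waiting for the JVM's own heuristics.
export HADOOP_NAMENODE_OPTS="-XX:+UseConcMarkSweepGC \
-XX:CMSInitiatingOccupancyFraction=70 \
-XX:+UseCMSInitiatingOccupancyOnly \
${HADOOP_NAMENODE_OPTS:-}"
```

Any existing options are kept at the end of the string, so later settings in the original value still take precedence where the JVM honors the last occurrence.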
10-13-2017
10:30 PM
@Dr. Jason Breitweg, it will not be deleted automatically. There may be block files under that directory that you need. If the cluster has any important data, I'd recommend running 'hdfs fsck' to ensure there are no missing/corrupt blocks before you delete /var/hadoop/hdfs/data/current/BP-*. Even then, I'd first move the directory to a different location, restart the DataNodes, and rerun fsck to ensure you don't cause data loss.
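The order of operations above might be scripted along these lines. This is only a sketch: fsck_is_healthy is a hypothetical helper, the backup directory is an assumed path, and the check relies on the fsck summary containing the words "is HEALTHY", which should be confirmed against the output of your Hadoop version.

```shell
# Hypothetical helper: reads `hdfs fsck` output on stdin and succeeds only
# if the summary reports the filesystem as healthy (assumed wording).
fsck_is_healthy() {
  grep -q "is HEALTHY"
}

# Intended sequence on a DataNode host (not run here; the backup path is an assumption):
#   hdfs fsck / | fsck_is_healthy || { echo "resolve missing/corrupt blocks first"; exit 1; }
#   mkdir -p /var/hadoop/hdfs/bp-backup
#   mv /var/hadoop/hdfs/data/current/BP-* /var/hadoop/hdfs/bp-backup/
#   # restart the DataNodes, then:
#   hdfs fsck / | fsck_is_healthy && echo "safe to remove the backup copy"
```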