Member since
05-15-2019
42
Posts
20
Kudos Received
2
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1823 | 11-21-2017 04:48 AM
 | 537 | 05-02-2017 02:55 AM
10-05-2018
05:41 PM
This could be a KDC server issue. Please check whether you are hitting this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1560951
11-21-2017
05:42 AM
@suresh krish As the referenced article also suggests, if you need to recover accidentally deleted files from a production cluster's HDFS, it is recommended to immediately stop all the DataNodes in the cluster and engage support to walk through the process. When recovering production data, it is very important to have a clear understanding of the recovery procedure, to know all the precautions and checks involved, and to be confident about how to proceed if any of the steps fail.
11-21-2017
05:17 AM
If you have 3 or more ZooKeeper servers, you can carry out these steps on each ZooKeeper one by one, in a rolling fashion, keeping the ZooKeeper quorum intact. Otherwise, the steps you mention are fine: copy all files inside the current dataDir ('myid', 'version-2') to the new directory, update the 'dataDir' and 'dataLogDir' (if configured separately) properties in zoo.cfg, set the directory ownership (recursively) to the zk service user, and restart the ZooKeeper server. During startup, ZooKeeper loads the latest 'snapshot' file and replays the transaction log to rebuild its state. Follower ZooKeepers also sync with the leader for the current state. A hedged sketch of the commands is below.
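For illustration, a minimal sketch of the move, assuming the old dataDir is /hadoop/zookeeper, the new one is /data/zookeeper and the service user is 'zookeeper' (all of these are assumptions - adjust to your environment), with the zookeeper server on this node stopped first:
# mkdir -p /data/zookeeper
# cp -a /hadoop/zookeeper/myid /hadoop/zookeeper/version-2 /data/zookeeper/    (copy myid and the version-2 state)
# chown -R zookeeper:hadoop /data/zookeeper    (recursive ownership for the zk service user)
Then edit /etc/zookeeper/conf/zoo.cfg to point 'dataDir' (and 'dataLogDir', if set) at /data/zookeeper, start the zookeeper server again, and verify with:
# echo 'stat' | nc <ZK_HOST> 2181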
11-21-2017
04:48 AM
1 Kudo
Verify that message.max.bytes (broker level) or max.message.bytes (topic level) is set to a value large enough for your messages. And to enable the consumers to read from this topic, set fetch.message.max.bytes accordingly as well; it must be at least as large as the maximum message size.
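For example, to allow messages of up to roughly 10 MB (the 10485760 value and the topic name are illustrative; these property names apply to the older, pre-0.9-style consumer configs shipped with HDP Kafka):
# Broker side, in server.properties (also check replica.fetch.max.bytes so followers can replicate large messages):
message.max.bytes=10485760
replica.fetch.max.bytes=10485760
# Or as a per-topic override:
# bin/kafka-topics.sh --zookeeper <ZK_HOST>:2181 --alter --topic <TOPIC> --config max.message.bytes=10485760
# Consumer side:
fetch.message.max.bytes=10485760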
05-27-2017
04:24 AM
@Mathi Murugan The '0 regions per RegionServer' stat in Ambari indicates that the HMaster may still be completing its initialization, since region assignments for the namespace and meta tables are still in progress. You may want to restart the complete HBase service and/or run 'tail -f' on the HMaster and RegionServer logs to see what they are currently doing. Another place to look is the HBase Master UI - use the 'Quick Links' from the HBase service page to navigate to it.
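For a quick look at region counts per RegionServer from the command line, something like the following should work (output format varies by HBase version):
# echo "status 'simple'" | hbase shell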
05-26-2017
04:55 PM
7 Kudos
Zookeeper is one of the most critical components in an HDP cluster, but it is also usually given the least attention when tuning a cluster for performance and troubleshooting slowness. Here is a basic checklist one should go through to ensure that Zookeeper is running fine. Let's keep the zookeeper happy so it can better manage the occupants of the zoo 🙂
1. Are all the Zookeeper servers given dedicated disks for the transaction log directory ('dataDir' / 'dataLogDir')? It is very important to have fast disks to complete the 'fsync' of new transactions to the log, which zookeeper writes before applying any update and before sending a response back to the client. A slow 'fsync' of the transaction log is one of the most common causes of slow zookeeper responses seen in the past. Zookeeper's disk space requirement is usually not very high, and one might wonder whether it is worth dedicating a complete disk to the zookeeper log directory, but it is required to prevent I/O from other applications/processes keeping the disk busy. Common symptoms when zookeeper sees slow writes to the transaction log are:
Services such as the NameNode ZKFC and HBase RegionServers, which use ephemeral znodes to track their liveness, shut down after repeated zookeeper connection timeouts. The zookeeper server log frequently reports errors such as: WARN [SyncThread:2:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:2 took 7050ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2. Is the zookeeper process given enough heap memory for the number of znodes, clients and watchers connecting to it? To arrive at the right zookeeper heap size, one has to run load tests and estimate the required heap. Insufficient memory allocation hurts performance once zookeeper goes through very frequent GC cycles as heap usage approaches 100% of the allocation. The following four-letter zookeeper commands provide a lot of useful information about the running instances:
# echo 'stat' | nc <ZK_HOST> 2181
# echo 'mntr' | nc <ZK_HOST> 2181
In the output of the above commands, watch the numbers for stats such as znode count, number of watchers, number of client connections and max/avg latency, among other things. In most cases a heap size between 2GB and 4GB should be good, but as mentioned above, this depends on the kind of load on the zookeeper. In addition to the 'four letter' commands, it is also recommended to keep an eye on the growing heap and the GCs, especially during periods of slowness, using tools such as:
# sudo su - zookeeper ; jmap -heap <ZK_PID>
# sudo su - zookeeper ; jstat -gcutil <ZK_PID> 2000 10
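For reference, 'mntr' prints key/value stats along these lines (the values below are made up for illustration, and the exact set of keys varies by zookeeper version):
zk_avg_latency 1
zk_max_latency 582
zk_num_alive_connections 78
zk_outstanding_requests 0
zk_znode_count 18250
zk_watch_count 3420
zk_ephemerals_count 145
zk_approximate_data_size 5242880
zk_server_state follower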
3. Are there too many zookeepers in the ensemble? Three ZooKeeper servers is the minimum recommended ensemble size, and in most cases three are good enough too. A larger ensemble gives more reliability (a 7-node ensemble can withstand the loss of 3 nodes, compared to a tolerance of 1 lost node for a 3-node ensemble) and better read throughput when a large number of concurrent clients are connected, but it can slow down writes, since every update/write operation has to be committed by at least half of the nodes in the ensemble. Some alternatives to prevent the slower writes arising from larger ensembles are:
- Use a dedicated zookeeper ensemble for certain workloads in the cluster.
- For larger ensembles, use zookeeper observers - ref. http://zookeeper.apache.org/doc/trunk/zookeeperObservers.html (although configuring zookeeper observers is not supported in the current Ambari version as of this writing).
4. Are the 'dataDir' / 'dataLogDir' filling up too fast? As mentioned above, every transaction on the zookeepers is written to the transaction log file. When a large number of concurrent ZK clients continuously connect and make very frequent updates, possibly due to an error condition at the client, the transaction log can roll over multiple times per minute as its size grows steadily, producing a large number of snapshot files as well. This can eventually cause the disks to run out of free space. For such issues, one has to identify and fix the client application. Review the stats from above, in addition to the zookeeper logs and/or the latest transaction log, to find the latest updates on the znodes using the 'LogFormatter' tool:
# java -cp /usr/hdp/current/zookeeper-server/*:/usr/hdp/current/zookeeper-server/lib/* org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/log.xxxxx
Further, the zookeeper properties 'autopurge.snapRetainCount' and 'autopurge.purgeInterval' should be tuned to the required retention count and frequency, to limit the growing number of transaction log and snapshot files.
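For example, a zoo.cfg fragment that enables automatic purging might look like this (the values are illustrative; pick a retention that suits your recovery requirements):
# Keep only the 3 most recent snapshots and their corresponding transaction logs
autopurge.snapRetainCount=3
# Run the purge task every 24 hours (0, the default, disables autopurge)
autopurge.purgeInterval=24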
Tags: Hadoop Core, How-To/Tutorial, Zookeeper, zookeeper-performance, zookeeper-slow
05-02-2017
02:55 AM
1 Kudo
Although this was answered earlier over a support ticket, updating the details here for any future visitor. Phoenix in HDP 2.5 and above includes PHOENIX-1734, where Phoenix local indexes are co-located in the same region as the corresponding data, though in a different column family. In the above explain plan, the output actually shows that the local index is in use: "RANGE SCAN OVER TEST_TABLE [1,'v1-2']" --> means it is a range scan (instead of a full table scan otherwise) on the data table, using the local index (type 1 = local index) and the given value "v1-2".
05-02-2017
02:42 AM
@srini Sorry, I missed this question earlier. Yes, the schema applies to an index as well. If no schema is associated with a table, then you would just use the table name or index name in the command, as in:
> alter table INDEXNAME SET COMPRESSION=snappy;
I hope that answers the question.
05-01-2017
02:15 PM
Phoenix uses its own encoding for the ASCII fields, hence the difference when you load the data directly into the HBase table. In this case it is required to load the data into the Phoenix table instead of directly into HBase. Since loading data directly from Sqoop to Phoenix is not a supported feature yet (SQOOP-2649), one of the options here would be: oracle -> csv -> phoenix.
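As a rough sketch of that path (the table, host and file names are made up; psql.py ships with the Phoenix client):
# sqoop import --connect jdbc:oracle:thin:@//dbhost:1521/ORCL --username scott -P --table MY_TABLE --as-textfile --fields-terminated-by ',' --target-dir /tmp/my_table_csv    (Oracle -> CSV on HDFS)
# hdfs dfs -getmerge /tmp/my_table_csv /tmp/my_table.csv    (psql.py reads a local file)
# /usr/hdp/current/phoenix-client/bin/psql.py -t MY_TABLE <ZK_HOST>:2181:/hbase-unsecure /tmp/my_table.csv    (CSV -> Phoenix, applying Phoenix encoding)
For larger datasets, the MapReduce-based org.apache.phoenix.mapreduce.CsvBulkLoadTool can load the CSV directly from HDFS instead.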
04-24-2017
06:27 PM
Although this is an old post, updating here for any future visitors: in most cases this issue occurs because one or more of the brokers is unavailable, if not for any of the reasons mentioned above. The controller node completes a topic deletion only after all of the topic's partition replicas have been removed from all the brokers. So validate which brokers are currently online by checking the following znode, since sometimes, although the broker processes are in running status, they may not actually be part of the cluster for reasons such as memory contention and continuous GC cycles:
zk> ls /brokers/ids
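For reference, from the ZooKeeper CLI that check (plus a look at deletions the controller has not finished yet, kept under /admin/delete_topics) would look something like:
# /usr/hdp/current/zookeeper-client/bin/zkCli.sh -server <ZK_HOST>:2181
zk> ls /brokers/ids          (broker ids currently registered, e.g. [1001, 1002, 1003])
zk> ls /admin/delete_topics  (topics whose deletion is still pending)
Any broker id missing from /brokers/ids while its process shows as 'running' is a likely culprit.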
04-03-2017
08:29 PM
2 Kudos
@srini
While creating a new secondary index for a table, we can use a command such as the one below to specify the compression type:
> create index INDEXNAME on SCHEMA.TABLENAME(COLUMN) COMPRESSION=snappy;
And to alter the compression for an existing index table, run the following command from Phoenix (notice that the command is 'alter table', not 'alter index', here):
> alter table SCHEMA.INDEXNAME SET COMPRESSION=snappy;
02-08-2017
06:37 PM
A quick check would be to ensure that you've used the right parent znode in the connection string (you've used /hbase), as defined by 'zookeeper.znode.parent' in hbase-site.xml. It usually looks like /hbase-secure or /hbase-unsecure, depending on whether Kerberos is enabled in the cluster. And if that looks good, validate the connection string using sqlline.py:
/usr/hdp/current/phoenix-client/bin/sqlline.py 10.40.17.183,10.40.17.155,10.40.17.129:2181:/hbase
02-07-2017
10:17 AM
@Sonny Heer I missed your questions earlier. To answer: 1. This approach is more suitable if you were to use physical hardware and run HDP cluster nodes on Docker instances. 2. Instead of using Docker Compose to define the cluster, here we prepare Ambari agent/server images for various versions and use the scripts to set up the required infrastructure and bring up a cluster. 3. At the moment we can't enable Kerberos right when the cluster is created, but there are scripts in there to quickly kerberize an HDP cluster once it is created.
02-07-2017
09:27 AM
1 Kudo
@Joginder Sethi It appears that the 'install' script didn't really go through well, and because of that the Docker instance 'overlay-gatewaynode' didn't get created. Did you hit any error during the run of './install.sh'? How many nodes do you have in the Docker cluster? And what is the OS and Docker version on these servers? You may have to re-run the 'install.sh' script.
01-02-2017
11:58 AM
4 Kudos
PROBLEM
Zookeeper transaction logs and snapshot files are created very frequently (multiple files every minute), filling up the filesystem in a very short time.
ROOT CAUSE
One or more applications are creating or modifying the znodes too frequently, causing too many transactions in a short duration. This leads to the creation of too many transaction log files and snapshot files, since they get rolled over after 100,000 entries by default (as defined by the zookeeper property 'snapCount').
RESOLUTION
The resolution for such cases involves reviewing the zookeeper transaction logs to find the znodes that are updated/created most frequently, using the following command on one of the zookeeper servers:
# cd /usr/hdp/current/zookeeper-server
# java -cp zookeeper.jar:lib/* org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/logxxx
(where 'dataDir' is set to '/hadoop/zookeeper' in the zookeeper configuration)
Once the frequently updated znodes are identified using the above command, one should continue with fixing the related application that is creating such a large number of updates on zookeeper.
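To spot the hot znodes quickly, the LogFormatter output can be piped through standard shell tools; a rough one-liner (the exact output format differs across zookeeper versions, so the grep pattern may need adjusting):
# java -cp zookeeper.jar:lib/* org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/logxxx | grep -o '/[^ ]*' | sort | uniq -c | sort -rn | head
This prints the znode paths that appear most often in the transaction log, which usually points straight at the misbehaving client.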
An example of an application that can cause this problem is HBase, when a very large number of regions are stuck in transition and repeatedly fail to come online.
Tags: Hadoop Core, HBase, Issue Resolution, Zookeeper
01-02-2017
11:44 AM
3 Kudos
Repo Info
Github Repo URL: https://github.com/rmaruthiyodan/docker-hdp-lab
Github account name: rmaruthiyodan
Repo name: docker-hdp-lab
Tags: docker, multi-node, Sandbox & Learning, utilities
08-31-2016
06:10 AM
It looks like the HBase configuration is not being picked up by the Phoenix client. Please check by setting either of the following environment variables:
export HBASE_CONF_DIR=/etc/hbase/conf
Or
export HBASE_CONF_PATH=/etc/hbase/conf
05-02-2016
09:21 AM
Hi Sumit, you may also want to verify that the ulimit that is set is actually applied to the process:
# cat /proc/<Region Server PID>/limits
It is possible that the user limits are somehow overridden when the process starts up.
12-07-2015
05:34 AM
1 Kudo
Is it supported to modify an existing cluster's NameNode logical name (dfs.nameservices) in an HA configuration? I was able to get dfs.nameservices renamed using the following steps, but I want to confirm whether this could have issues that I'm unaware of at this time. The steps involved re-creating the NN HA znode, re-initializing the shared edits for the journal nodes and performing the bootstrap again for the standby NameNode (run the commands as the hdfs user):
1) Turn on safemode:
$ hdfs dfsadmin -safemode enter
Safe mode is ON in rm-hdp23n1.novalocal/172.25.17.33:8020
Safe mode is ON in rm-hdp23n3.novalocal/172.25.16.71:8020
2) Perform NameNode checkpointing:
$ hdfs dfsadmin -saveNamespace
Save namespace successful for rm-hdp23n1.novalocal/172.25.17.33:8020
Save namespace successful for rm-hdp23n3.novalocal/172.25.16.71:8020
3) From Ambari, stop all HDFS services.
4) Make the appropriate property changes. Update "fs.defaultFS" in core-site.xml, and then modify all the properties in hdfs-site.xml that are related to the HA service name. For instance, in my cluster I changed the following properties or their values:
"dfs.client.failover.proxy.provider.cluster456" : "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider",
"dfs.ha.namenodes.cluster456" : "nn1,nn2",
"dfs.namenode.http-address.cluster456.nn1" : "rm-hdp23n1.novalocal:50070",
"dfs.namenode.http-address.cluster456.nn2" : "rm-hdp23n3.novalocal:50070",
"dfs.namenode.https-address.cluster456.nn1" : "rm-hdp23n1.novalocal:50470",
"dfs.namenode.https-address.cluster456.nn2" : "rm-hdp23n3.novalocal:50470",
"dfs.namenode.rpc-address.cluster456.nn1" : "rm-hdp23n1.novalocal:8020",
"dfs.namenode.rpc-address.cluster456.nn2" : "rm-hdp23n3.novalocal:8020",
"dfs.namenode.shared.edits.dir" : "qjournal://rm-hdp23n2.novalocal:8485;rm-hdp23n3.novalocal:8485;rm-hdp23n1.novalocal:8485/cluster456",
"dfs.nameservices" : "cluster456",
5) Then start only the journal nodes (with ZKFC and both NameNodes still in the stopped state) and re-initialize the shared edits:
$ hdfs namenode -initializeSharedEdits -force
6) Initialize the zk node for NN HA:
$ hdfs zkfc -formatZK -force
7) Then start the NameNodes and ZKFC on both the nodes.
Labels: Apache Hadoop
11-05-2015
05:07 AM
Env: HDP 2.1.10 / NameNode HA. Randomly the active NameNode hits an issue where the ZKFC loses its connection to the NN's 8020 port. The process appears to be hung at that time, and since the fencing does not actually kill the NN process, the NN later wakes up and shuts itself down after it attempts to write to the journal nodes, which fails because the journal nodes' epoch was incremented once the previously standby NN became active. For reference, attached are the log excerpts from the NNs, ZKFCs and a journal node: sequence-of-events-1.txt. We don't find much time spent in GC at the time of the problem. The issue was encountered three times on 28-10 and we have the logs for two of the occurrences: 10:17 and 11:19. And we find the following log entries in the NN log at around the time of the problem:
(1)
2015-10-28 10:15:54,718 INFO hdfs.StateChange (FSNamesystem.java:completeFile(3004)) - DIR* completeFile: /data/jvc/prd02/raw/geact/current/.distcp.tmp.attempt_1437114942510_80591_m_000002_0 is closed by DFSClient_attempt_1437114942510_80591_m_000002_0_1495077679_1
2015-10-28 10:15:54,911 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 30001 milliseconds
2015-10-28 10:17:00,577 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(202)) - Scanned 7859 directive(s) and 128194 block(s) in 65666 millisecond(s).
2015-10-28 10:17:00,578 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 65666 milliseconds
2015-10-28 10:17:00,578 INFO ipc.Server (Server.java:run(1990)) - IPC Server handler 30 on 8020: skipped org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from <IP-A>:53951 Call#400424 Retry#0
2015-10-28 10:17:00,578 INFO ipc.Server (Server.java:run(1990)) - IPC Server handler 2 on 8020: skipped org.apache.hadoop.hdfs.server.protocol.NamenodeProtocol.rollEditLog from <IP-B>:52014 Call#47097 Retry#0
2015-10-28 10:17:00,578 INFO ipc.Server (Server.java:run(1990)) - IPC Server handler 21 on 8020: skipped org.apache.hadoop.ha.HAServiceProtocol.transitionToStandby from <IP-B>:52031 Call#413318 Retry#0
(2)
2015-10-28 11:18:47,007 DEBUG BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1407)) - BLOCK* neededReplications = 0 pendingReplications = 0
2015-10-28 11:18:47,906 INFO namenode.FSNamesystem (FSNamesystem.java:listCorruptFileBlocks(6216)) - list corrupt file blocks returned: 0
2015-10-28 11:18:48,387 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 30001 milliseconds
2015-10-28 11:19:32,560 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(202)) - Scanned 7859 directive(s) and 128182 block(s) in 44172 millisecond(s).
2015-10-28 11:19:32,560 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 44172 milliseconds
2015-10-28 11:20:10,493 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(202)) - Scanned 7859 directive(s) and 128173 block(s) in 37934 millisecond(s).
2015-10-28 11:20:10,493 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 37934 milliseconds
2015-10-28 11:20:10,495 INFO ipc.Server (Server.java:run(1990)) - IPC Server handler 5 on 8020: skipped org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus from <IP-A>:58463 Call#406856 Retry#0
2015-10-28 11:20:10,496 INFO namenode.FSNamesystem (FSNamesystem.java:listCorruptFileBlocks(6216)) - list corrupt file blocks returned: 0
2015-10-28 11:20:48,399 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(202)) - Scanned 7859 directive(s) and 128173 block(s) in 37906 millisecond(s).
2015-10-28 11:20:48,399 INFO blockmanagement.CacheReplicationMonitor (CacheReplicationMonitor.java:run(177)) - Rescanning after 37906 milliseconds
2015-10-28 11:20:48,400 DEBUG BlockStateChange (BlockManager.java:computeReplicationWorkForBlocks(1407)) - BLOCK* neededReplications = 0 pendingReplications = 0
2015-10-28 11:20:48,401 INFO namenode.FSNamesystem (FSNamesystem.java:listCorruptFileBlocks(6216)) - list corrupt file blocks returned: 0
The HDFS audit log shows the operations/min stats as:
(1)
10768 2015-10-28 10:11
1068 2015-10-28 10:12
26569 2015-10-28 10:13
6003 2015-10-28 10:14
8305 2015-10-28 10:15
46 2015-10-28 10:17
28498 2015-10-28 10:26
(2)
15098 2015-10-28 11:15
20219 2015-10-28 11:16
21364 2015-10-28 11:17
17884 2015-10-28 11:18
22 2015-10-28 11:19
But can the blockmanagement.CacheReplicationMonitor scan cause this problem? (Note that the problem is not seen during every blockmanagement.CacheReplicationMonitor scan operation.)
Tags: Hadoop Core, HDFS
Labels: Apache Hadoop
10-28-2015
08:44 AM
This is a secure HDP 2.3 cluster, and the zookeeper services run as a non-default service user. Is it supported to configure a kerberized Kafka cluster to connect to zookeepers that have non-default service users?
Labels: Apache Kafka
10-07-2015
03:27 AM
@aagarwal@hortonworks.com Thanks for confirming, and for the details.
10-05-2015
06:03 AM
Referring to the guidelines at http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html , some of the properties required (dfs.namenode.http-bind-host and dfs.namenode.https-bind-host) were introduced only in Hadoop 2.5 (ref. https://issues.apache.org/jira/browse/HDFS-6273). Since HDP 2.1.10 is based on Hadoop 2.4, does it support configuring a multi-homed cluster in such an environment?
Tags: Hadoop Core, HDFS
Labels: Apache Hadoop