Member since: 06-16-2016 | Posts: 43 | Kudos Received: 22 | Solutions: 0
01-10-2018 08:14 PM | 1 Kudo
In large clusters, multiple services often share a single ZooKeeper quorum and maintain their state stores as znodes. The znode count is therefore directly proportional to the number of services deployed and to the activity on the cluster.
If LLAP applications are deployed in such clusters, Slider must be enabled (by setting the property "hadoop.registry.rm.enabled"), and this introduces the overhead of registry (znode) scans for all the application containers that are created and destroyed on a regular basis. If the property is set in core-site.xml or yarn-site.xml, the YARN ResourceManager behaves as follows:
1. On startup: create the initial root paths of /, /services and /users. On a secure cluster, access will be restricted to the system accounts (see below).
2. When a user submits a job: create the user path under /users.
3. When a container is completed: delete from the registry all service records with a yarn:persistence field of value container, and a yarn:id field whose value matches the ID of the completed container.
4. When an application attempt is completed: remove all service records with yarn:persistence set to application-attempt and yarn:id set to the application attempt ID.
5. When an application finishes: remove all service records with yarn:persistence set to application and yarn:id set to the application ID.
Ref: Registry scan
Hence, this leads to a registry scan across all znodes, not just the rmstore znode. In other words, even if there are only a few thousand (<10K) applications under /rmstore (/rmstore-secure), the scan starts from the root level (/). If the count of znodes under the root exceeds the 10K limit, the registry scan causes connectivity issues between ZooKeeper and the ResourceManager, which lead to timeouts, RM failover, and instability. This is addressed in the Apache JIRA below.
ROOT CAUSE: https://issues.apache.org/jira/browse/YARN-6136
RESOLUTION: A change in the ZooKeeper scan behavior, tracked in the JIRA above.
WORKAROUND:
1. If LLAP (Slider) is not used: disable hadoop.registry.rm.enabled.
2. If LLAP (Slider) is used:
i) If only LLAP uses Slider and no other service uses the same ZooKeeper cluster, the only way to reduce ZooKeeper load is to lower yarn.resourcemanager.state-store.max-completed-applications to 3000.
ii) If other services use the ZooKeeper quorum, please reach out to HWX support.
A sketch of the workaround settings is shown below.
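For illustration only, a minimal sketch of how the two workaround properties might look (assuming they are managed in core-site.xml/yarn-site.xml, for example through Ambari; pick the one matching your scenario and adjust the values to your cluster):
<property>
  <name>hadoop.registry.rm.enabled</name>
  <value>false</value> <!-- scenario 1: LLAP/Slider is not used -->
</property>
<property>
  <name>yarn.resourcemanager.state-store.max-completed-applications</name>
  <value>3000</value> <!-- scenario 2.i: LLAP/Slider is used and only LLAP uses this ZooKeeper -->
</property>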
09-14-2017 09:28 PM
It is expected that in large clusters, where the node count runs to a few hundred, the master services tend to be busy. One such master service is the NameNode. Some of its critical activities include: 1. Serving client requests, which includes verifying permissions and performing authorization checks for HDFS resources. 2. Processing the constant block reports from all the DataNodes. 3. Updating the service and audit logs. In situations where a rogue application tries to access many HDFS resources, or a data ingestion job loads high data volumes, the NameNode becomes very busy. In clusters like these the NameNode FsImage tends to be several GB in size, so operations such as checkpointing consume considerable bandwidth between the two NameNodes. The high volume of edits syncing, combined with logging, can then cause high disk utilization, which can lead to NameNode instability. Hence, it is recommended to have dedicated disks for the service logs and the edit logs. Disk I/O can be monitored using the `iostat` output, as sketched below.
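For illustration, a minimal way to sample extended disk statistics (the interval and count values are arbitrary examples):
# iostat -x 5 3
Watch the await and %util columns for the disks hosting the NameNode edits directory and the log directory; sustained %util close to 100% indicates the disk is saturated.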
08-08-2017 12:49 AM | 3 Kudos
This article compares the recovery time for accidentally deleted data in HDFS under two scenarios:
1. When trash is enabled.
2. When snapshots are enabled.
Data recovery from trash:
When data is deleted from HDFS, the HDFS metadata is updated to remove the file from its source folder. However, the blocks on the DataNodes are not immediately deleted. Instead, the file, along with the directory it was deleted from, is placed in the user's .Trash folder, from where the deleted data can be recovered.
Example: 1. Existing data in HDFS. #hadoop fs -ls /tmp/test1.txt
-rw-r--r-- 3 hdfs hdfs 4 2017-08-07 23:47 /tmp/test1.txt
2. Delete the data in HDFS. #hadoop fs -rm /tmp/test1.txt
17/08/07 23:52:13 INFO fs.TrashPolicyDefault: Moved: 'hdfs://vnn/tmp/test1.txt' to trash at: hdfs://vnn/user/hdfs/.Trash/Current/tmp/test1.txt
3. Recover the deleted data from trash. #hadoop fs -cp /user/hdfs/.Trash/Current/tmp/test1.txt /tmp/
#hadoop fs -ls /tmp/test1.txt
-rw-r--r-- 3 hdfs hdfs 4 2017-08-07 23:57 /tmp/test1.txt
Data recovery from snapshots: Snapshots are read-only, point-in-time copies of the HDFS file system. Enabling snapshots on a directory allows recovery from accidental data loss. 1. Enable snapshots on the directory. #hdfs dfsadmin -allowSnapshot /tmp/snapshotdir
Allowing snaphot on /tmp/snapshotdir succeeded
2. Create snapshot for a directory. #hdfs dfs -createSnapshot /tmp/snapshotdir
Created snapshot /tmp/snapshotdir/.snapshot/s20170807-180139.568
3. Contents of the snapshot-enabled folder. #hdfs dfs -ls /tmp/snapshotdir/
Found 3 items
-rw-r--r-- 3 hdfs hdfs 1083492818 2017-07-31 19:01 /tmp/snapshotdir/oneGB.csv
-rw-r--r-- 3 hdfs hdfs 10722068505 2017-08-02 17:19 /tmp/snapshotdir/tenGB.csv
#hdfs dfs -ls /tmp/snapshotdir/.snapshot/s20170807-180139.568
Found 3 items
-rw-r--r-- 3 hdfs hdfs 1083492818 2017-07-31 19:01 /tmp/snapshotdir/.snapshot/s20170807-180139.568/oneGB.csv
-rw-r--r-- 3 hdfs hdfs 10722068505 2017-08-02 17:19 /tmp/snapshotdir/.snapshot/s20170807-180139.568/tenGB.csv
4. Delete and recover the lost data. #hadoop fs -rm /tmp/snapshotdir/oneGB.csv
17/08/07 19:37:46 INFO fs.TrashPolicyDefault: Moved: 'hdfs://vinodnn/tmp/snapshotdir/oneGB.csv' to trash at: hdfs://vinodnn/user/hdfs/.Trash/Current/tmp/snapshotdir/oneGB.csv1502134666492
#hadoop fs -cp /tmp/snapshotdir/.snapshot/s20170807-180139.568/oneGB.csv /tmp/snapshotdir/
In the methods above, the hadoop copy command "hadoop fs -cp <source> <dest>" is used to recover the data. However, the time taken by the "cp" operation increases as the size of the lost data increases. One optimization is to use the move command, "hadoop fs -mv <source> <destination>", in place of the copy operation, since the former performs better than the latter. Because snapshot folders are read-only, the only supported recovery operation from a snapshot is "copy" (not move). The following metrics compare the performance of the "copy" operation against "move" for a one GB and a ten GB data file. Time to recover a file using copy (cp) operations: screen-shot-2017-08-07-at-60552-pm.png Time to recover a file using move (mv) operations: screen-shot-2017-08-07-at-60602-pm.png Hence, we observe that recovering data from trash combined with a move operation is, in certain cases, the more efficient way to tackle accidental data loss. NOTE: Recovering data from trash is only possible if the trash interval (fs.trash.interval) is configured to give Hadoop admins enough time to detect the data loss and recover it. If not, snapshots are recommended for eventual recovery.
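For illustration, a minimal sketch of a move-based recovery from trash (the trash path follows the earlier delete example and is an assumption; verify the actual path under .Trash first):
#hadoop fs -mv /user/hdfs/.Trash/Current/tmp/snapshotdir/oneGB.csv1502134666492 /tmp/snapshotdir/oneGB.csv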
05-26-2017 12:58 AM | 1 Kudo
ISSUE: While configuring NFS mounts to access HDFS as part of the local file system, access is typically controlled using NFS proxy users, as shown below: <property>
<name>hadoop.proxyuser.nfsserver.groups</name>
<value>nfs-users1,nfs-users2</value>
<description>
The 'nfsserver' user is allowed to proxy all members of the
'nfs-users1' and 'nfs-users2' groups. Set this to '*' to allow
nfsserver user to proxy any group.
</description>
</property>
<property>
<name>hadoop.proxyuser.nfsserver.hosts</name>
<value>nfs-client-host1.com</value>
<description>
This is the host where the nfs gateway is running. Set this to
'*' to allow requests from any hosts to be proxied.
</description>
</property> However, a user who has access to the NFS server is still able to access (view) the HDFS file system even if they are not part of "hadoop.proxyuser.nfsserver.groups" and "hadoop.proxyuser.nfsserver.hosts". This may be a security flaw in certain scenarios. ROOT CAUSE: This is due to the property "nfs.exports.allowed.hosts", which controls which hosts are allowed to access HDFS through the NFS gateway. RESOLUTION: Make sure the desired hosts and permissions are assigned for HDFS access. Permissions for the property can be defined as below, <property>
<name>nfs.exports.allowed.hosts</name>
<value>* rw</value>
</property> NOTE: An NFS gateway restart may be needed if the property is altered. Links: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html#Allow_mounts_from_unprivileged_clients
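For illustration, a more restrictive sketch of the same property (the host name and subnet are placeholders; multiple entries are typically separated by ';', which should be verified against the linked documentation):
<property>
  <name>nfs.exports.allowed.hosts</name>
  <value>nfs-client-host1.com rw ; 192.168.0.0/22 ro</value>
</property>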
05-22-2017 06:52 PM | 1 Kudo
Apart from checking topologies in the Storm Web UI, we can also list the active topologies from one of the cluster nodes, using the following command: /usr/hdp/<HDP-version>/storm/bin/storm list If there are no topologies running, the output is: No topologies running.
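For example (assuming the HDP "current" symlink is in place on the node; otherwise substitute the concrete HDP version in the path above):
# /usr/hdp/current/storm-client/bin/storm list
When topologies are running, the output lists each topology name along with its status, number of tasks, number of workers, and uptime.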
04-07-2017 10:54 PM | 1 Kudo
ISSUE: Java Heap Space issue in Hive MR engine
While working on a sample data set in Hive, a query such as "select count(*)" was seen to fail with the error below. Starting Job = job_1491603076412_0001, Tracking URL = http://krishna3.openstacklocal:8088/proxy/application_1491603076412_0001/
Kill Command = /usr/hdp/2.4.2.0-258/hadoop/bin/hadoop job -kill job_1491603076412_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-04-07 22:18:09,736 Stage-1 map = 0%, reduce = 0%
2017-04-07 22:18:46,065 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1491603076412_0001 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1491603076412_0001_m_000000 (and more) from job job_1491603076412_0001
Task with the most failures(4):
-----
Task ID:
task_1491603076412_0001_m_000000
URL:
http://krishna3.openstacklocal:8088/taskdetails.jsp?jobid=job_1491603076412_0001&tipid=task_1491603076412_0001_m_000000
-----
Diagnostic Messages for this Task:
Error: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
Checking the corresponding application logs, we observe the following: 2017-04-07 22:25:40,828 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:986)
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:442)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162) ROOT CAUSE: Insufficient heap space in the MR engine for mapreduce.map.memory.mb. RESOLUTION: Increasing mapreduce.map.memory.mb from 1.2 GB to 1.75 GB, correspondingly increasing mapreduce.task.io.sort.mb to 1003 and mapreduce.map.java.opts to -Xmx1433m, and restarting the necessary services resolved the problem. (NOTE: The mapreduce.task.io.sort.mb and mapreduce.map.java.opts value recommendations were made by Ambari.)
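For a quick test before changing the cluster-wide defaults, the equivalent settings can be applied per Hive session. This is only a sketch; 1792 MB is assumed here as the MB equivalent of 1.75 GB:
hive> set mapreduce.map.memory.mb=1792;
hive> set mapreduce.map.java.opts=-Xmx1433m;
hive> set mapreduce.task.io.sort.mb=1003;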
03-30-2017 06:52 AM
ISSUE: FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
kafka.common.KafkaException: Failed to acquire lock on file .lock in /opt/kafka/log. A Kafka instance in another process or thread is using this directory.
at kafka.log.LogManager$anonfun$lockLogDirs$1.apply(LogManager.scala:98)
at kafka.log.LogManager$anonfun$lockLogDirs$1.apply(LogManager.scala:95)
at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at kafka.log.LogManager.lockLogDirs(LogManager.scala:95)
at kafka.log.LogManager.<init>(LogManager.scala:57)
at kafka.server.KafkaServer.createLogManager(KafkaServer.scala:589)
at kafka.server.KafkaServer.startup(KafkaServer.scala:171)
at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
at kafka.Kafka$.main(Kafka.scala:67)
at kafka.Kafka.main(Kafka.scala) [2017-03-30 06:32:06,504] INFO shutting down (kafka.server.KafkaServer) RESOLUTION: Confirm that no other Kafka broker process on the node is actually using the log directory (a stale .lock file is typically left behind after an unclean shutdown), then move the .lock file from /opt/kafka/log to a temporary location and restart the Kafka broker on the node, as sketched below.
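For illustration, a minimal sketch of the workaround (the temporary location is arbitrary; the restart method depends on how the broker is managed, for example Ambari):
# ps -ef | grep -i kafka        # confirm no other broker instance holds the lock
# mv /opt/kafka/log/.lock /tmp/kafka-log.lock.bak
# restart the Kafka broker on the node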
03-21-2017 05:28 PM | 1 Kudo
It is known that in a secure Kafka environment, we need to explicitly authorize a user/principal to read from or write to a Kafka topic, as shown below. Grant Read/Write Access to a Topic: To add the following ACL: "Principals user:bob and user:alice are allowed to perform Operation Read and Write on Topic Test-Topic from Host1 and Host2", run the CLI with the following options: bin/kafka-acls.sh --add --allow-principal User:bob --allow-principal User:alice --allow-host host1 --allow-host host2 --operation Read --operation Write --topic test-topic Grant Full Access to Topic, Cluster, and Consumer Group: To add ACLs to a topic, specify --topic <topic-name> as the resource option. Similarly, to add ACLs to a cluster, specify --cluster; to add ACLs to a consumer group, specify --consumer-group <group-name>. The following example grants full access for principal bob to topic test-topic and consumer group 10, across the cluster. Substitute your own values for the principal name, topic name, and group name. bin/kafka-acls.sh --topic test-topic --add --allow-principal user:bob --operation ALL --config /usr/hdp/current/kafka-broker/config/server.properties Ref: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.2/bk_secure-kafka-ambari/content/kafka-acl-examples.html ISSUE: Notice that the keyword "User"/"user" is inconsistent across the two commands above, which can lead to improper authorizations being applied. RESOLUTION: The keyword is case sensitive, and the expected casing depends on the version of Kafka in use: Apache Kafka 0.8.2 requires the keyword "User", while Apache Kafka 0.9 onwards requires the keyword "user" for authorizations.
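As an illustration of the casing difference, a sketch of a single Read grant written with the casing this article recommends for Kafka 0.9 onwards (the ZooKeeper address is a placeholder):
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zk-host1:2181 --add --allow-principal user:bob --operation Read --topic test-topic
Per the resolution above, the same command on Apache Kafka 0.8.2 would use "User:bob" instead.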
02-16-2017 02:58 AM
ISSUE:
We observed Knox gateway servers failing to return the load balancer URL after submitting WebHDFS commands.
The logs on the gateway server show entries such as the following: 2017-02-15 20:15:51,050 DEBUG hadoop.gateway (UrlRewriteProcessor.java:rewrite(157)) - Rewrote URL: http://<Gateway-server-hostname>:50075/webhdfs/v1/user/<username>/test2?op=CREATE&user.name=<username>&namenoderpcaddress=<NN-server-hostname>:8020&overwrite=false, direction: OUT via explicit rule: WEBHDFS/webhdfs/outbound/namenode/headers/location to URL: https://<Gateway-server-hostname>:8443/gateway/production/webhdfs/data/v1/webhdfs/v1/user/<username>/test2?_=AAAACAAAABAAAACwXrr5dePzjWo4CD7w6g—lwAqK25Z-yUGo9MJf3qOHlOPn-oZzMWN3qF17Me78ia7H00bVqhPCLVCZNEbeoRY9Sct1cEkfqtmuqyWnj5LI68GDrc7iKr9loQheBkXuceCE4nf-9zXLqE-m8CVtdQQyxMSQnxcZMaAIPesoLoWDWOVXAgGLzWeVMs—uafWTPORGv5KRqok61gSU_KtCt4_9Igcoa1RrpnFEDtusyUHD9osMP612VJdu4ggzJJaVtLdg_btQhxw20 CAUSE:
A corrupted deployment directory (/var/lib/knox/data/deployment) can lead to this issue. RESOLUTION: 1. Take a backup of the deployment directory. 2. Delete the directory contents. 3. Restart the Knox server (this recreates the contents). A sketch of the steps is shown below.
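For illustration, a minimal sketch of the steps (the backup location is arbitrary, the directory path follows the article, and the restart can also be done from Ambari):
# cp -rp /var/lib/knox/data/deployment /tmp/knox-deployment.bak
# rm -rf /var/lib/knox/data/deployment/*
# restart the Knox gateway; the deployment contents are regenerated on startup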