Member since: 08-08-2017
Posts: 1652
Kudos Received: 30
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 1998 | 06-15-2020 05:23 AM |
| | 16446 | 01-30-2020 08:04 PM |
| | 2146 | 07-07-2019 09:06 PM |
| | 8340 | 01-27-2018 10:17 PM |
| | 4728 | 12-31-2017 10:12 PM |
02-03-2020
01:28 PM
1 Kudo
Dear Jay, what can I say: excellent answer. You are really one of the best here.
01-31-2020
09:06 AM
@mike_bronson7 A Kafka broker needs at least the following number of file descriptors just to track log segment files:
(number of partitions) * (partition size / segment size)
You can review the current limits with:
cat /proc/<kafka_pid>/limits
If you want to change them and you are using the Ambari console, go to Kafka > config and search for "kafka_user_nofile_limit".
Finally, to see the open file descriptors, run:
lsof -p KAFKA_BROKER_PID
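As a rough illustration of the formula above, here is a minimal Python sketch. The function name and the sample numbers (2000 partitions, 50 GiB per partition, 1 GiB segments) are made up for the example, and rounding the segment count up is my own assumption, since a partially filled segment is still a file.

```python
import math

GIB = 1024 ** 3


def min_segment_file_descriptors(num_partitions, partition_size_bytes, segment_size_bytes):
    """Lower bound from the post: partitions * (partition size / segment size).

    The segment count per partition is rounded up (an assumption, not from the post).
    """
    if segment_size_bytes <= 0:
        raise ValueError("segment size must be positive")
    segments_per_partition = math.ceil(partition_size_bytes / segment_size_bytes)
    return num_partitions * segments_per_partition


# Hypothetical cluster numbers, only for illustration.
estimate = min_segment_file_descriptors(2000, 50 * GIB, 1 * GIB)
print(estimate)  # 100000
```

Compare the result with the "Max open files" row in /proc/<kafka_pid>/limits; if the estimate is close to or above the limit, the limit probably needs raising.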
01-30-2020
08:04 PM
Dear Jay, we finally found the issue: it was a mistake in the /etc/hosts file. Instead of 127.0.0.1, the IP address was 27.0.0.1. We fixed it and restarted PostgreSQL and Ambari, and now everything is fine.
01-28-2020
02:00 PM
Jay - can you help me with this post - https://community.cloudera.com/t5/Support-Questions/how-to-recover-bad-namenode-from-good-namenode/td-p/288471
01-22-2020
06:06 AM
SHORT
Cloudera has broken ZooKeeper 3.4.5-cdh5.4.0 in several places. The service works but the CLI is dead. There is no workaround other than a rollback.
LONG
Assign a bounty on this ;-). I stepped on this mine too and was angry enough to find the reason.
ZooKeeper checks for JLine during ZooKeeperMain.run(). There is a try-catch block that loads a number of classes. Any exception during class loading fails the whole block, and JLine support is reported as disabled.
Here is why this happens with CDH 5.4.0: the current open-source ZooKeeper 3.4.6 works against jline-0.9.94 and has no such issue. In CDH 5.4, Cloudera applied the following patch:
roman@node4:$ diff zookeeper-3.4.5-cdh5.3.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java zookeeper-3.4.5-cdh5.4.0/src/java/main/org/apache/zookeeper/ZooKeeperMain.java
305,306c305,306
< Class consoleC = Class.forName("jline.ConsoleReader");
< Class completorC =
---
> Class consoleC = Class.forName("jline.ConsoleReader");
> Class completorC =
316,317c316,317
< Method addCompletor = consoleC.getMethod("addCompletor",
< Class.forName("jline.Completor"));
---
> Method addCompletor = consoleC.getMethod("addCompleter",
> Class.forName("jline.console.completer.Completer"));
CDH 5.4 uses jline-2.11.jar for ZooKeeper, and that jar has no jline.ConsoleReader class (in 2.11 it is jline.console.ConsoleReader). JLine 0.9.94, in turn, has no jline.console.completer.Completer. So the patched code is incompatible with every existing JLine version. Any Cloudera CDH 5.4 user can run zookeeper-client on their cluster and see that it does not work.
Open-source zookeeper-3.4.6 depends on jline-0.9.94 and has no such patches. I don't know why Cloudera engineers planted such a mine. I see no clean way to fix it with 3.4.5-cdh5.4.0. I stayed with the 3.4.5-cdh5.3.3 dependency wherever I need the CLI and have production clusters.
I thought that putting both jline-0.9.94.jar and jline-2.11.jar on ZooKeeper's classpath would fix the problem. But I have just found that Cloudera made another 'fix' in ZK for CDH 5.4.0: they renamed the org.apache.zookeeper.JLineZNodeCompletor class to org.apache.zookeeper.JLineZNodeCompleter. Yet here is the code from ZooKeeperMain.java:
Class<?> completorC = Class.forName("org.apache.zookeeper.JLineZNodeCompletor");
And of course this means that, in practice, it is not possible to start the ZK CLI in CDH 5.4.0 the proper way. Awful work. 😞
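To make the all-or-nothing class loading easier to follow, here is a small Python model of it. This is only a sketch of the reasoning, not ZooKeeper code: the sets below simply list the class names mentioned in this post for each jar, and the function names are invented for the illustration.

```python
# Class names as described in the post; each set models what one jar provides.
JLINE_0_9_94 = {"jline.ConsoleReader", "jline.Completor"}
JLINE_2_11 = {"jline.console.ConsoleReader", "jline.console.completer.Completer"}
ZK_CDH_5_4_0 = {"org.apache.zookeeper.JLineZNodeCompleter"}  # the renamed class

# Names the patched ZooKeeperMain.java looks up, per the diff and the quoted line.
REQUIRED = [
    "jline.ConsoleReader",
    "jline.console.completer.Completer",
    "org.apache.zookeeper.JLineZNodeCompletor",
]


def missing_classes(classpath):
    """Return the required names that the modelled classpath cannot provide."""
    return [name for name in REQUIRED if name not in classpath]


def jline_enabled(classpath):
    """Mimic the single try-catch block: one failed lookup disables JLine."""
    return not missing_classes(classpath)


for label, classpath in [
    ("jline 0.9.94 only", ZK_CDH_5_4_0 | JLINE_0_9_94),
    ("jline 2.11 only", ZK_CDH_5_4_0 | JLINE_2_11),
    ("both jline jars", ZK_CDH_5_4_0 | JLINE_0_9_94 | JLINE_2_11),
]:
    print(label, jline_enabled(classpath), missing_classes(classpath))
```

In this model, even with both jars on the classpath the lookup of org.apache.zookeeper.JLineZNodeCompletor still fails, which matches the conclusion above.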
01-12-2020
12:56 PM
1 Kudo
@mike_bronson7 When your cluster is in HA, it uses a nameservice: a logical name that lets clients follow the switch from active to standby and vice versa. hdfs-site.xml holds these values. Filter on dfs.nameservices; the nameservice ID should be your namespace. In HA you can also look for dfs.ha.namenodes.[nameservice ID], e.g. dfs.ha.namenodes.mycluster.
That is the value to set, e.g. hdfs://mycluster_namespace/user/ams/hbase
Then refresh the stale configs; HBase should now be sending the metrics to that directory.
HTH
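If it helps, the lookup can be sketched in Python. This is only an illustration: the embedded XML is a made-up sample (nameservice "mycluster", NameNode IDs nn1 and nn2), the function names are hypothetical, and on a real cluster you would read your own hdfs-site.xml instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; replace with the contents of your own hdfs-site.xml.
SAMPLE_HDFS_SITE = """<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
</configuration>"""


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def nameservice_uri(xml_text, path="/user/ams/hbase"):
    """Build hdfs://<nameservice><path> from the first configured nameservice."""
    props = read_properties(xml_text)
    nameservice = props["dfs.nameservices"].split(",")[0].strip()
    return "hdfs://" + nameservice + path


print(nameservice_uri(SAMPLE_HDFS_SITE))  # hdfs://mycluster/user/ams/hbase
```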
01-07-2020
01:41 AM
Hi, We understand that the logs are not getting deleted even though you enabled the spark.history.fs properties. Did you find any errors in the SHS logs regarding this? Thanks AKR
01-05-2020
05:39 AM
Hi, if there is a large number of files in the Spark History Server location, the FS-based cleanup may not work as expected. In that case, you may need to write a script that deletes files older than 7 days (or whatever your requirement is) from the Spark History Server location. Thanks Arun
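A minimal sketch of such a script in Python is below. It assumes the event logs sit on a local or mounted filesystem path; if they are on HDFS, the same age check would have to be done with HDFS tooling instead. The function names and the dry-run default are my own choices for the example.

```python
import os
import time


def find_old_files(directory, max_age_days=7, now=None):
    """Return files directly under `directory` last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    old = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            old.append(entry.path)
    return sorted(old)


def delete_old_files(directory, max_age_days=7, dry_run=True):
    """Delete the old files; with dry_run=True only report what would be removed."""
    paths = find_old_files(directory, max_age_days)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Run it with dry_run=True first and check the returned list before letting it delete anything.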
12-08-2019
12:37 PM
2 Kudos
@mike_bronson7
1. You can get the list of Kafka broker hosts (hostnames) using the following API call:
# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X GET "http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/services/KAFKA/components/KAFKA_BROKER?fields=host_components/HostRoles/host_name"
2. Once you know the hostname (for example 'kafkabroker5.example.com') on which you want to stop/start the Kafka broker, you can try the following:
A. To stop the Kafka broker on host 'kafkabroker5.example.com':
# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Stop Kafka Broker","operation_level":{"level":"HOST_COMPONENT","cluster_name":"TestCluster","host_name":"kafkabroker5.example.com","service_name":"KAFKA"}},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/hosts/kafkabroker5.example.com/host_components/KAFKA_BROKER
B. To start the Kafka broker on host 'kafkabroker5.example.com':
# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Start Kafka Broker","operation_level":{"level":"HOST_COMPONENT","cluster_name":"TestCluster","host_name":"kafkabroker5.example.com","service_name":"KAFKA"}},"Body":{"HostRoles":{"state":"STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/hosts/kafkabroker5.example.com/host_components/KAFKA_BROKER
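If you want to script the stop/start calls for several brokers, the URL and JSON body from the curl commands above can be generated in Python. This sketch only builds the request and sends nothing; the function name and the Ambari hostname in the usage line are hypothetical, and port 8080 is taken from the commands above.

```python
import json


def broker_state_request(ambari_host, cluster, broker_host, state):
    """Build the URL and JSON body for the PUT shown in the curl commands.

    state must be "INSTALLED" (stop) or "STARTED" (start).
    """
    if state not in ("INSTALLED", "STARTED"):
        raise ValueError("state must be INSTALLED or STARTED")
    action = "Stop" if state == "INSTALLED" else "Start"
    url = "http://{}:8080/api/v1/clusters/{}/hosts/{}/host_components/KAFKA_BROKER".format(
        ambari_host, cluster, broker_host
    )
    body = {
        "RequestInfo": {
            "context": action + " Kafka Broker",
            "operation_level": {
                "level": "HOST_COMPONENT",
                "cluster_name": cluster,
                "host_name": broker_host,
                "service_name": "KAFKA",
            },
        },
        "Body": {"HostRoles": {"state": state}},
    }
    return url, json.dumps(body)


# Hypothetical Ambari host; cluster and broker names are the ones used above.
url, payload = broker_state_request(
    "ambari.example.com", "TestCluster", "kafkabroker5.example.com", "INSTALLED"
)
print(url)
print(payload)
```

The payload can then be sent as a PUT with the same credentials and the "X-Requested-By: ambari" header as in the curl commands.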
12-03-2019
12:41 PM
1 Kudo
@mike_bronson7 Yes, you are right: if the alert state is "OK", it usually means the service is running well. If it is WARNING or CRITICAL, we need to look at the alert text and the alert host to find out why, and on which host, the alert is in that state.
Basically, the Kafka "host" where the alert was triggered, the "state" of the alert (CRITICAL, OK, WARNING) and the alert "text" are usually the important parts of an alert and give us a good idea of what is happening. So you can capture that selected output using:
# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://$AMBARI_HOST:8080/api/v1/clusters/NewCluster/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA"
Example output:
# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://newhwx1.example.com:8080/api/v1/clusters/$CLUSTER_NAME/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA"
{
  "href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA",
  "items" : [
    {
      "href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts/704",
      "Alert" : {
        "cluster_name" : "NewCluster",
        "definition_id" : 401,
        "definition_name" : "kafka_broker_process",
        "host_name" : "newhwx3.example.com",
        "id" : 704,
        "service_name" : "KAFKA",
        "state" : "OK",
        "text" : "TCP OK - 0.000s response on port 6667"
      }
    },
    {
      "href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts/1201",
      "Alert" : {
        "cluster_name" : "NewCluster",
        "definition_id" : 401,
        "definition_name" : "kafka_broker_process",
        "host_name" : "newhwx5.example.com",
        "id" : 1201,
        "service_name" : "KAFKA",
        "state" : "CRITICAL",
        "text" : "Connection failed: [Errno 111] Connection refused to newhwx5.example.com:6667"
      }
    }
  ]
}
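To pick out only the brokers that need attention, the JSON response can be filtered on the "state" field. Here is a minimal Python sketch; the sample string is a trimmed copy of the example output above, and the function name is made up for the illustration.

```python
import json

# Trimmed copy of the example output above (only the fields used below).
SAMPLE_RESPONSE = """
{
  "items": [
    {"Alert": {"host_name": "newhwx3.example.com", "state": "OK",
               "text": "TCP OK - 0.000s response on port 6667"}},
    {"Alert": {"host_name": "newhwx5.example.com", "state": "CRITICAL",
               "text": "Connection failed: [Errno 111] Connection refused to newhwx5.example.com:6667"}}
  ]
}
"""


def non_ok_alerts(response_text):
    """Return (host, state, text) for every alert whose state is not OK."""
    data = json.loads(response_text)
    result = []
    for item in data.get("items", []):
        alert = item["Alert"]
        if alert["state"] != "OK":
            result.append((alert["host_name"], alert["state"], alert["text"]))
    return result


for host, state, text in non_ok_alerts(SAMPLE_RESPONSE):
    print(host, state, text)
```

With the sample above this reports only newhwx5.example.com, the broker whose port 6667 refused the connection.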