Member since: 08-08-2017
Posts: 1652
Kudos Received: 30
Solutions: 11
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1550 | 06-15-2020 05:23 AM |
 | 10944 | 01-30-2020 08:04 PM |
 | 1696 | 07-07-2019 09:06 PM |
 | 7089 | 01-27-2018 10:17 PM |
 | 3971 | 12-31-2017 10:12 PM |
01-22-2020
06:06 AM
SHORT: Cloudera has broken ZooKeeper 3.4.5-cdh5.4.0 in several places. The service works, but the CLI is dead. There is no workaround other than a rollback.

LONG: Assign a bounty on this ;-). I stepped on this mine too and was angry enough to track down the reason.

ZooKeeper checks for JLine during ZooKeeperMain.run(). A try-catch block there loads a number of classes. Any exception during class loading fails the whole block, and JLine support is then reported as disabled.

Here is why this happens with CDH 5.4.0. The current open-source ZooKeeper 3.4.6 works against jline-0.9.94 and has no such issue. In CDH 5.4, Cloudera applied the following patch:

roman@node4:$ diff zookeeper-3.4.5-cdh5.3.3/src/java/main/org/apache/zookeeper/ZooKeeperMain.java zookeeper-3.4.5-cdh5.4.0/src/java/main/org/apache/zookeeper/ZooKeeperMain.java
305,306c305,306
< Class consoleC = Class.forName("jline.ConsoleReader");
< Class completorC =
---
> Class consoleC = Class.forName("jline.ConsoleReader");
> Class completorC =
316,317c316,317
< Method addCompletor = consoleC.getMethod("addCompletor",
< Class.forName("jline.Completor"));
---
> Method addCompletor = consoleC.getMethod("addCompleter",
> Class.forName("jline.console.completer.Completer"));
CDH 5.4 uses jline-2.11.jar for ZooKeeper, and that jar has no jline.ConsoleReader class (in 2.11 it is jline.console.ConsoleReader). JLine 0.9.94, in turn, has no jline.console.completer.Completer. So the patched code is incompatible with any existing JLine. Any Cloudera CDH 5.4 user can run zookeeper-client on their cluster and see that it does not work. Open-source zookeeper-3.4.6 depends on jline-0.9.94 and has no such patches. I don't know why Cloudera engineers planted such a mine.

I see no clean way to fix it with 3.4.5-cdh5.4.0. I stayed with the 3.4.5-cdh5.3.3 dependency where I need the CLI and have production clusters.

It seemed to me that putting both jline-0.9.94.jar and jline-2.11.jar on the ZooKeeper classpath would fix the problem. But I have just found that Cloudera made another 'fix' in ZK for CDH 5.4.0: they renamed the org.apache.zookeeper.JLineZNodeCompletor class to org.apache.zookeeper.JLineZNodeCompleter. Yet here is the code from ZooKeeperMain.java:

Class<?> completorC = Class.forName("org.apache.zookeeper.JLineZNodeCompletor");

And of course, this means that in practice there is no proper way to start the ZK CLI in CDH 5.4.0. Awful work. 😞
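The class-name mismatch described above can be checked without starting ZooKeeper, by listing the entries of a JLine jar. This is a minimal Python sketch, not part of the original post: the function name `classes_in_jar` is hypothetical, the class names are the ones quoted in the post, and it only assumes that a jar is an ordinary zip archive with one `.class` entry per class.

```python
import zipfile

# Class names quoted in the post. Which JLine release ships which class
# is taken from the post, not verified here.
PROBED_CLASSES = [
    "jline.ConsoleReader",                # looked up by ZooKeeperMain
    "jline.Completor",                    # looked up by the cdh5.3.3 code
    "jline.console.ConsoleReader",        # location reported for jline-2.11
    "jline.console.completer.Completer",  # looked up by the cdh5.4.0 code
]


def classes_in_jar(jar_path, class_names=PROBED_CLASSES):
    """Return {class_name: bool} saying which classes the jar contains."""
    with zipfile.ZipFile(jar_path) as jar:
        entries = set(jar.namelist())
    # A class a.b.C is stored in the archive as a/b/C.class
    return {
        name: (name.replace(".", "/") + ".class") in entries
        for name in class_names
    }
```

Running `classes_in_jar` against each JLine jar on the classpath shows which of the `Class.forName` lookups can succeed. If no single jar answers `True` for every class the code loads, the try-catch block will always fail.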
01-12-2020
12:56 PM
1 Kudo
@mike_bronson7 When your cluster is in HA, it uses a namespace (nameservice) that acts much like a load balancer in front of the NameNodes, to facilitate the switch from active to standby and vice versa. hdfs-site.xml holds these values. Filter on dfs.nameservices: the nameservice ID it returns should be your namespace. In HA you can also look for dfs.ha.namenodes.[nameservice ID], e.g. dfs.ha.namenodes.mycluster. The nameservice ID is the value to set, e.g. hdfs://mycluster_namespace/user/ams/hbase. Then refresh the stale configs; HBase should now be sending the metrics to that directory. HTH
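The lookup described above can be sketched in Python. This is only an illustration under assumptions: the function names are hypothetical, the XML is expected in the usual Hadoop `<configuration><property>` layout, and the first nameservice listed in dfs.nameservices is taken as the one to use.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse Hadoop-style <configuration><property> XML into a dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def ams_hbase_rootdir(props, path="/user/ams/hbase"):
    """Build hdfs://<nameservice><path> from dfs.nameservices."""
    raw = props.get("dfs.nameservices", "")
    nameservices = [ns.strip() for ns in raw.split(",") if ns.strip()]
    if not nameservices:
        raise ValueError("dfs.nameservices is not set; is the cluster in HA?")
    # Assumption: the first listed nameservice is the one to use.
    return "hdfs://" + nameservices[0] + path
```

Given the text of hdfs-site.xml, `ams_hbase_rootdir(read_hadoop_properties(text))` returns the URI to set. The entry dfs.ha.namenodes.[nameservice ID] in the same dict lists the NameNode IDs behind that nameservice.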
01-07-2020
01:41 AM
Hi, We understand that the logs are not getting deleted even though you have enabled the spark.history.fs properties. Did you find any errors in the SHS logs regarding this? Thanks AKR
01-05-2020
05:39 AM
Hi, If a large number of files are present in the Spark History Server location, the spark.history.fs cleanup may not work as expected. In that case, we may need to write a script to delete the old files that are more than 7 days old (or as per your requirement) from the Spark History Server location. Thanks Arun
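A cleanup script of that kind could start from something like the sketch below. It is an assumption-laden illustration, not a tested production script: it assumes the history directory lives on HDFS, that the listing comes from `hdfs dfs -ls`, and that deletion is done separately (for example with `hdfs dfs -rm`) after reviewing the list. The function name `old_paths` is hypothetical.

```python
from datetime import datetime, timedelta


def old_paths(ls_output, now, max_age_days=7):
    """Return paths from `hdfs dfs -ls` output older than max_age_days.

    Each entry line is expected to have eight columns:
    permissions, replication, owner, group, size, date, time, path.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for line in ls_output.splitlines():
        fields = line.split(None, 7)
        # Skip the "Found N items" header and any other non-entry line.
        if len(fields) < 8:
            continue
        try:
            modified = datetime.strptime(
                fields[5] + " " + fields[6], "%Y-%m-%d %H:%M"
            )
        except ValueError:
            continue
        if modified < cutoff:
            stale.append(fields[7])
    return stale
```

The function only selects candidates; keeping selection and deletion apart makes it easy to print and check the list before anything is removed.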
12-08-2019
12:37 PM
2 Kudos
@mike_bronson7

1. You can get the list of Kafka Broker hosts (hostnames) using the following API call.

# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X GET "http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/services/KAFKA/components/KAFKA_BROKER?fields=host_components/HostRoles/host_name"

2. Once you know/decide the hostname (for example 'kafkabroker5.example.com') on which you want to stop/start the Kafka Broker, you can try the following:

A. To stop the Kafka Broker on host 'kafkabroker5.example.com'

# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Stop Kafka Broker","operation_level":{"level":"HOST_COMPONENT","cluster_name":"TestCluster","host_name":"kafkabroker5.example.com","service_name":"KAFKA"}},"Body":{"HostRoles":{"state":"INSTALLED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/hosts/kafkabroker5.example.com/host_components/KAFKA_BROKER

B. To start the Kafka Broker on host 'kafkabroker5.example.com'

# curl -iv -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo":{"context":"Start Kafka Broker","operation_level":{"level":"HOST_COMPONENT","cluster_name":"TestCluster","host_name":"kafkabroker5.example.com","service_name":"KAFKA"}},"Body":{"HostRoles":{"state":"STARTED"}}}' http://$AMBARI_HOST:8080/api/v1/clusters/TestCluster/hosts/kafkabroker5.example.com/host_components/KAFKA_BROKER
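If the stop/start calls need to be scripted for several brokers, the same PUT request can be assembled in Python. This is a sketch that mirrors the curl commands above. The function name `build_kafka_broker_request` is hypothetical, and the admin:admin credentials and port 8080 are simply the values used in the commands.

```python
import base64
import json
import urllib.request


def build_kafka_broker_request(ambari_host, cluster, broker_host, state,
                               user="admin", password="admin", port=8080):
    """Build the PUT request that stops (INSTALLED) or starts (STARTED)
    the KAFKA_BROKER component on one host."""
    if state not in ("INSTALLED", "STARTED"):
        raise ValueError("state must be INSTALLED (stop) or STARTED (start)")
    action = "Start" if state == "STARTED" else "Stop"
    payload = {
        "RequestInfo": {
            "context": action + " Kafka Broker",
            "operation_level": {
                "level": "HOST_COMPONENT",
                "cluster_name": cluster,
                "host_name": broker_host,
                "service_name": "KAFKA",
            },
        },
        "Body": {"HostRoles": {"state": state}},
    }
    url = "http://{}:{}/api/v1/clusters/{}/hosts/{}/host_components/KAFKA_BROKER".format(
        ambari_host, port, cluster, broker_host
    )
    # Same basic auth that `curl -u user:password` sends.
    token = base64.b64encode(
        "{}:{}".format(user, password).encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari", "Authorization": "Basic " + token},
        method="PUT",
    )
```

The function only builds the request. Sending it is a separate step, e.g. `urllib.request.urlopen(req)`, so the payload can be printed and checked before anything is stopped.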
12-03-2019
12:41 PM
1 Kudo
@mike_bronson7 Yes, you are right: if the alert state is "OK", it usually means the service is running well. If it is WARNING/CRITICAL, then we need to look at the alert text and the alert host to find out why, and on which host, the alert is in that state. Basically, the Kafka "host" where the alert was triggered, the "state" of the alert (CRITICAL, OK, WARNING) and the alert "text" are usually the important parts of an alert, and together they give a good idea of what is happening. So you can capture just those fields using:

# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://$AMBARI_HOST:8080/api/v1/clusters/NewCluster/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA"

Example output:

# curl -u admin:admin -H "X-Requested-By: ambari" -X GET "http://newhwx1.example.com:8080/api/v1/clusters/$CLUSTER_NAME/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA"
{
"href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts?fields=Alert/host_name,Alert/host_name,Alert/state,Alert/text&Alert/service_name=KAFKA",
"items" : [
{
"href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts/704",
"Alert" : {
"cluster_name" : "NewCluster",
"definition_id" : 401,
"definition_name" : "kafka_broker_process",
"host_name" : "newhwx3.example.com",
"id" : 704,
"service_name" : "KAFKA",
"state" : "OK",
"text" : "TCP OK - 0.000s response on port 6667"
}
},
{
"href" : "http://newhwx1.example.com:8080/api/v1/clusters/NewCluster/alerts/1201",
"Alert" : {
"cluster_name" : "NewCluster",
"definition_id" : 401,
"definition_name" : "kafka_broker_process",
"host_name" : "newhwx5.example.com",
"id" : 1201,
"service_name" : "KAFKA",
"state" : "CRITICAL",
"text" : "Connection failed: [Errno 111] Connection refused to newhwx5.example.com:6667"
}
}
]
}
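To act on a response like the one above, the JSON can be reduced to the alerts that are not OK. This is a small sketch: the function name `non_ok_alerts` is hypothetical, and it relies only on the "items", "Alert", "host_name", "state" and "text" keys visible in the example output.

```python
import json


def non_ok_alerts(response_text):
    """Return (host_name, state, text) for every alert whose state is not OK."""
    data = json.loads(response_text)
    problems = []
    for item in data.get("items", []):
        alert = item.get("Alert", {})
        if alert.get("state") != "OK":
            problems.append(
                (alert.get("host_name"), alert.get("state"), alert.get("text"))
            )
    return problems
```

Applied to the example output, this keeps only the CRITICAL entry for newhwx5.example.com, which is the host to look at first.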
12-02-2019
02:06 PM
@mike_bronson7 You can change the ownership of the HDFS directory to airflow:hadoop. Please do run the -chown command, but on / ??? It should be something like /users/airflow/xxx. Please let me know.
11-27-2019
07:30 AM
Dear Shelton, it is very, very strange that some application or other process deleted the files (there are 4 files, 2 of which hold the md5 checksums). What worries us most in this case is that we actually do not know the real reason why the files were deleted, and we can't go back and see what happened.
11-26-2019
02:09 PM
OK, so on second thought, I think it is better to use the latest supported version, 7.6, instead of 7.5. RHEL 7.6 can support HDP 2.6.4 and up (latest HDP version). RHEL 7.6 can support Ambari versions 2.6.2.2, 2.7.3 and 2.7.4. And when the current RHEL version is 7.2 and we want to add machines with RHEL 7.6, it is fully recommended to upgrade 7.2 to 7.6.