Member since: 05-09-2017
Posts: 107
Kudos Received: 7
Solutions: 6

My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3958 | 03-19-2020 01:30 PM
 | 20345 | 11-27-2019 08:22 AM
 | 10298 | 07-05-2019 08:21 AM
 | 17260 | 09-25-2018 12:09 PM
 | 6603 | 08-10-2018 07:46 AM
12-06-2018
01:33 PM
1 Kudo
Yes, we only tried deleting the out-of-sync partition, and it did not work. After a lot of research we decided to increase replica.lag.time.max.ms to 8 days, since a few replicas had been out of sync for roughly that long. This resolved our issue, although it took a few hours for the followers to fetch and replicate the 7 days of data. https://www.confluent.io/blog/hands-free-kafka-replication-a-lesson-in-operational-simplicity/ helped us understand ISRs.
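For reference, a minimal sketch of the change, assuming a plain server.properties-managed broker (on CDH the value is set through Cloudera Manager's Kafka configuration instead; the path here is illustrative):

```bash
# 8 days in milliseconds:
echo $((8 * 24 * 60 * 60 * 1000))   # 691200000

# Set replica.lag.time.max.ms on each broker, then restart the broker.
# /etc/kafka/server.properties is an assumed location, not a CDH-managed path.
grep -q '^replica.lag.time.max.ms=' /etc/kafka/server.properties \
  && sed -i 's/^replica.lag.time.max.ms=.*/replica.lag.time.max.ms=691200000/' /etc/kafka/server.properties \
  || echo 'replica.lag.time.max.ms=691200000' >> /etc/kafka/server.properties
```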
11-18-2018
05:38 AM
Thank you @bgooley, that's what I did and it worked. I appreciate your response.
11-16-2018
07:25 AM
2 Kudos
This was an issue with that consumer group in __consumer_offsets, and these are the steps we took to fix it.

On a single broker:

1) Generate a DumpLogSegments command for every __consumer_offsets log file:

find /kafka/data -name "*.log" | grep -i consumer | awk '{print "kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --print-data-log --files " $1}'

2) Run each generated command on that broker and grep for the consumer group "prod-abc-events", e.g.:

kafka-run-class kafka.tools.DumpLogSegments --deep-iteration --print-data-log --files /kafka/data/sdc/__consumer_offsets-24/00000000000000000000.log | grep -i 'prod-abc-events'

Repeat the steps above on all the brokers and make a list of every file that references 'prod-abc-events'. In our case we found three:

broker1: /kafka/data/sda/__consumer_offsets-24/00000000000000000000.log
broker2: /kafka/data/sdc/__consumer_offsets-24/00000000000000000000.log
broker3: /kafka/data/sdc/__consumer_offsets-24/00000000000000000000.log

We noticed that the .log file on broker1 differed in size and content from the other two. We backed up the file on broker1 and replaced it with the copy from broker2, and that resolved the issue. Most likely this happened when we ran kafka-reassign-partitions while the drives reached 99% and something broke in __consumer_offsets.
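For anyone repeating this, a hedged convenience loop that does steps 1 and 2 in a single pass on one broker (assumes the same /kafka/data layout as above):

```bash
# Scan every __consumer_offsets segment on this broker for the consumer group
# and print the files that mention it.
GROUP='prod-abc-events'
find /kafka/data -path '*__consumer_offsets*' -name '*.log' | while read -r f; do
  kafka-run-class kafka.tools.DumpLogSegments --deep-iteration \
    --print-data-log --files "$f" | grep -qi "$GROUP" && echo "MATCH: $f"
done
```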
11-16-2018
07:11 AM
1 Kudo
The /var/lib/kms-keytrustee/keytrustee/.keytrustee folder on both KMS hosts should match, and unfortunately in our cluster it did not. So if a key-create request went to one KMS host and the retrieval went to the other, the command failed.

KMS host 1:
[root@host]# md5sum /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg
fec74c82e3da7f04f2acd36a937072b5 /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg

KMS host 2:
[root@host]# md5sum /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg
88483e6a8ee1d245d3c83b740fd43683 /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg

We used the BDR tool to take a backup of the encrypted zones in the same cluster, purged all keys, and dropped all zones. Then we used rsync to sync /var/lib/kms-keytrustee/keytrustee/.keytrustee on both KMS hosts, recreated all keys and zones, and used BDR to restore the data from the backup. Everything looks good now!
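For reference, a minimal sketch of the sync-and-verify step; kms-host2 is a placeholder for the second KMS host. Run it from the host whose keyring you decided to keep, only after backing everything up and purging keys/dropping zones as described above:

```bash
# Push the keyring directory from this host to the other KMS host.
rsync -av --delete /var/lib/kms-keytrustee/keytrustee/.keytrustee/ \
  kms-host2:/var/lib/kms-keytrustee/keytrustee/.keytrustee/

# Verify the keyrings now match on both hosts:
md5sum /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg
ssh kms-host2 md5sum /var/lib/kms-keytrustee/keytrustee/.keytrustee/secring.gpg
```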
10-12-2018
09:40 AM
@desind, If none of your clients is breaking and everything looks healthy in Cloudera Manager, then it may not be necessary to dig deeper at this time. If you do want to, you could run a tcpdump on port 7183 on your CM host, let it run for a bit, then read the capture in Wireshark to track down which SSL handshakes are failing and which client is involved.
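For example, a minimal capture could look like this (the interface and output path are placeholders):

```bash
# Capture TLS traffic to the CM web server; stop with Ctrl-C after a few minutes.
tcpdump -i any -w /tmp/cm-7183.pcap port 7183
# Then open /tmp/cm-7183.pcap in Wireshark and look for handshake failures
# (e.g. filter on "ssl.alert_message") to identify the failing clients.
```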
09-25-2018
12:09 PM
I was able to resolve this issue by moving the user and group under one OU; most likely it cannot do a backward search. Thank you @bgooley, much appreciated. I learned a few things in this process.
09-18-2018
08:41 AM
4 Kudos
@desind, There are a few ways to enable DEBUG or TRACE, depending on what sort of problem you are attempting to troubleshoot.

(1) If CM won't start, or you do not have an idea which classes are involved, you can enable DEBUG or TRACE for the whole server. Warning: this can be very verbose, so it is likely going to be difficult to capture an event.

- Edit /usr/sbin/cmf-server in CM 5, or /opt/cloudera/cm/bin/cm-server in CM 6
- Change this:
export CMF_ROOT_LOGGER="INFO,LOGFILE"
to:
export CMF_ROOT_LOGGER="DEBUG,LOGFILE"
- Restart CM to apply the change.

(2) If you know which class or package you want to DEBUG, you can edit /etc/cloudera-scm-server/log4j.properties. Add lines as follows; this example turns on tracing for just the LDAP classes in SpringFramework (used in LDAP authentication):

log4j.logger.org.springframework.ldap=TRACE
log4j.logger.org.springframework.security.ldap=TRACE

Restart CM to apply the changes.

(3) If you want to turn on debug- or trace-level logging for just the current session of Cloudera Manager, you can use the debug page:

https://cm_host:cm_port/cmf/debug/logLevel

- Choose the Logger from the drop-down
- Select the level to which you want to change the logging
- Click the "Submit Query" button to apply

The log level you selected will only apply until you restart Cloudera Manager.

(4) API debugging. You can enable API debugging in the Cloudera Manager interface:

- Navigate to Administration --> Settings
- Search for "Enable Debugging of API"
- Check the box next to it and Save

API debugging will be written to /var/log/cloudera-scm-server/cloudera-scm-server.log without a restart.

(5) NOTE: If you do enable verbose debugging, you may need to increase the size or number of log files to be able to review the relevant lines. To do so, I believe you can simply edit the following in /etc/cloudera-scm-server/log4j.properties:

log4j.appender.LOGFILE.MaxFileSize=10MB
log4j.appender.LOGFILE.MaxBackupIndex=10
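For convenience, a hedged one-shot sketch of edits (2) and (5) together. The MaxFileSize/MaxBackupIndex values below are just example increases, and the restart command assumes a CM 5 style init service; adjust both to your environment:

```bash
# Append the LDAP trace loggers and larger rotation settings to the CM server
# log4j config (log4j uses the last value it sees for a duplicated key),
# then restart the server so the changes take effect.
cat >> /etc/cloudera-scm-server/log4j.properties <<'EOF'
log4j.logger.org.springframework.ldap=TRACE
log4j.logger.org.springframework.security.ldap=TRACE
log4j.appender.LOGFILE.MaxFileSize=100MB
log4j.appender.LOGFILE.MaxBackupIndex=20
EOF
service cloudera-scm-server restart
```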
09-14-2018
05:57 PM
@desind, No limit that I know of on the CM side. Please start a new thread and provide your LDAP configuration, what happens in the logs, and the LDIF entry for the "abc_efg_scd_dfc" user. There are lots of possible reasons for failures, so it is important we start with what you observe and the items involved.
07-27-2018
10:55 AM
I am seeing a similar issue with Service Monitor and Host Monitor when using Red Hat 6.8 (Santiago); CM/CDH is 5.11.1. After adding JAVA_TOOL_OPTIONS=-Xss2m to the Host Monitor and Service Monitor configuration, it works fine. Is this a known issue with Red Hat 6.7 as well? (The link you mentioned is for CentOS, and it's 6.9.)
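For anyone hitting the same thing, this is what the setting amounts to (in CM it belongs in the Host Monitor / Service Monitor environment safety valve, shown here as a plain environment variable):

```bash
# Bump the JVM thread stack size to 2 MB for the monitor processes.
export JAVA_TOOL_OPTIONS=-Xss2m
```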
07-18-2018
01:26 AM
OK, I understand your point, but what if the mappers are failing? YARN already sets up as many mappers as there are files; should I increase this further? Since only a minority of my jobs are failing, how can I tune YARN to use more mappers for these particular jobs?