About snm1523

snm1523 · ‎09-09-2025

Hello,

I have noticed that one of the charts (HDFS IO) on the CM dashboard for HDFS, which runs the query below, pulls values from two separate entities for the same metric. Attached are three screenshots for reference: two graphs showing the values from the individual entities, and the combined graph available in CM that triggers the query.

select total_bytes_read_rate_across_datanodes, total_bytes_written_rate_across_datanodes where category = SERVICE and serviceType = HDFS and clusterId = "1"

The two entities are hdfs and hdfs:<cluster_name>.

I have also verified the response from the API call below. It does return both entities, and both have the latest data.

http://<cm-host>:7180/api/v19/timeseries?query=SELECT%20total_bytes_written_rate_across_datanodes%20WHERE%20category=SERVICE

What do these entities mean? Where are they coming from? How can this be fixed so that a single value is returned for the metric?

Kindly advise.

Thanks
Snm1523
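To compare the two entities outside the CM chart, the timeseries response can be grouped by entity name. The following is a minimal sketch, not a confirmed fix: it assumes the response follows the usual CM API layout (items -> timeSeries -> metadata/data), and the host name and the numbers in SAMPLE_RESPONSE are invented purely for illustration.

```python
from urllib.parse import urlencode


def build_timeseries_url(cm_host: str, tsquery: str, api_version: str = "v19") -> str:
    """Build the timeseries endpoint URL with a properly encoded tsquery."""
    return f"http://{cm_host}:7180/api/{api_version}/timeseries?{urlencode({'query': tsquery})}"


def latest_value_per_entity(response: dict) -> dict:
    """Map (metricName, entityName) to the last data point of each series."""
    latest = {}
    for item in response.get("items", []):
        for series in item.get("timeSeries", []):
            meta = series.get("metadata", {})
            points = series.get("data", [])
            if not points:
                continue  # skip series that returned no data points
            key = (meta.get("metricName"), meta.get("entityName"))
            latest[key] = points[-1]["value"]
    return latest


# Hypothetical response fragment; entity names mirror the post, values are made up.
SAMPLE_RESPONSE = {
    "items": [{
        "timeSeries": [
            {"metadata": {"metricName": "total_bytes_written_rate_across_datanodes",
                          "entityName": "hdfs"},
             "data": [{"value": 10.0}, {"value": 11.0}]},
            {"metadata": {"metricName": "total_bytes_written_rate_across_datanodes",
                          "entityName": "hdfs:cluster1"},
             "data": [{"value": 12.5}]},
        ]
    }]
}

URL = build_timeseries_url(
    "cm.example.com",  # hypothetical host
    "SELECT total_bytes_written_rate_across_datanodes WHERE category=SERVICE",
)
LATEST = latest_value_per_entity(SAMPLE_RESPONSE)
for (metric, entity), value in LATEST.items():
    print(f"{metric} [{entity}] = {value}")
```

Printing one line per entity makes it easy to see whether both series carry the same numbers or diverge, which is useful context when asking which entity the chart should use.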

snm1523 · ‎02-26-2025

Thank you @upadhyayk04. This helped.

snm1523 · ‎02-25-2025

Hi,

I am trying to prepare a report listing the client IP addresses that connect to various Cloudera services. I know this information is visible in Ranger Admin UI -> Audits -> Access. However, because of the huge number of records, it is not practical to copy each page from there. These are the options I can think of to fetch the data:

1. Ranger database: I am unsure which table in the Ranger DB stores this information, i.e. where the Ranger Admin UI gets the data from.
2. Solr: Our Ranger instance has Solr configured as the audit source. However, I am unable to query Solr via the Solr UI. It could be a permissions issue, since I get an authentication error when I try to fetch the data using the Solr API. I am unsure where to check the permissions: in the Solr policy in Ranger, I have already given myself all the permissions I could reasonably think of. Still no luck.

Please suggest.

Thanks
snm1523
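For the Solr option, a facet query can return one row per client IP instead of paging through every audit record. The sketch below only builds the request URL and parses a facet response; it does not handle authentication, so on a Kerberized cluster the request itself would still need SPNEGO credentials. The collection name ranger_audits, the field name cliIP, and the base URL are assumptions to verify against the actual Solr schema.

```python
from urllib.parse import urlencode


def build_client_ip_facet_url(solr_base_url: str,
                              collection: str = "ranger_audits",  # assumed collection name
                              ip_field: str = "cliIP") -> str:    # assumed field name
    """Build a Solr select URL that facets on the client IP field."""
    params = {
        "q": "*:*",
        "rows": 0,              # no documents, only facet counts
        "facet": "true",
        "facet.field": ip_field,
        "facet.limit": -1,      # return every distinct value
        "facet.mincount": 1,    # drop values with zero hits
        "wt": "json",
    }
    return f"{solr_base_url}/{collection}/select?{urlencode(params)}"


def parse_field_facet(response: dict, field: str) -> dict:
    """Turn Solr's flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical response fragment; the IPs and counts are invented.
SAMPLE_FACETS = {
    "facet_counts": {"facet_fields": {"cliIP": ["10.0.0.5", 120, "10.0.0.9", 7]}}
}

FACET_URL = build_client_ip_facet_url("http://solr.example.com:8983/solr")  # hypothetical host and port
IP_COUNTS = parse_field_facet(SAMPLE_FACETS, "cliIP")
for ip, count in IP_COUNTS.items():
    print(ip, count)
```

Adding an fq filter on the service or time range would narrow the report, but the available field names should be checked in the collection's schema first.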

snm1523 · ‎02-03-2025

For the benefit of anyone looking for a solution to this: we need to add the -javaagent parameter to the "Additional Broker Java Options" configuration property, instead of what is mentioned in the post.

snm1523 · ‎01-29-2025

Hello,

I have a JMX exporter configured for the Kafka brokers to send metrics to Prometheus, which allows me to monitor Kafka via Grafana dashboards. All is working okay, but I am stuck with a weird issue.

Background: The JMX exporter is configured for the Kafka brokers as follows. In Kafka Broker Environment Advanced Configuration Snippet (Safety Valve), I added the key KAFKA_JMX_OPTS with the value -javaagent:/var/lib/prometheus_jmx_config/jmx_prometheus_javaagent-0.20.0.jar=9091:/var/lib/prometheus_jmx_config/kafka_jmx_exporter.yml. The jar and yml files referenced in the configuration are in place and everything works: we are able to get the required metrics into Grafana via Prometheus.

Issue: A normal restart of the Kafka brokers completes happily. A rolling restart, however, fails with the error below in the stderr log file:

+ [[ healthy partitions stay healthy == *\p\a\r\t\i\t\i\o\n\s\ \s\t\a\y\ \h\e\a\l\t\h\y* ]]
+ call_kafka_topics --at-min-isr-partitions
+ option=--at-min-isr-partitions
+ get_property bootstrap.servers /var/run/cloudera-scm-agent/process/18315-kafka-KAFKA_BROKER-kafka_broker_rolling_restart_pre_check/rolling_restart_check_before_stop_admin_client_configs.properties BOOTSTRAP_SERVERS
++ dirname /opt/cloudera/cm-agent/service/common/cloudera-config.sh
+ GET_PROPERTY_PY_DIR=/opt/cloudera/cm-agent/service/common
++ python -u /opt/cloudera/cm-agent/service/common/get_property.py bootstrap.servers /var/run/cloudera-scm-agent/process/18315-kafka-KAFKA_BROKER-kafka_broker_rolling_restart_pre_check/rolling_restart_check_before_stop_admin_client_configs.properties
+ value=
+ eval 'BOOTSTRAP_SERVERS='\'''\'''
++ BOOTSTRAP_SERVERS=
+ [[ -z '' ]]
+ BOOTSTRAP_SERVERS=<broker>:6667
+ kafka-topics --at-min-isr-partitions --describe --bootstrap-server <broker>:6667 --command-config /var/run/cloudera-scm-agent/process/18315-kafka-KAFKA_BROKER-kafka_broker_rolling_restart_pre_check/rolling_restart_check_before_stop_admin_client_configs.properties
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndStartAgent(InstrumentationImpl.java:513)
    at java.instrument/sun.instrument.InstrumentationImpl.loadClassAndCallPremain(InstrumentationImpl.java:525)
Caused by: java.net.BindException: Address already in use
    at java.base/sun.nio.ch.Net.bind0(Native Method)
    at java.base/sun.nio.ch.Net.bind(Net.java:459)
    at java.base/sun.nio.ch.Net.bind(Net.java:448)
    at java.base/sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:227)
    at java.base/sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:80)
    at jdk.httpserver/sun.net.httpserver.ServerImpl.<init>(ServerImpl.java:142)
    at jdk.httpserver/sun.net.httpserver.HttpServerImpl.<init>(HttpServerImpl.java:50)
    at jdk.httpserver/sun.net.httpserver.DefaultHttpServerProvider.createHttpServer(DefaultHttpServerProvider.java:35)
    at jdk.httpserver/com.sun.net.httpserver.HttpServer.create(HttpServer.java:137)
    at io.prometheus.jmx.shaded.io.prometheus.client.exporter.HTTPServer$Builder.build(HTTPServer.java:365)
    at io.prometheus.jmx.common.http.HTTPServerFactory.createHTTPServer(HTTPServerFactory.java:123)
    at io.prometheus.jmx.JavaAgent.premain(JavaAgent.java:60)
    ... 6 more
*** java.lang.instrument ASSERTION FAILED ***: "result" with message agent load/premain call failed at ./src/java.instrument/share/native/libinstrument/JPLISAgent.c line: 422
/var/run/cloudera-scm-agent/process/18315-kafka-KAFKA_BROKER-kafka_broker_rolling_restart_pre_check/scripts/broker_rolling_restart_checker.sh: line 63: 24738 Aborted kafka-topics ${option} --describe --bootstrap-server "${BOOTSTRAP_SERVERS}" --command-config "${prop_file}" >> "${describe_partitions_output}"

As per my understanding, the error indicates that a port is already in use while the health checks of the Kafka topics are running. I assume it is complaining about 6667, since there is no clear mention of which port is the issue. However, a rolling restart without this configuration works fine, so I am confused whether this is really a failure of the kafka-topics health check.

Diagnosis / troubleshooting / findings: When we do a full restart of the Kafka brokers with this configuration in place, it works okay. Only the rolling restart fails.

We tried the Cloudera articles below, but had no luck.

First article: https://my.cloudera.com/knowledge/How-to-configure-JMX-access-for-Kafka-in-CDP?id=390204

Since we are on a non-SSL cluster, this was the configuration we tried while following the first link:

-server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+DisableExplicitGC -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.host=127.0.0.1 -Djava.rmi.server.hostname=127.0.0.1 -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false

Second article: https://docs.cloudera.com/cdp-private-cloud-base/7.1.9/kafka-configuring/topics/kafka-config-rolling-restart-client-conf.html

Since we are on a non-SSL cluster, this was the configuration we tried for testing while following the second link:

bootstrap.servers=<broker-1>:6667,<broker-2>:6667,<broker-3>:6667,<broker-4>:6667
security.protocol=SASL_PLAINTEXT
ssl.client.auth=none
sasl.mechanism=GSSAPI
sasl.kerberos.service.name=kafka
sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="/var/run/cloudera-scm-agent/process/18315-kafka-KAFKA_BROKER-kafka_broker_rolling_restart_pre_check/kafka.keytab" principal="<KAFKA Principal>";

The keytab location is hard coded for just one broker for testing, and we ensured that the keytab file exists at that location. We tried a rolling restart of this broker, but it failed. Without the above JMX configuration in the Advanced Configuration Snippet, the rolling restart works fine.

I am unsure what exactly has been configured incorrectly or what has been missed. Please share your thoughts / recommendations / suggestions.

Thanks
snm1523
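One possible reading of the trace, offered as an assumption rather than a confirmed diagnosis: the BindException is raised from io.prometheus.jmx.JavaAgent.premain, so the port that fails to bind may be the exporter port 9091 rather than 6667. That would fit a scenario where the kafka-topics pre-check picks up KAFKA_JMX_OPTS and tries to start a second agent while the broker still holds the port. A quick way to check whether a port is already taken on the broker host is to attempt a bind, as in this minimal sketch (the helper name port_in_use is made up for illustration):

```python
import errno
import socket


def port_in_use(host: str, port: int) -> bool:
    """Return True if binding host:port fails with 'Address already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other error (e.g. permission denied) is not what we test for
    return False


if __name__ == "__main__":
    # 9091 is the exporter port from the -javaagent option in the post.
    print("9091 in use:", port_in_use("0.0.0.0", 9091))
```

If the check reports the port as in use while the broker is running, any other JVM started with the same -javaagent option on that host would be expected to hit the same bind failure.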

snm1523 · ‎10-23-2024

Thank you for the help, Shubham. Thanks Snm1523

snm1523 · ‎10-21-2024

Hello, I am trying to set the DataNode balancer network bandwidth to 4 GB, since we have a good 25 GBPS network dedicated to the cluster. However, when I set it via Cloudera Manager, I get an alert stating that the maximum allowed is 1 GB. Is this hard coded, so that it cannot be increased beyond 1 GB via CM? Or is there a different way to do it? Thanks snm1523
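As a rough sanity check on the numbers, the balancer bandwidth (dfs.datanode.balance.bandwidthPerSec) is expressed in bytes per second, so the target can be compared with the link capacity. This sketch assumes that "4 GB" means 4 GiB per second per DataNode and that "25 GBPS" means 25 gigabits per second; both are interpretations of the post, not confirmed facts. It also prints the hdfs dfsadmin -setBalancerBandwidth command, which changes the value at runtime without persisting it; whether that is subject to the same 1 GB limit that CM reports is not established here.

```python
GIB = 1024 ** 3  # bytes in one GiB


def gib_to_bytes(gib: float) -> int:
    """Convert GiB to bytes, the unit used by the balancer bandwidth setting."""
    return int(gib * GIB)


def link_gbps_to_bytes_per_sec(gbps: float) -> float:
    """Convert a link speed in gigabits per second to bytes per second."""
    return gbps * 1e9 / 8


TARGET_BYTES = gib_to_bytes(4)                  # 4294967296
LINK_BYTES = link_gbps_to_bytes_per_sec(25)     # 3125000000.0
CM_LIMIT_BYTES = gib_to_bytes(1)                # the 1 GB ceiling reported by CM

print("target bytes/sec:", TARGET_BYTES)
print("link bytes/sec:  ", LINK_BYTES)
print("target exceeds link capacity:", TARGET_BYTES > LINK_BYTES)
print("runtime command: hdfs dfsadmin -setBalancerBandwidth", TARGET_BYTES)
```

Under these assumptions a 4 GiB/s balancer limit would be higher than what a single 25 Gbps link can carry, so a lower value may already saturate the network.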

snm1523 · ‎08-13-2024

So do you mean that there is no need for migration: just bring up the new server, assign the required roles, and then decommission the old one?

snm1523 · ‎08-13-2024

Thank you for the response. But it's an old post of mine, and I was able to get YARN QM working back then.

snm1523 · ‎08-07-2024

Got it @AyazHussain. I was unclear about the statement "namespace updated for NN for RM". In our cluster the namespaces are already updated, and apps reach the namespace instead of going to the NN directly. So that is okay.

Lastly, would you be able to comment on how to move the roles below from one server to another, and what precautions are needed? The goal is to decommission the old server.

- Atlas Server
- HBase REST Server
- HBase Thrift Server
- HDFS Balancer
- HDFS HttpFS
- Hive on Tez HiveServer2
- Hue Server
- Hue Kerberos Ticket Renewer
- Impala Daemon
- Livy Server

As per my understanding, we just need to add the new hosts and assign them the relevant roles. However, a few of these might also need data migration. Any comments on that?

Thanks
Snm1523

Last Visited	‎11-07-2025 08:17 AM

Member Since	‎10-29-2015 07:36 PM
Posts	128
Kudos received	31

Cloudera Community

Re: YARN and HDFS monitoring via Grafana

Re: Enable Admin account for Cloudera Manager

Re: Datanode not starting: SIGTERM error

Re: MKDirs failed to create file

What does entityName mean in CM API response?

Re: Ranger plugin / service access report - CDP 7....

Ranger plugin / service access report - CDP 7.1.9 ...

Re: Configure JMX Exporter (java agent) for Kafka ...

Configure JMX Exporter (java agent) for Kafka - CD...

Re: Datanode Balancer bandwidth configuration

Datanode Balancer bandwidth configuration

Re: Add / Remove servers in cluster - CDP PB

Re: Unable to start QueueManager WebApp

Re: Add / Remove servers in cluster - CDP PB