Member since: 03-14-2016
Posts: 4721
Kudos Received: 1111
Solutions: 874
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2822 | 04-27-2020 03:48 AM |
| | 5474 | 04-26-2020 06:18 PM |
| | 4642 | 04-26-2020 06:05 PM |
| | 3697 | 04-13-2020 08:53 PM |
| | 5600 | 03-31-2020 02:10 AM |
08-19-2018
10:40 AM
1 Kudo
Sometimes it is desirable to have the logs rotated as well as compressed. We can use log4j extras in order to achieve this. For processes like the NameNode / DataNode, etc., we can use the approach described in this article: https://community.hortonworks.com/articles/50058/using-log4j-extras-how-to-rotate-as-well-as-zip-th.html

However, when we try to use the same approach in Ambari 2.6 for Ambari Metrics Collector log compression and rotation, it does not work, and we might see warnings / errors in the collector logs, something like the following:

log4j:WARN Failed to set property [triggeringPolicy] to value "org.apache.log4j.rolling.SizeBasedTriggeringPolicy".
log4j:WARN Failed to set property [rollingPolicy] to value "org.apache.log4j.rolling.FixedWindowRollingPolicy".
log4j:WARN Please set a rolling policy for the RollingFileAppender named 'file'
log4j:ERROR No output stream or file set for the appender named [file].

(OR)

log4j:ERROR A "org.apache.log4j.rolling.SizeBasedTriggeringPolicy" object is not
assignable to a "org.apache.log4j.rolling.RollingPolicy" variable.
log4j:ERROR The class "org.apache.log4j.rolling.RollingPolicy" was loaded by
log4j:ERROR [sun.misc.Launcher$AppClassLoader@2328c243] whereas object of type
log4j:ERROR "org.apache.log4j.rolling.SizeBasedTriggeringPolicy" was loaded by [sun.misc.Launcher$AppClassLoader@2328c243].

This is because of a bug reported as https://bz.apache.org/bugzilla/show_bug.cgi?id=36384, which says that in some older versions of log4j these rolling policies were not configurable via log4j.properties (they were only configurable via log4j.xml). The fix for this bug added a feature to log4j so that "Configuring triggering/rolling policies should be supported through properties". Hence you will need to make sure that you are using the log4j JAR of version "log4j-1.2.17.jar" (instead of "log4j-1.2.15.jar").

So if users want to use the rotation and zipping features of log4j, make sure that your AMS collector is not using the old version of log4j. This article describes a workaround, so follow this suggestion at your own risk, because here we are going to change the default log4j JAR shipped with the AMS collector lib:

# mv /usr/lib/ambari-metrics-collector/log4j-1.2.15.jar /tmp/
# cp -f /usr/lib/ams-hbase/lib/log4j-1.2.17.jar /usr/lib/ambari-metrics-collector/

Now also make sure to copy "log4j-extras-1.2.17.jar" to the Ambari Metrics Collector host, which provides the various log rotation policies:

# mkdir /tmp/log4j_extras
# curl http://apache.mirrors.tds.net/logging/log4j/extras/1.2.17/apache-log4j-extras-1.2.17-bin.zip -o /tmp/log4j_extras/apache-log4j-extras-1.2.17-bin.zip
# cd /tmp/log4j_extras
# unzip apache-log4j-extras-1.2.17-bin.zip
# cp -f /tmp/log4j_extras/apache-log4j-extras-1.2.17/apache-log4j-extras-1.2.17.jar /usr/lib/ambari-metrics-collector/

Users also need to edit "ams-log4j" via Ambari to add the customized appender:
Ambari UI --> Ambari Metrics --> Configs --> Advanced --> "Advanced ams-log4j" --> ams-log4j template (text area)

OLD default value (please comment out the following):

# Direct log messages to a log file
#log4j.appender.file=org.apache.log4j.RollingFileAppender
#log4j.appender.file.File=${ams.log.dir}/${ams.log.file}
#log4j.appender.file.MaxFileSize={{ams_log_max_backup_size}}MB
#log4j.appender.file.MaxBackupIndex={{ams_log_number_of_backup_files}}
#log4j.appender.file.layout=org.apache.log4j.PatternLayout
#log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

New appender config:

log4j.appender.file=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.file.rollingPolicy=org.apache.log4j.rolling.FixedWindowRollingPolicy
log4j.appender.file.rollingPolicy.maxIndex={{ams_log_number_of_backup_files}}
log4j.appender.file.rollingPolicy.ActiveFileName=${ams.log.dir}/${ams.log.file}
log4j.appender.file.rollingPolicy.FileNamePattern=${ams.log.dir}/${ams.log.file}-%i.gz
log4j.appender.file.triggeringPolicy=org.apache.log4j.rolling.SizeBasedTriggeringPolicy
log4j.appender.file.triggeringPolicy.MaxFileSize=10240000
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

Notice: Here, for testing, we are hard-coding the value of the property "log4j.appender.file.triggeringPolicy.MaxFileSize" to "10240000" (around 10 MB), because the triggering policy does not accept values in KB/MB format (like 10KB / 10MB); hence we specify the value in bytes. Users can define their own value there.

After we restart the AMS collector service, we should be able to see the Ambari Metrics Collector log rotation as follows:

# cd /var/log/ambari-metrics-collector/
# ls -larth ambari-metrics-collector.lo*
-rw-r--r--. 1 ams hadoop 453K Aug 19 10:16 ambari-metrics-collector.log-4.gz
-rw-r--r--. 1 ams hadoop 354K Aug 19 10:17 ambari-metrics-collector.log-3.gz
-rw-r--r--. 1 ams hadoop 458K Aug 19 10:20 ambari-metrics-collector.log-2.gz
-rw-r--r--. 1 ams hadoop 497K Aug 19 10:22 ambari-metrics-collector.log-1.gz
-rw-r--r--. 1 ams hadoop 9.1M Aug 19 10:25 ambari-metrics-collector.log
05-31-2018
11:36 AM
There was a slight typo in the above article, which was identified and fixed as part of feedback provided by an HCC user on this thread: https://community.hortonworks.com/questions/194177/kafka-best-practices-kafka-jvm-performance-opts.html
03-21-2018
11:15 PM
@Dharmesh Jain Thanks for sharing the findings. It was really a very keen observation. Wonderful !!! I have incorporated the changes as you suggested above.
02-26-2018
07:49 AM
3 Kudos
In this article we will see how to add sample users and enable password-based authentication for the Zeppelin UI. By default, when we access Zeppelin, we are able to access it as the "anonymous" user (meaning users are not challenged to provide credentials).
.
1. Login to the Ambari UI and then navigate to:
Ambari UI --> Zeppelin Notebook --> Configs --> Advanced --> Advanced zeppelin-shiro-ini

2. Then add the users inside the "[users]" section as follows:

[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / ...) check the shiro doc at http://shiro.apache.org/configuration.html
#Configuration-INISections
admin = admin
user1=user1pwd
user2=user2pwd

3. Also edit the "[urls]" section and add "authcBasic" as follows, to specify which URL patterns need to be protected:

[urls]
# anon means the access is anonymous.
# authcBasic means Basic Auth Security
# To enforce security, comment the line below and uncomment the next one
/api/version = anon
#/** = anon
/** = authcBasic

4. Restart Zeppelin. When users then try accessing the Zeppelin Notebook UI, they will see that it presents a Basic Authentication window to enter a username & password:
http://$ZEPPELIN_HOST:9995/#/
.
NOTE: Zeppelin can also be configured to leverage an organization's Active Directory infrastructure for user authentication. That way, existing Active Directory users can log in to the Zeppelin UI using their Active Directory credentials. To enable Active Directory based authentication for Zeppelin, refer to the following article:
https://community.hortonworks.com/articles/70392/how-to-configure-zeppelin-for-active-directory-use.html
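To confirm the configuration took effect without opening a browser, a quick check from the command line can compare the HTTP status with and without credentials. This is a sketch: the hostname is a placeholder, and user1/user1pwd are the sample credentials from the [users] section above.

```shell
#!/bin/sh
# Placeholder host; replace with your Zeppelin server.
ZEPPELIN_URL="http://zeppelin.example.com:9995/api/version"

# Without credentials we now expect a 401/403 instead of 200.
anon_code=$(curl -s -o /dev/null -w "%{http_code}" "$ZEPPELIN_URL")

# With the sample user added above, the request should succeed (200).
auth_code=$(curl -s -o /dev/null -w "%{http_code}" -u user1:user1pwd "$ZEPPELIN_URL")

echo "anonymous: $anon_code, authenticated: $auth_code"
if [ "$anon_code" != "200" ] && [ "$auth_code" = "200" ]; then
  echo "Basic authentication is enforced."
fi
```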
09-08-2017
02:32 AM
@Hajime I just tried downloading the "4811-custom-alerts.zip" and I can extract it properly. How are you downloading it?

$ md5 4811-custom-alerts.zip
MD5 (4811-custom-alerts.zip) = a33f105860d07e05149b68960a0ea0c9
09-02-2017
07:19 PM
22 Kudos
Ambari is the heart of any HDP cluster. It provides the features of provisioning, managing, monitoring, and securing Hadoop / HDP clusters. It is a Java program which interacts with a database to read the cluster details and runs on an embedded Jetty server. Many times we find issues with Ambari server performance.
Ambari UI operations sometimes respond slowly, or the startup might take a long time if the server is not properly tuned. So, in order to troubleshoot Ambari server performance related issues, we should look at some data/stats and tuning parameters to make the Ambari server perform better. In this article we will talk about some very basic tuning parameters and performance-related troubleshooting.
What information is needed?
When we notice that the Ambari server is responding slowly, we should first look at the following details:
1). The number of hosts added to the ambari cluster. So that accordingly we can tune the ambari agent thread pools.
2). The number of concurrent users (or the view users) who access the ambari server at a time. So that accordingly we can tune the ambari thread pools.
3). The age of the Ambari cluster. If the Ambari server is very old, then possibly some of the operational logs and the alert histories are consuming a large amount of space in the database, which might be causing Ambari DB queries to respond slowly.
4). The Ambari database health and its geographic location relative to the Ambari server, to isolate whether there are any network delays.
5). Ambari server memory related tuning parameters, to see if the Ambari heap is set correctly.
6). For Ambari UI slowness, we should check for network proxy issues, to see if there are any network proxies added between the client and the Ambari server machine, or general network slowness.
7). Whether the Ambari users are synced with AD or an external LDAP, and whether the communication between the server and the AD/LDAP is good.
8). Also the resource availability on the Ambari host, like the available free memory, and whether any other service/component running on the Ambari server is consuming excessive Memory/CPU/IO.
.
How to Troubleshoot?
Usually we start with checking the ambari server memory settings, host level resource availability (Like: Memory/CPU/IO) and the thread dumps to see where the threads are stuck or taking long time to execute certain api/database calls.
.
Check-1). We will check the ambari-server log to see if there are any repeated warning or error messages.
.
Check-2). We should check whether the ambari-server host has enough free memory and CPU available, as well as the list of open files (to see if there is any leak) and the netstat output (to find out if there are many CLOSE_WAIT or TIME_WAIT sockets). We can check these by running the following commands on the Ambari server host.
Example:
# free -m
# top
# lsof -p $AMBARI_PID
# netstat -tnlpa | grep $AMBARI_PID
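The netstat check above can be made more actionable by summarizing socket counts per TCP state in one pipeline. This is a sketch; the pid file path is the default Ambari location, and the column layout assumes Linux netstat output:

```shell
#!/bin/sh
# Count TCP sockets per state for the Ambari server process.
# A steadily growing CLOSE_WAIT count usually indicates a socket leak.
AMBARI_PID=$(cat /var/run/ambari-server/ambari-server.pid)
netstat -tnpa 2>/dev/null | grep "$AMBARI_PID/" \
  | awk '{print $6}' | sort | uniq -c | sort -rn
```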
.
Check-3). If we see that enough free memory and CPU cycles are available, then we can check whether the thread dump shows any stuck/blocked threads, or whether the thread activities are normal.
In order to do that, we can collect ambari-server thread dumps. We can refer to the following article to learn how to collect the Ambari server thread dumps. We can use JVM utilities like "$JAVA_HOME/bin/jcmd" or "$JAVA_HOME/bin/jstack" to do so.
https://community.hortonworks.com/articles/72319/how-to-collect-threaddump-using-jcmd-and-analyse-i.html
It is always recommended to collect at least 5-6 thread dumps in some interval like 10 seconds after each thread dump. This gives us a detailed idea about the thread activities during a period of time. The thread dump should be collected when we see the slow response from the ambari server else the thread dumps will show normal behavior.
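The collection routine described above can be scripted so all dumps land in one place with timestamps. A minimal sketch, assuming jstack is available under $JAVA_HOME and the default Ambari pid file location:

```shell
#!/bin/sh
# Capture 6 thread dumps at 10-second intervals while the slowness occurs.
AMBARI_PID=$(cat /var/run/ambari-server/ambari-server.pid)
for i in 1 2 3 4 5 6; do
  "$JAVA_HOME/bin/jstack" "$AMBARI_PID" \
    > "/tmp/ambari-threaddump-${i}-$(date +%Y%m%d%H%M%S).txt"
  sleep 10
done
```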
.
Check-4). Sometimes we may encounter OutOfMemoryError in ambari-server log as following which indicates that ambari server Heap size is not tuned properly or it needs to be increased a bit more:
Exception in thread "qtp-ambari-agent-91" java.lang.OutOfMemoryError: Java heap space
There are some recommendations available for ambari server heap tuning based on the cluster size as part of the doc that can be used for heap tuning: https://docs.hortonworks.com/HDPDocuments/Ambari-2.6.0.0/bk_ambari-administration/content/ch_tuning_ambari_performance.html
.
We should also check the current memory utilization statistics of the ambari server. We can use the JVM utility "jmap" for the same.
Example:
/usr/jdk64/jdk1.8.0_112/bin/jmap -heap $AMBARI_SERVER_PID
Output:
# /usr/jdk64/jdk1.8.0_112/bin/jmap -heap `cat /var/run/ambari-server/ambari-server.pid`
Attaching to process ID 673, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.112-b15
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
Heap Configuration:
MinHeapFreeRatio = 40
MaxHeapFreeRatio = 70
MaxHeapSize = 2147483648 (2048.0MB)
NewSize = 134217728 (128.0MB)
MaxNewSize = 536870912 (512.0MB)
OldSize = 402653184 (384.0MB)
NewRatio = 3
SurvivorRatio = 8
MetaspaceSize = 21807104 (20.796875MB)
CompressedClassSpaceSize = 1073741824 (1024.0MB)
MaxMetaspaceSize = 17592186044415 MB
G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 120848384 (115.25MB)
used = 78420056 (74.78719329833984MB)
free = 42428328 (40.462806701660156MB)
64.89127401157471% used
Eden Space:
capacity = 107479040 (102.5MB)
used = 72431960 (69.07649993896484MB)
free = 35047080 (33.423500061035156MB)
67.39170725752668% used
From Space:
capacity = 13369344 (12.75MB)
used = 5988096 (5.710693359375MB)
free = 7381248 (7.039306640625MB)
44.7897518382353% used
To Space:
capacity = 13369344 (12.75MB)
used = 0 (0.0MB)
free = 13369344 (12.75MB)
0.0% used
concurrent mark-sweep generation:
capacity = 402653184 (384.0MB)
used = 87617376 (83.55844116210938MB)
free = 315035808 (300.4415588378906MB)
21.760010719299316% used
37359 interned Strings occupying 3641736 bytes.
.
If the used heap is high and reaching the max heap, then we can try increasing the ambari-server memory by editing the "/var/lib/ambari-server/ambari-env.sh" file and increasing the heap size (e.g., -Xmx4g) inside the "AMBARI_JVM_ARGS" property, something like the following:
# grep 'AMBARI_JVM_ARGS' /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms4g -Xmx4g -XX:MaxPermSize=128m -Djava.security.auth.login.config=$ROOT/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false'
.
Check-5). If we want to monitor heap and garbage collection details over a period of time then we can also enable the Garbage Collection logging for the ambari server by adding the GC log option in ambari "ambari-env.sh" file as following:
# grep 'AMBARI_JVM_ARGS' /var/lib/ambari-server/ambari-env.sh
export AMBARI_JVM_ARGS=$AMBARI_JVM_ARGS' -Xms512m -Xmx2048m -XX:MaxPermSize=128m -Djava.security.auth.login.config=$ROOT/etc/ambari-server/conf/krb5JAASLogin.conf -Djava.security.krb5.conf=/etc/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Xloggc:/var/log/ambari-server/ambari-server_gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps'
.
.
Ambari JVM/Database Monitoring using Grafana
Check-6). From Ambari 2.5 onward, We can also check the ambari performance statistics related to ambari jvm and database. For more information on this please refer to: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.1.0/bk_ambari-operations/content/grafana_ambari_component_dashboards.html
http://$GRAFANA_HOST:3000/dashboard/db/ambari-server-jvm
http://$GRAFANA_HOST:3000/dashboard/db/ambari-server-database
.
If the ambari server metrics are not enabled then we can enable it. To enable Ambari Server metrics, make sure the following config file exists during Ambari Server start/restart - "/etc/ambari-server/conf/metrics.properties".
Currently, only 2 metric sources have been implemented - the JVM Metric Source and the Database Metric Source. To add / remove a metric source to be tracked, the following config needs to be modified in the metrics.properties file:
metric.sources=jvm,database
Example:
# grep 'metric.sources' /etc/ambari-server/conf/metrics.properties
metric.sources=jvm,database
.
NOTE: Please do not forget to add the following line inside the "ambari.properties" file.
# grep 'profiler' /etc/ambari-server/conf/ambari.properties
server.persistence.properties.eclipselink.profiler=org.apache.ambari.server.metrics.system.impl.AmbariPerformanceMonitor
.
.
Ambari Thread Pool Tuning
Check-7). If the cluster size is large then we should also tune the "agent.threadpool.size.max" property inside the "/etc/ambari-server/conf/ambari.properties" file.
"agent.threadpool.size.max" : property sets max number of threads used to process heartbeats from ambari agents. The default value for this property is "25". This basically indicates the size of the Jetty connection pool used for handling incoming Ambari Agent requests.
# grep 'agent.threadpool.size.max' /etc/ambari-server/conf/ambari.properties
agent.threadpool.size.max=50
.
.
Check-8). If our Ambari server has some views (like the Hive/File View, etc.) which are accessed by many concurrent users, or if there are many users accessing the Ambari UI concurrently or making Ambari REST API calls, then in such cases we should also increase the "client.threadpool.size.max" property value (default value is 25) inside "/etc/ambari-server/conf/ambari.properties".
"client.threadpool.size.max" : The size of the Jetty connection pool used for handling incoming REST API requests. This should be large enough to handle requests from both web browsers and embedded Views.
# grep 'client.threadpool.size.max' /etc/ambari-server/conf/ambari.properties
client.threadpool.size.max=100
If the client thread pool size is not set properly then while accessing ambari UI or making Ambari API calls we might see the following kind of response:
{
status: 503,
message: "There are no available threads to handle view requests"
}
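A quick way to detect this condition from a script is to look at the HTTP status code alone. This is a sketch; the host and credentials are placeholders:

```shell
#!/bin/sh
# Placeholder host; replace with your Ambari server.
AMBARI_URL="http://ambari.example.com:8080/api/v1/clusters"
code=$(curl -s -o /dev/null -w "%{http_code}" -u admin:admin \
  -H 'X-Requested-By: ambari' "$AMBARI_URL")
if [ "$code" = "503" ]; then
  echo "View/REST thread pool exhausted - consider raising client.threadpool.size.max"
else
  echo "HTTP $code"
fi
```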
.
.
Ambari Connection Pool Tuning.
Check-9). We can also add the following properties to adjust the JDBC connection pool settings for large clusters (e.g., above 100 nodes), or based on need:
server.jdbc.connection-pool.acquisition-size=5
server.jdbc.connection-pool.max-age=0
server.jdbc.connection-pool.max-idle-time=14400
server.jdbc.connection-pool.max-idle-time-excess=0
server.jdbc.connection-pool.idle-test-interval=7200
- If using MySQL as the Ambari database, in your MySQL configuration, increase wait_timeout and interactive_timeout to 8 hours (28800) and max_connections from 32 to 128.
- It is critical that the Ambari settings "server.jdbc.connection-pool.max-idle-time" and "server.jdbc.connection-pool.idle-test-interval" be lower than the MySQL "wait_timeout" and "interactive_timeout". If you choose to decrease these timeout values, adjust "server.jdbc.connection-pool.max-idle-time" and "server.jdbc.connection-pool.idle-test-interval" down accordingly in the Ambari configuration so that they remain less than "wait_timeout" and "interactive_timeout".
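The MySQL side of this recommendation can be set in my.cnf. An illustrative fragment only, reflecting the values suggested above; adjust for your environment:

```ini
# /etc/my.cnf (illustrative fragment)
[mysqld]
wait_timeout=28800          # 8 hours
interactive_timeout=28800   # 8 hours
max_connections=128         # raised from 32, per the recommendation above
```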
.
.
Ambari Cache Tuning
Check-10). If the cluster size is more than 200 nodes, then tuning the cache sometimes helps. For that we calculate the new, larger cache size using the following relationship, where <cluster_size> is the number of nodes in the cluster:
ecCacheSizeValue=60*<cluster_size>
To apply the property: on the Ambari Server host, in /etc/ambari-server/conf/ambari.properties, add the following property and value. If the cluster has 500 nodes, then we can set it to:
server.ecCacheSize=30000
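The relationship above is simple enough to compute inline (500 nodes is the example size used above):

```shell
#!/bin/sh
# ecCacheSize = 60 * <cluster_size>
cluster_size=500
echo "server.ecCacheSize=$((60 * cluster_size))"
# prints: server.ecCacheSize=30000
```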
.
.
Ambari Alert Related Tuning
Check-11). Setting "alerts.cache.enabled": if the value of this property is set to "true", then alerts processed by the "AlertReceivedListener" will not write alert data to the database on every event. Instead, data like timestamps and text will be kept in a cache and flushed out periodically to the database. The default value is "false". Alert caching was experimental around the Ambari 2.2.2 version.
We can enable the alerts cache and then monitor it for a few days to see its effect. We will need to add this parameter to "/etc/ambari-server/conf/ambari.properties". Some other properties related to alert caching & the alert execution scheduler are as follows.
Example:
alerts.cache.enabled=true
alerts.cache.size=100000
alerts.execution.scheduler.threadpool.size.core=4
alerts.execution.scheduler.threadpool.size.max=8
The "alerts.cache.size" defines the size of the alert cache; it is set to "50000" by default when "alerts.cache.enabled" is true.
"alerts.execution.scheduler.threadpool.size.core" defines the core number of threads used to process incoming alert events. The value should be increased as the size of the cluster increases.
"alerts.execution.scheduler.threadpool.size.max" defines the maximum number of threads which will handle published alert events. Default value is "2".
.
.
Ambari API Response Time Check
Check-12). During Ambari slowness, we can try running the following curl call (which tries to fetch the cluster details) to see how much time it takes to get the cluster details. This gives us some idea of whether the cluster JSON response is taking a long time to generate or is too large.
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster
real 0m20.234s
user 0m0.009s
sys 0m0.017s
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster?fields=Clusters/desired_configs
# time curl -i -u admin:admin -H 'X-Requested-By: ambari' -X GET http://amb25101.example.com:8080/api/v1/clusters/plain_cluster?fields=Clusters/health_report,Clusters/total_hosts,alerts_summary_hosts
"user" means userspace, so the number of CPU seconds spent doing work in the JVM code. User is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only actual CPU time used in executing the process. Other processes and time the process spends blocked do not count towards this figure.
"sys" means kernel-space, so the number of cpu-seconds spent doing work in the kernel. Sys is the amount of CPU time spent in the kernel within the process. This means executing CPU time spent in system calls within the kernel, as opposed to library code, which is still running in user-space. Like 'user', this is only CPU time used by the process.
"real" means "wall clock" time. This is all elapsed time, including time slices used by other processes and time the process spends blocked (for example, if it is waiting for I/O to complete).
Example: ["user=3.00 sys=0.05 real=1.00"] means there was:
>>> 50ms of kernel work,
>>> 3s of JVM work, and
>>> overall it took 1 second (possible on a multi-core machine, where several threads accumulate CPU time in parallel).
.
.
Ambari Database Query Logging
Check-13). In some cases it is useful to enable database query logging, to find out how the queries are being executed and how often each query is executed.
We can enable the "server.jdbc.properties.loglevel=2" property inside the "/etc/ambari-server/conf/ambari.properties" file and restart the Ambari server, which will start writing the JDBC queries to the "/var/log/ambari-server/ambari-server.out" file.
# grep 'server.jdbc.properties.loglevel' /etc/ambari-server/conf/ambari.properties
server.jdbc.properties.loglevel=2
.
Example output of logged queries from ambari-server.out
# grep 'SELECT alert_' ambari-server.out
16:17:19.432 (3) FE=> Parse(stmt=null,query="SELECT alert_id, alert_definition_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name FROM alert_history WHERE (alert_id = $1)",oids={20})
16:17:19.439 (6) FE=> Parse(stmt=null,query="SELECT alert_id, alert_definition_id, alert_instance, alert_label, alert_state, alert_text, alert_timestamp, cluster_id, component_name, host_name, service_name FROM alert_history WHERE (alert_id = $1)",oids={20})
16:26:38.424 (3) FE=> Parse(stmt=null,query="SELECT t1.alert_id AS a1, t1.definition_id AS a2, t1.firmness AS a3, t1.history_id AS a4, t1.latest_text AS a5, t1.latest_timestamp AS a6, t1.maintenance_state AS a7, t1.occurrences AS a8, t1.original_timestamp AS a9 FROM alert_history t0, alert_definition t2, alert_current t1 WHERE ((((t0.cluster_id = $1) AND (t2.definition_name = $2)) AND (t0.host_name = $3)) AND ((t0.alert_id = t1.history_id) AND (t2.definition_id = t0.alert_definition_id))) LIMIT $4 OFFSET $5",oids={20,1043,1043,23,23})
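Once query logging is on, the ambari-server.out file can be mined to see which tables the JDBC queries hit most often. A sketch; the regex assumes the logged query format shown above and picks out the first table named after each FROM:

```shell
#!/bin/sh
# Rank tables by how often they appear in logged SELECT statements.
grep -o 'FROM [A-Za-z_]*' /var/log/ambari-server/ambari-server.out \
  | awk '{print $2}' | sort | uniq -c | sort -rn | head
```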
.
.
Ambari Database Query/Performance Monitor
Check-14). In some cases it is also useful to enable the "QueryMonitor" and "PerformanceMonitor" statistics. The "QueryMonitor" is used to measure query executions and cache hits, which can be useful for performance analysis in a complex system. For batch writing, the "batch-writing.size" value is the number of statements to batch (default: 100).
Instead of the "QueryMonitor", we can also use the native EclipseLink "PerformanceMonitor" to count how many queries are actually hitting the DB. The performance monitor and query monitor can be enabled in Ambari through "/etc/ambari-server/conf/ambari.properties" using the below properties (note that the "profiler" property is set to either PerformanceMonitor or QueryMonitor, not both):
Example:
server.persistence.properties.eclipselink.profiler=PerformanceMonitor
server.persistence.properties.eclipselink.jdbc.batch-writing.size=25
server.persistence.properties.eclipselink.profiler=QueryMonitor
In order to know more about how to use them properly, we can refer to the following article: https://community.hortonworks.com/articles/73269/how-to-analyze-the-ambari-servers-db-activity-perf.html
.
.
Ambari Database Cleanup / Purge
Check-15). In some old clusters we see that there are lots of old "alert_history" entries or old alert notification data present in the database, which causes slowness, as over time these entries grow considerably in the database. So the DB dump size also grows, and DB queries can respond slowly. We can use the following command to perform some DB cleanup:
# ambari-server db-cleanup -d 2016-09-30 --cluster-name=MyCluster
For more details on this refer to: https://community.hortonworks.com/articles/134958/ambari-database-cleanup-speed-up.html
https://issues.apache.org/jira/browse/AMBARI-20687
The db-cleanup option works well from Ambari 2.5.0/2.5.1 onward (in Ambari 2.4 there were some issues reported).
.
From Ambari 2.5.2 onwards: the name of this operation is changed to "db-purge-history", and apart from the alert-related tables it also considers other tables like host_role_command and execution_commands.
# ambari-server db-purge-history --cluster-name Prod --from-date 2017-08-01
See: https://docs.hortonworks.com/HDPDocuments/Ambari-2.5.2.0/bk_ambari-administration/content/purging-ambari-server-history.html
The "db-purge-history" command will analyze the following tables in the Ambari Server database and remove those rows that can safely be deleted, i.e., those with a create date before the --from-date specified when the command is run.
.
AlertCurrent
AlertNotice
ExecutionCommand
HostRoleCommand
Request
RequestOperationLevel
RequestResourceFilter
RoleSuccessCriteria
Stage
TopologyHostRequest
TopologyHostTask
TopologyLogicalTask
.
.
08-29-2017
09:29 AM
1 Kudo
Many times we see some repeated logging inside our log files. For example, in the case of ambari-server.log, we see the following kind of repeated logging:

WARNING: A HTTP GET method, public javax.ws.rs.core.Response org.apache.ambari.server.api.services.StacksService.getStackArtifacts(java.lang.String,javax.ws.rs.core.HttpHeaders,javax.ws.rs.core.UriInfo,java.lang.String,java.lang.String), should not consume any entity.

We might see the above kind of warning message repeated many times:

# grep -c 'public javax.ws.rs.core.Response org.apache.ambari.server.api.services.RequestService.getRequests' /var/log/ambari-server/ambari-server.log
150

- These are actually harmless WARNING messages, but many times it is desired to make sure that they are not logged; that way we can save some disk space and have a clean log.
- It is not always possible to change the rootLogger to "ERROR" (as below) to avoid printing some INFO/WARNING messages, because that would also suppress other useful INFO/WARNING messages:

log4j.rootLogger=ERROR,file

- Instead, we want to avoid logging a few specific log entries based on their content, irrespective of the logging level (INFO/WARNING/ERROR/DEBUG) those entries come from.
- In this case, suppose we do not want to log any line which contains "public javax.ws.rs.core.Response" anywhere in it; then we can make use of the StringMatchFilter feature of log4j as follows.
.
Step-1). Edit "/etc/ambari-server/conf/log4j.properties" and add the following 3 lines just below the "file" log appender:

log4j.appender.file.filter.01=org.apache.log4j.varia.StringMatchFilter
log4j.appender.file.filter.01.StringToMatch=public javax.ws.rs.core.Response
log4j.appender.file.filter.01.AcceptOnMatch=false

Now the log4j.properties file appender will look like the following:

# Direct log messages to a log file
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=${ambari.log.dir}/${ambari.log.file}
log4j.appender.file.MaxFileSize=80MB
log4j.appender.file.MaxBackupIndex=60
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{DATE} %5p [%t] %c{1}:%L - %m%n
log4j.appender.file.filter.01=org.apache.log4j.varia.StringMatchFilter
log4j.appender.file.filter.01.StringToMatch=public javax.ws.rs.core.Response
log4j.appender.file.filter.01.AcceptOnMatch=false

NOTE: We can use as many filters as we want; we only need to change the filter number, like "log4j.appender.file.filter.01", "log4j.appender.file.filter.02", "log4j.appender.file.filter.03", with different "StringToMatch" values.

Step-2). Move the OLD ambari-server logs and restart the ambari-server:

# mv /var/log/ambari-server /var/log/ambari-server_OLD
# ambari-server restart
.
Step-3). Put the ambari-server.log in tail after the restart to confirm that the matching line entries are gone from the ambari-server.log; you should not see those lines again:

# grep 'public javax.ws.rs.core.Response org.apache.ambari.server.api.services.RequestService.getRequests' /var/log/ambari-server/ambari-server.log
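The effect of the filter can also be confirmed by counting the matches; a zero count means the filter is suppressing the lines. A minimal sketch using the default log location:

```shell
#!/bin/sh
LOG=/var/log/ambari-server/ambari-server.log
# grep -c prints 0 (and exits non-zero) when nothing matches, so guard with || true.
matches=$(grep -c 'public javax.ws.rs.core.Response' "$LOG" || true)
if [ "$matches" -eq 0 ]; then
  echo "Filter is working: no matching lines logged."
else
  echo "Still seeing $matches matching lines."
fi
```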
08-28-2017
06:49 AM
@Rajesh Wonderful article. Just added code block for the commands.