
Changes made to zookeeper-log4j through Ambari are not getting propagated

Contributor

I'm having issues with ZooKeeper being stopped a few seconds after startup. My problem, though, is that the changes made to the log4j properties through Ambari do not seem to get propagated. I'm trying to change rootLogger to DEBUG and change the log file name from the default to zookeeper.log. Here is the "Advanced zookeeper-log4j":

# DEFAULT: console appender only
#log4j.rootLogger=INFO, CONSOLE, ROLLINGFILE

# Example with rolling log file
log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE

# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE

#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=INFO
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n

#
# Add ROLLINGFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=DEBUG
log4j.appender.ROLLINGFILE.File={{zk_log_dir}}/zookeeper.log

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize={{zookeeper_log_max_backup_size}}MB

# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex={{zookeeper_log_number_of_backup_files}}

log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n

#
# Add TRACEFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=zookeeper_trace.log
log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L][%x] - %m%n
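For reference, the stock Apache ZooKeeper 3.4 log4j.properties parameterizes the root logger instead of hard-coding it; the -Dzookeeper.root.logger flag that zkServer.sh builds from ZOO_LOG4J_PROP only takes effect when the loaded file defers to it like this (a snippet of the stock default, quoted from memory):

# Stock default: zkEnv.sh/zkServer.sh pass ZOO_LOG4J_PROP through as
# -Dzookeeper.root.logger, and the properties file defers to it.
zookeeper.root.logger=INFO, CONSOLE
log4j.rootLogger=${zookeeper.root.logger}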

Yet after restarting, ps -ef | grep zookeeper shows the following:

zookeep+ 18598 1 4 23:49 ? 00:00:00 /usr/jdk64/jdk1.8.0_112/bin/java -Dzookeeper.log.dir=/xxx/zk/logs -Dzookeeper.log.file=zookeeper-zookeeper-server-xxxxxxxxxxx.log -Dzookeeper.root.logger=INFO,ROLLINGFILE .....

Note that neither DEBUG nor the new log file name (zookeeper.log) has been propagated by Ambari. Is there a property missing?
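One quick way to see exactly which flags and classpath the running server got is to dump its argument list (a minimal sketch; it assumes the stock ZooKeeper main class name QuorumPeerMain):

# Find the ZooKeeper server PID (QuorumPeerMain is the stock main class).
ZK_PID=$(pgrep -f QuorumPeerMain | head -1)

# Print the full command line, one argument per line; look for
# -Dzookeeper.root.logger and for which conf directory appears first on
# the classpath (log4j 1.x loads the first log4j.properties it finds there).
tr '\0' '\n' < /proc/"$ZK_PID"/cmdline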

Thanks, Alex

20 REPLIES

Hello Alex Goron

Hm, indeed, that's quite strange.
Is there any "Custom zookeeper-log4j" set? Or any config groups other than Default in use by ZooKeeper?
PS: I'm attaching my zookeeper-log4j.txt so you can compare it with yours, or even replace it 🙂
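If it helps, you can also read the config Ambari believes it has applied straight from its REST API and diff it against the file on disk (host, cluster name, and admin:admin credentials below are placeholders; adjust to your setup):

AMBARI=http://ambari-host:8080   # placeholder
CLUSTER=mycluster                # placeholder

# Which tag of zookeeper-log4j is currently the desired config?
curl -s -u admin:admin "$AMBARI/api/v1/clusters/$CLUSTER?fields=Clusters/desired_configs/zookeeper-log4j"

# Fetch that version of the config to compare with what is on the hosts:
curl -s -u admin:admin "$AMBARI/api/v1/clusters/$CLUSTER/configurations?type=zookeeper-log4j&tag=<tag-from-previous-call>"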
Hope this helps!

Contributor
@Vinicius Higa Murakami Thanks for helping me with that. Your file looks exactly the same as mine, line for line. And no other config groups are configured. I was hoping to get into debug mode to see why ZooKeeper is failing after startup on all but one node. But I'm not sure at this point whether there is any other way to look at the logs and find the reason.

Thanks,

Alex

Hi there @Alex Goron!

Hm, did you look at zookeeper-env.sh? Maybe there's something extra set there.
Btw, inside the ZooKeeper log4j file on the ZK hosts, can you see the changes applied for DEBUG?

Contributor

Hi @Vinicius Higa Murakami, I found log4j in two places. Since I'm relatively new to HDP, can you confirm which one is read during startup? The log4j.properties under /etc/zookeeper/2.6.2.0-205/0/ did reflect the changes. The one under /usr/hdp/2.6.2.0-205/etc/zookeeper/conf.dist/log4j.properties did not, and the property names look a bit different (with '.' in the names).

As for zookeeper-env.sh (the one under /etc/zookeeper/2.6.2.0-205/0/), it has my current log directory listed, which is also in line with the Ambari "Advanced zookeeper-env" field.

So assuming the config directory is /etc/zookeeper/2.6.2.0-205/0/, the good news is that it picks up changes from Ambari; however, it does not look like they are being read.
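For what it's worth, resolving the symlink chain and diffing the two copies makes it obvious which one is live (paths as above; the layout assumed is HDP's usual one):

# /usr/hdp/current/zookeeper-server/conf normally resolves to /etc/zookeeper/<version>/0:
readlink -f /usr/hdp/current/zookeeper-server/conf

# Compare the Ambari-managed copy with the packaged conf.dist one:
diff /etc/zookeeper/2.6.2.0-205/0/log4j.properties /usr/hdp/2.6.2.0-205/etc/zookeeper/conf.dist/log4j.properties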

Also, I noticed that new log entries are added on only one host (the one that does not fail after startup); the other two hosts, which stay up for 30 seconds or so after startup and then fail, do not get any new log entries even when I start them individually.

Thanks,

Alex

Hello @Alex Goron!
Wow, that's weird 😞
Okay, so answering your question: it should be the one under /etc/zookeeper. What you have under /usr/hdp/<VERSION>/zookeeper/ should point (via symbolic link) to /etc/zookeeper. Just like this:

[root@node3 kafka]# ls -ltrah /usr/hdf/current/zookeeper-server/
total 788K
-rw-r--r-- 1 root root 782K May 23 15:45 zookeeper-3.4.6.3.1.2.0-7.jar
drwxr-xr-x 2 root root  222 Jul 21 06:35 bin
drwxr-xr-x 5 root root   42 Jul 21 06:35 doc
drwxr-xr-x 3 root root   18 Jul 21 06:35 etc
drwxr-xr-x 3 root root   18 Jul 21 06:35 man
drwxr-xr-x 2 root root 4.0K Jul 21 06:35 lib
lrwxrwxrwx 1 root root   29 Jul 21 06:35 zookeeper.jar -> zookeeper-3.4.6.3.1.2.0-7.jar
lrwxrwxrwx 1 root root   26 Jul 21 06:35 conf -> /etc/zookeeper/3.1.2.0-7/0
drwxr-xr-x 7 root root  131 Jul 21 06:35 .
drwxr-xr-x 8 root root  107 Jul 21 06:45 ..

Just asking, but... do you think it's possible that other services installed on those two (failing) hosts rely on ZK outside of Ambari? E.g. Kafka, Atlas, HA of other components, and so on. If so, these services usually come with an embedded ZK for standalone/quickstart launches. Otherwise, we can check the ambari-agent logs to see if anything shows up.
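A quick way to spot a second ZooKeeper holding the ports would be something like this (a sketch; it assumes ss is available on those hosts):

# Is anything already bound to the ZK client/quorum ports?
ss -ltnp | grep -E ':(2181|2888|3888)'

# Any ZK-like JVMs started outside Ambari? (The [q] keeps grep from matching itself.)
ps -ef | grep -i '[q]uorumpeer'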
PS: one last thing, could you check whether any environment variable starting with ZOO is set on those two hosts? I'm also putting my zookeeper-env.sh below so you can compare it with yours 🙂


export JAVA_HOME={{java64_home}}
export ZOOKEEPER_HOME={{zk_home}}
export ZOO_LOG_DIR={{zk_log_dir}}
export ZOOPIDFILE={{zk_pid_file}}
export SERVER_JVMFLAGS={{zk_server_heapsize}}
export JAVA=$JAVA_HOME/bin/java
export CLASSPATH=$CLASSPATH:/usr/share/zookeeper/*


{% if security_enabled %}
export SERVER_JVMFLAGS="$SERVER_JVMFLAGS -Djava.security.auth.login.config={{zk_server_jaas_file}}"
export CLIENT_JVMFLAGS="$CLIENT_JVMFLAGS -Djava.security.auth.login.config={{zk_client_jaas_file}}"
{% endif %}
Hope this helps!

Contributor

Thanks @Vinicius Higa Murakami, the ZOO values are set in zookeeper-env.sh the same way as yours, just with real paths, and they look the same under the Ambari config properties.

On one of the failing ZK nodes I have: Knox, HMaster, Journal Node, Data Node, Node Manager, the HDFS client, and the stuff that I was forced to install by Ambari (Activity Explorer, HST Server, Metrics Collector, Grafana, HST Agent, Metrics Monitor)

On the other failing ZK node: Journal Node, ZooKeeper Failover Controller, Standby Name Node, Standby Resource Manager, Job History Server, HiveServer2, Hive Metastore, Data Node, Node Manager, Ambari Server, PostgreSQL, Spark/Spark2, Spark/Spark2 History Server, Phoenix Query Server, dependencies (Metrics Monitor, HST Agent, App Timeline Server); clients: HBase, HDFS, Hive, MR2, Pig, Slider, Spark, Tez, YARN, ZooKeeper

All the services are currently brought down for troubleshooting purposes. Though ZooKeeper should start up clean first.

Thanks,

Alex

Got it. And what about the ambari-agent log on those hosts? Is it showing anything, like errors or warnings?
And could you check the first lines of the ZooKeeper out and log files?

head -1000 /var/log/zookeeper/zookeeper.log
head -1000 /var/log/zookeeper/zookeeper-zookeeper-server-<ZK-HOST>.out
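And for the agent side, something like this usually surfaces the interesting lines (the standard ambari-agent log path is assumed):

# Pull recent errors/warnings out of the agent log:
grep -iE 'error|warn' /var/log/ambari-agent/ambari-agent.log | tail -50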

Contributor

@Vinicius Higa Murakami (attached: zk-first-last-1000-log-out.tar.gz)

The ambari-agent log was mostly complaining about the node managers, which is legit since they are down. I put all the services into maintenance mode in hopes of suppressing that 'legit' noise. I have not seen anything suspicious yet, for ZooKeeper specifically or otherwise.

As for zookeeper.log and zookeeper.out on the failing nodes, as I said earlier, they stopped getting new entries during ZooKeeper startup. The last entries observed in those files were from July 19th. Just in case, I'm attaching the first and last 1000 lines of the log and out files from one of the failing nodes. But all in all, I'm not sure that information is relevant for now.
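One thing worth ruling out when a log file simply stops growing is ownership and permissions on the log directory (using the /xxx/zk/logs path from the ps output above):

# The zookeeper user must be able to write here; check the dir and its files.
ls -la /xxx/zk/logs
stat -c '%U %G %a %n' /xxx/zk/logs/*.log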

Thanks,

Alex

Hi @Alex Goron!
Could you check whether your ZK nodes have the right correlated myid?
Here's mine, e.g.:

[root@node3 ~]# cat /usr/hdf/current/zookeeper-server/conf/zoo.cfg 


#
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#
#
#


clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=3000
dataDir=/hadoop/zookeeper
autopurge.snapRetainCount=30
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888


[root@node3 ~]# cat /hadoop/zookeeper/myid 
3
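And to verify all three members at once, a small loop like this works (a sketch; it assumes the dataDir shown above and SSH access to node1..node3):

# Print each host's myid; it must match that host's server.N line in zoo.cfg.
for h in node1 node2 node3; do
  echo "== $h =="
  ssh "$h" 'cat /hadoop/zookeeper/myid'
done
grep '^server\.' /usr/hdf/current/zookeeper-server/conf/zoo.cfg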

Hope this helps!