Support Questions

Find answers, ask questions, and share your expertise

How to determine Java heap size on Kafka machines



We have a Hadoop cluster (version 2.6.4) with 3 physical Kafka machines.

We want to know what values of -Xmx and -Xms we need to allocate on the Kafka machines.

In order to set -Xmx and -Xms on a Kafka machine we need to configure the script:

/usr/hdp/2.6.4/kafka/bin/kafka-server-start

For now the default values are -Xmx1G and -Xms1G:

if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
    export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
fi
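A minimal sketch of how that guard behaves (assuming a POSIX shell, and a hypothetical 4G override that is only an example, not a recommendation): the 1G defaults apply only when KAFKA_HEAP_OPTS is empty, so exporting the variable before the start script runs overrides them.

```shell
# Hypothetical override set before kafka-server-start is sourced/run.
export KAFKA_HEAP_OPTS="-Xmx4G -Xms4G"

# The guard from the start script: applies 1G defaults only if unset/empty.
if [ "x$KAFKA_HEAP_OPTS" = "x" ]; then
  export KAFKA_HEAP_OPTS="-Xmx1G -Xms1G"
fi

echo "$KAFKA_HEAP_OPTS"   # prints -Xmx4G -Xms4G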

[root@kafka01 ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:            251          17          11           0         222         233
Swap:            15           6           9


Each Kafka machine has 256G of RAM.

We found this link: https://stackoverflow.com/questions/4667483/how-is-the-default-java-heap-size-determined

According to that Stack Overflow answer, we can use this command:

[root@kafka01 ~]# java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit                         = 0                                   {product}
    uintx HeapSizePerGCThread                       = 87241520                            {product}
    uintx InitialHeapSize                          := 2147483648                          {product}
    uintx LargePageHeapSizeThreshold                = 134217728                           {product}
    uintx MaxHeapSize                              := 32210157568                         {product}
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)

The values are in bytes, and converting them to giga gives:

Xmx ≈ 32G (32,210,157,568 bytes, which is ~30 GiB)

Xms = 2G
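The conversion can be checked in the shell. Note the two readings: dividing by 10^9 gives the "32" figure, while the JVM's -Xmx/-Xms flags use binary units (dividing by 1024^3 gives roughly 30 GiB):

```shell
# InitialHeapSize and MaxHeapSize from -XX:+PrintFlagsFinal are in bytes.
echo $((2147483648  / 1024 / 1024 / 1024))   # prints 2  (GiB, the ergonomic -Xms)
echo $((32210157568 / 1024 / 1024 / 1024))   # prints 29 (floored; ~30 GiB, ~32 decimal GB)
```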

What is Hortonworks' recommendation for -Xmx and -Xms?

Does Hortonworks accept the method - java -XX:+PrintFlagsFinal -version | grep HeapSize -

in order to get the right values of -Xmx and -Xms?

Michael-Bronson

Master Mentor

@Michael Bronson

The command that you are using just shows the default values that the JVM has picked and started with:

[root@kafka01 ~]# java -XX:+PrintFlagsFinal -version | grep HeapSize

.

However, the 32 GB you are seeing looks strange, because the JVM does not start with such a huge value by default unless you have some environment variable defined globally, like "_JAVA_OPTIONS" or "JAVA_OPTIONS".

So please check the output of the same command after unsetting those global variables:

[root@kafka01 ~]# unset _JAVA_OPTIONS
[root@kafka01 ~]# unset JAVA_OPTIONS
[root@kafka01 ~]# java -XX:+PrintFlagsFinal -version | grep HeapSize

.


Hi Jay,

We get the same values (after the unset):

 java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit                         = 0                                   {product}
    uintx HeapSizePerGCThread                       = 87241520                            {product}
    uintx InitialHeapSize                          := 2147483648                          {product}
    uintx LargePageHeapSizeThreshold                = 134217728                           {product}
    uintx MaxHeapSize                              := 32210157568                         {product}
openjdk version "1.8.0_65"
OpenJDK Runtime Environment (build 1.8.0_65-b17)
OpenJDK 64-Bit Server VM (build 25.65-b01, mixed mode)
Michael-Bronson


@Jay, anyway, what is your personal recommendation for -Xmx and -Xms values (according to the Kafka machine memory size)?

Michael-Bronson


@Jay I ask all these questions because it seems that the default of 1G isn't enough; in the past we had critical errors like the following from kafka.err (this was a very big problem and caused restarts of the Kafka broker):

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-network-thread-1002-PLAINTEXT-2"
Exception in thread "ExpirationReaper-1002" Exception in thread "ExpirationReaper-1002" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
Michael-Bronson

Master Mentor

@Michael Bronson

Kafka relies heavily on the filesystem for storing and caching messages. All data is immediately written to a persistent log on the filesystem without necessarily flushing to disk. Kafka uses heap space very carefully and does not require setting heap sizes more than 5GB. Kafka uses page cache memory as a buffer for active writers and readers, so after you specify JVM size (using -Xmx and -Xms Java options), leave the remaining RAM available to the operating system for page caching.
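As a rough illustration of that split on the 251G machines above (a sketch only; the 5G figure is the upper end of the heap guidance, not a tuned value):

```shell
# Heap vs. page-cache budget on one broker; figures from `free -g` above.
total_gb=251   # physical RAM reported by free -g
heap_gb=5      # upper end of the heap guidance above
echo "left for OS page cache: $((total_gb - heap_gb)) GB"   # prints 246
```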

Set the "KAFKA_HEAP_OPTS" option inside "Advanced kafka-env" to a value larger than the 1GB default, and then restart Kafka:
Ambari UI --> Configs --> Advanced --> "Advanced kafka-env" --> kafka-env template

# Set KAFKA specific environment variables here.
export KAFKA_HEAP_OPTS="$KAFKA_HEAP_OPTS -Xms3g -Xmx3g"

Please note that -Xms3g and -Xmx3g are just values I picked. You can increase them a bit more based on your requirements / GC log analysis.


Then restart the Kafka brokers.

Verify whether it has picked up the correct settings as follows:

# ps -ef | grep -i kafka
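A small sketch of what to look for in that output; the sample string below stands in for a real broker line from `ps -ef`, and `grep -o` isolates just the heap flags:

```shell
# Pull the -Xms/-Xmx flags out of a broker command line.
# The sample line mimics `ps -ef | grep -i kafka` output.
line='kafka 58614 1 99 11:49 ? 00:00:44 /usr/jdk64/jdk1.8.0_112/bin/java -Xms3g -Xmx3g kafka.Kafka'
echo "$line" | grep -o -- '-Xm[sx][0-9]*[gGmM]'   # prints -Xms3g and -Xmx3g, one per line
```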

.

Master Mentor

In addition to the above comment, there are very good articles available on Kafka best practices & tuning that you might want to refer to:

1. https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html

2. https://community.hortonworks.com/articles/80813/kafka-best-practices-2.html


@Jay why not update the script /usr/hdp/2.6.4/kafka/bin/kafka-server-start on each Kafka machine and change -Xmx and -Xms there, instead of adding the new -Xmx and -Xms inside Ambari in the kafka-env template?

Michael-Bronson

Master Mentor
@Michael Bronson

Regarding your query: why not update the script /usr/hdp/2.6.4/kafka/bin/kafka-server-start on each Kafka machine?

>>> Editing the .sh file on all broker hosts is also possible, but it is error prone and requires manual effort on all the hosts. Ambari provides a better option.

You have both options. Ambari provides a better, centralized way to control the configuration and also manages the config history.


@Jay after I update the lines in Ambari I get:

ps -ef | grep kafka
kafka     58614      1 99 11:49 ?        00:00:44 /usr/jdk64/jdk1.8.0_112/bin/java -Xms3g -Xmx3g -Xms3g -Xmx3g -Xms3g -Xmx3g

instead of:

ps -ef | grep kafka
kafka     58614      1 99 11:49 ?        00:00:44 /usr/jdk64/jdk1.8.0_112/bin/java -Xms3g -Xmx3g
Michael-Bronson
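The repetition most likely comes from the self-referencing export (KAFKA_HEAP_OPTS="$KAFKA_HEAP_OPTS -Xms3g -Xmx3g") being re-evaluated each time kafka-env is sourced. In HotSpot the last -Xms/-Xmx on the command line normally wins, so the duplicates should be harmless, but a plain assignment keeps the command line clean. A sketch of the difference:

```shell
# Simulate kafka-env being sourced three times.
KAFKA_HEAP_OPTS=""
for i in 1 2 3; do
  KAFKA_HEAP_OPTS="$KAFKA_HEAP_OPTS -Xms3g -Xmx3g"   # self-referencing: duplicates flags
done
echo "append:    $KAFKA_HEAP_OPTS"

for i in 1 2 3; do
  KAFKA_HEAP_OPTS="-Xms3g -Xmx3g"                    # plain assignment: stays clean
done
echo "overwrite: $KAFKA_HEAP_OPTS"
```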