Master Guru

After a rolling or express upgrade of HDP from 2.2.x to 2.4 (and, I'm told, also to 2.3.2 and 2.3.4), you may run into some issues with Kafka, as I did. In my case, HDP was upgraded from HDP-2.2.6 (Kafka-0.8.1) to HDP-2.4 (Kafka-0.9.0) using Ambari-2.2.1.1. Here are 4 items to check and fix if needed.

  • After the upgrade and Kafka restart, broker IDs are set to 1001, 1002, ... However, topics created before the upgrade still point to brokers numbered 0, 1, ... For example (k81c was created before the upgrade, k9a after):
$ ./kafka-topics.sh --zookeeper zk1.example.com:2181 --describe --topic k81c                
Topic:k81c     PartitionCount:2        ReplicationFactor:2     Configs:
        Topic: k81c    Partition: 0    Leader: 0       Replicas: 0,1   Isr: 0
        Topic: k81c    Partition: 1    Leader: 0       Replicas: 1,0   Isr: 0
$ ./kafka-topics.sh --zookeeper zk1.example.com:2181 --describe --topic k9a
Topic:k9a       PartitionCount:2        ReplicationFactor:1     Configs:
        Topic: k9a      Partition: 0    Leader: 1002    Replicas: 1002  Isr: 1002
        Topic: k9a      Partition: 1    Leader: 1001    Replicas: 1001  Isr: 1001

Newly created topics work, but old ones don't. A solution that worked for me was to change broker.id in the newly created kafka-logs/meta.properties back to the old values. This has to be done on all volumes of all brokers. If your Kafka log volumes are, for example, /data-01, ..., /data-06, you can change them by running:

$ sed -i 's/^broker\.id=1001/broker.id=0/' /data-*/kafka-logs/meta.properties    # on each broker, use that broker's own new ID (1001, 1002, ...) and old ID (0, 1, ...)
$ grep broker.id /data-*/kafka-logs/meta.properties        # to confirm they changed

It's a good idea to note the original broker IDs before the upgrade. They can be found in /etc/kafka/conf/server.properties.
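
Since each broker needs its own old ID, a small wrapper may be less error-prone than editing the substitution by hand on every node. The sketch below is only an illustration: set_broker_id is a hypothetical helper name, not something shipped with Kafka or HDP, and it assumes meta.properties holds a single broker.id=<number> line.

```shell
# set_broker_id is a hypothetical helper, not part of Kafka or HDP.
# Usage: set_broker_id <pre-upgrade-id> <meta.properties file>...
set_broker_id() {
    old_id=$1
    shift
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping $f: not a regular file" >&2
            continue
        fi
        tmp=$(mktemp) || return 1
        # Rewrite only the broker.id line; every other line passes through.
        sed "s/^broker\.id=.*/broker.id=$old_id/" "$f" > "$tmp"
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp" > "$f"
        rm -f "$tmp"
        # Show the resulting line so the change can be confirmed.
        echo "$f: $(grep '^broker\.id=' "$f")"
    done
}

# Example, on the broker whose pre-upgrade ID was 0:
# set_broker_id 0 /data-*/kafka-logs/meta.properties
```

The pre-upgrade ID passed in is the one recorded from /etc/kafka/conf/server.properties before the upgrade.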

  • If you are running Kafka on a custom port other than the default 6667, make sure the "listeners" property is set to your port. In the new version, the "port" property is deprecated. If your port is, for example, 9092, set "listeners" to "PLAINTEXT://localhost:9092".
  • If Kafka is installed on dedicated nodes that run only Kafka (no DataNode) within a full-scale cluster that includes HDFS, you may get an error saying that hadoop-client/conf cannot be found. This is possibly a bug in the old Ambari used to install the original HDP and Kafka, as I found /etc/hadoop/conf on those broker nodes. A solution that worked for me was to create the hadoop-client/conf structure by running the script below. By the way, I didn't get this error on Kafka brokers running in a stand-alone cluster without HDFS.
hdpver=2.4.0.0-169        # set your HDP target version
mkdir -p /usr/hdp/$hdpver/hadoop
mkdir -p /etc/hadoop/$hdpver/0
ln -s /etc/hadoop/$hdpver/0 /usr/hdp/$hdpver/hadoop/conf
hdp-select set hadoop-client $hdpver   # /usr/hdp/current/hadoop-client -> /usr/hdp/$hdpver/hadoop
cp /etc/hadoop/conf/* /etc/hadoop/$hdpver/0       # copy conf files from the previous location
ln -sfn /usr/hdp/current/hadoop-client/conf /etc/hadoop/conf    # update /etc/hadoop/conf symlink 
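
To check that the hadoop-client/conf structure resolves as intended, something like the following could be used. It is a minimal sketch: check_conf_dir is a hypothetical helper name, and it only tests that a path resolves to an existing directory, not that the files inside are valid.

```shell
# check_conf_dir is a hypothetical helper: it reports whether a path
# resolves (through any symlinks) to an existing directory.
check_conf_dir() {
    if [ -d "$1" ]; then
        echo "OK      $1 -> $(readlink -f "$1")"
    else
        echo "MISSING $1"
        return 1
    fi
}

# Example, on each broker node:
# check_conf_dir /usr/hdp/current/hadoop-client/conf
# check_conf_dir /etc/hadoop/conf
```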
  • If your Kafka is not kerberized, some Kafka scripts located in /usr/hdp/current/kafka-broker/bin/ won't work. To fix them, comment out the Kerberos-related commands in them on all brokers:
sed -i '/^export KAFKA_CLIENT_KERBEROS_PARAMS/s/^/# /' /usr/hdp/current/kafka-broker/bin/*.sh
grep "export KAFKA_CLIENT_KERBEROS" /usr/hdp/current/kafka-broker/bin/*.sh       # to confirm
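
For the custom-port item above, a quick way to see which port(s) the broker configuration actually ends up with is to pull them out of the "listeners" line. This is only a sketch: listener_ports is a hypothetical helper name, and it assumes the property is written as listeners=PROTOCOL://host:port, with multiple entries separated by commas.

```shell
# listener_ports is a hypothetical helper: it prints the port of every
# entry in the "listeners" property of a Kafka properties file.
listener_ports() {
    grep '^listeners=' "$1" |
        cut -d= -f2- |      # keep only the value after "listeners="
        tr ',' '\n' |       # one listener per line
        sed 's/.*://'       # strip everything up to the last colon
}

# Example:
# listener_ports /etc/kafka/conf/server.properties
```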
Comments
New Contributor

Hi,

Did you see any issues with Kafka failing to read Snappy compressed messages?

I'm seeing lots of this since the upgrade..

ERROR [Replica Manager on Broker 1001]: Error processing append operation on partition [testTopic,1] (kafka.server.ReplicaManager)
kafka.common.KafkaException:
        at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:94)
        at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:64)
        at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
        at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
...
Caused by: java.io.IOException: failed to read chunk
        at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:416)
        at org.xerial.snappy.SnappyInputStream.rawRead(SnappyInputStream.java:182)
        at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:163)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readLong(DataInputStream.java:416)
        at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:72)

Master Guru

Hi @Stephen Redmond, sorry I missed your comment. No, haven't done tests with compression. I'll let you know if I find something. Also, you can file a question on HCC, copying your comment, to get wider attention. Tnx.