Member since: 08-08-2017
Posts: 1652
Kudos Received: 30
Solutions: 11
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2093 | 06-15-2020 05:23 AM |
| | 17446 | 01-30-2020 08:04 PM |
| | 2255 | 07-07-2019 09:06 PM |
| | 8725 | 01-27-2018 10:17 PM |
| | 4913 | 12-31-2017 10:12 PM |
05-11-2018
04:42 AM
@Jordan, regarding what you said, "increase the heap space allocated to the Kafka process": can you give an example of the parameter, so we can find it in the Ambari GUI? Second, yes, this Kafka broker is a standalone machine and does not share a host with ZooKeeper.
05-10-2018
04:57 PM
We have 3 Kafka brokers in our Hadoop cluster (managed by Ambari). One of the brokers (kafka01) can't start. Any suggestions for this situation? We have the following logs.
From /var/log/kafka/server.log:
FATAL Fatal error during KafkaServer shutdown. (kafka.server.KafkaServer)
java.lang.IllegalStateException: Kafka server is still starting up, cannot shut down!
    at kafka.server.KafkaServer.shutdown(KafkaServer.scala:576)
    at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:51)
    at kafka.Kafka$$anon$1.run(Kafka.scala:63)
[2018-05-10 14:23:57,032] FATAL Fatal error during KafkaServerStable shutdown. Prepare to halt (kafka.server.KafkaServerStartable)
java.lang.IllegalStateException: Kafka server is still starting up, cannot shut down!
    at kafka.server.KafkaServer.shutdown(KafkaServer.scala:576)
    at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:51)
    at kafka.Kafka$$anon$1.run(Kafka.scala:63)
[2018-05-10 14:23:59,867] INFO KafkaConfig values:
From /var/log/kafka/kafka.err:
Exception in thread "metrics-meter-tick-thread-3" java.lang.OutOfMemoryError: Java heap space
Exception in thread "metrics-meter-tick-thread-2" java.lang.OutOfMemoryError: Java heap space
Exception in thread "metrics-meter-tick-thread-5" java.lang.OutOfMemoryError: Java heap space
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "kafka-socket-acceptor-PLAINTEXT-6667"
Exception in thread "metrics-meter-tick-thread-4" java.lang.OutOfMemoryError: Java heap space
Exception in thread "metrics-meter-tick-thread-7" java.lang.OutOfMemoryError: Java heap space
Exception in thread "metrics-meter-tick-thread-6" java.lang.OutOfMemoryError: Java heap space
We do not get any output from the following command (the port isn't listening):
netstat -tnlpa | grep 6667
Labels:
- Apache Ambari
- Apache Hadoop
- Apache Kafka
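To get a quick picture of how widespread the heap exhaustion is, the kafka.err output quoted above can be summarized per thread. This is only a minimal sketch: the function names `count_oom` and `oom_threads` are hypothetical helpers, and it assumes each exception sits on a single line, as in the log excerpt.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for summarizing OutOfMemoryError entries in a Kafka
# error log. Assumes one exception per line, as in the excerpt above.

# Print the number of lines that mention java.lang.OutOfMemoryError.
count_oom() {
    grep -c 'java.lang.OutOfMemoryError' "$1"
}

# Print a per-thread count of OutOfMemoryError lines, most frequent first.
oom_threads() {
    grep 'java.lang.OutOfMemoryError' "$1" \
        | grep -o 'thread "[^"]*"' \
        | sort \
        | uniq -c \
        | sort -rn
}
```

On the broker this could be run as `count_oom /var/log/kafka/kafka.err` and `oom_threads /var/log/kafka/kafka.err`, using the log path quoted in the post.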
05-01-2018
02:35 PM
To summarize (this is a production system), the final steps are:
Step 1: hdfs fsck / -delete
Step 2: if step 1 does not fix the corrupted blocks, remove the file with:
hdfs fs -rm /localF/STRZONEZone/intercept_by_country/2018/4/10/16/2018_4_10_16_45.parquet/part-00003-8600d0e2-c6b6-49b7-89cd-ef2a2bc1dc5e.snappy.parquet
Is this correct?
04-30-2018
04:00 PM
We have an Ambari cluster with HDP version 2.6 (production system). To verify which files have corrupted blocks, we ran the following command:
hdfs fsck / | egrep -v '^\.+$' | grep -v replica | grep -v Replica
We get:
/localF/STRZONEZone/intercept_by_country/2018/4/10/16/2018_4_10_16_45.parquet/part-00003-8600d0e2-c6b6-49b7-89cd-ef2a2bc1dc5e.snappy.parquet: CORRUPT blockpool BP-338831142-28.12.45.6-1508451686931 block blk_1097240348
/localF/STRZONEZone/intercept_by_country/2018/4/10/16/2018_4_10_16_45.parquet/part-00003-8600d0e2-c6b6-49b7-89cd-ef2a2bc1dc5e.snappy.parquet: MISSING 1 blocks of total size 1192 B...........................................
/localF/STRZONEZone/intercept_by_type/2018/4/10/16/2018_4_10_16_45.parquet/part-00002-be0f80a9-2c7c-4c50-b18d-73be372acff.snappy.parquet: CORRUPT blockpool BP-338831142-28.12.45.6-1508451686931 block blk_1097240344
/localF/STRZONEZone/intercept_by_type/2018/4/10/16/2018_4_10_16_45.parquet/part-00002-be0f80a9-2c7c-4c50-b18d-73be372acff.snappy.parquet: MISSING 1 blocks of total size 1098 B...............................................
..................................Status: CORRUPT
Total size:7072689634566 B (Total open files size: 293676105509 B)
Total dirs:32330710
Total files:910568034
Total symlinks:0 (Files currently being written: 12)
Total blocks (validated):10183608 (avg. block size 6254517 B) (Total open file blocks (not validated): 2200)
********************************
UNDER MIN REPL'D BLOCKS:2 (1.9345605E-5 %)
CORRUPT FILES:2
MISSING BLOCKS:2
MISSING SIZE:2290 B
CORRUPT BLOCKS: 2
********************************
Corrupt blocks:2
Number of data-nodes:35
Number of racks:1
FSCK ended at Mon Apr 20 11:40:50 UTC 2018 in 241684 milliseconds
The filesystem under path '/' is CORRUPT
In this case we see CORRUPT FILES:2 and MISSING BLOCKS:2. What is the right action to take, or what is the solution for the corrupted blocks?
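Before deleting anything on a production system, it can help to pull just the affected file paths out of a saved fsck report. The sketch below assumes the report was redirected to a file and that corrupt entries look like the lines above (`<path>: CORRUPT blockpool ...`); the function name `corrupt_paths` is a hypothetical helper. As far as I know, `hdfs fsck / -list-corruptfileblocks` gives a similar list directly.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the unique file paths flagged CORRUPT in a saved
# "hdfs fsck" report. Assumes lines of the form
#   <path>: CORRUPT blockpool <pool-id> block <block-id>
corrupt_paths() {
    # Split on ": " so $1 is the path and $2 starts with the status text.
    # Summary lines such as "Status: CORRUPT" or "CORRUPT FILES:2" do not
    # match, because their second field does not start with "CORRUPT ".
    awk -F': ' '$2 ~ /^CORRUPT / { print $1 }' "$1" | sort -u
}
```

For example, `hdfs fsck / > /tmp/fsck.out` followed by `corrupt_paths /tmp/fsck.out` would print one path per corrupt file, which can then be reviewed before any delete step (the /tmp/fsck.out path is only an illustration).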
04-30-2018
10:01 AM
About "Force fsck for all other non-root partitions": in that case, as you explained, each reboot will trigger the fsck check on that partition. But can we schedule fsck for all other non-root partitions? (I mean not only at reboot; for example, we want to run fsck every month on the non-root partitions.)
04-29-2018
07:56 AM
Regarding what you said, "but using tune2fs to adjust the check schedule if appropriate, and forcing fsck to run when it is more convenient.": can you give an example of this configuration?
04-29-2018
06:19 AM
OK, I have another question. As you know, the /etc/fstab entry for a partition such as sdf controls whether fsck runs during reboot. For now, all machines are set without fsck during reboot, so do you recommend setting it to "1" in order to perform fsck during reboot?
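For context, the setting in question is the sixth field of each /etc/fstab line (fs_passno): 0 means the filesystem is not checked at boot, 1 is conventionally used for the root filesystem, and 2 for other filesystems. A minimal sketch for listing the entries that currently skip the boot-time check follows; the function name `fstab_without_fsck` is a hypothetical helper, not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<device> <mount point>" for every fstab entry
# whose sixth field (fs_passno) is 0 or missing, i.e. not checked at boot.
fstab_without_fsck() {
    local fstab="${1:-/etc/fstab}"
    # Skip comment lines, blank lines, and swap entries.
    awk '!/^[[:space:]]*#/ && NF >= 3 && $3 != "swap" && ($6 == "" || $6 == 0) {
             print $1, $2
         }' "$fstab"
}
```

Running `fstab_without_fsck` with no argument reads /etc/fstab. For ext filesystems, `tune2fs -l <device>` shows the current "Maximum mount count" and "Check interval" values, which is a useful cross-check before changing anything.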
04-25-2018
08:05 PM
Yes, we already did it on one of the disks; please see https://community.hortonworks.com/questions/189016/datanode-machine-worker-one-of-the-disks-have-file.html
04-25-2018
08:03 PM
I also want to add that we use a DNS server to manage all hosts in the cluster.
04-25-2018
08:01 PM
We ask about this service because we think the DataNode machines have issues when it is running: we see unexpected reboots on the worker machines and millions of messages about avahi-daemon in /var/log/messages. So let's narrow the question: can we safely disable this service on the worker machines only?