Member since: 05-07-2018
Posts: 331
Kudos Received: 45
Solutions: 35
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 6591 | 09-12-2018 10:09 PM |
 | 2597 | 09-10-2018 02:07 PM |
 | 8871 | 09-08-2018 05:47 AM |
 | 2882 | 09-08-2018 12:05 AM |
 | 3875 | 08-15-2018 10:44 PM |
06-13-2018
04:43 PM
Hey @Karthik Chandrashekhar! Hm, but do these blocks belong to the path set in your dfs.datanode.data.dir parameter? If so, they should be counted as DFS, not non-DFS. AFAIK, any data outside of HDFS that is written on the same mount as the dfs.datanode.data.dir path is considered non-DFS. If these blocks don't belong to your DFS (i.e. they are non-DFS) and they're in the same path as your dfs.datanode.data.dir value, then we might have an issue there 😞 Btw, could you check your mount points as well? Hope this helps!
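To make the DFS vs. non-DFS check above a bit more concrete, here is a minimal Python sketch that tests whether a given file path sits under one of the directories listed in dfs.datanode.data.dir. The function name is hypothetical, and the sketch assumes the property value is a plain comma-separated list of absolute paths (no file:// URIs or storage-type prefixes).

```python
import posixpath


def is_under_data_dir(path, data_dirs_value):
    """Return True if `path` lies inside one of the comma-separated data dirs.

    Assumption: `data_dirs_value` is a plain comma-separated list of absolute
    paths, as in a simple dfs.datanode.data.dir setting.
    """
    target = posixpath.normpath(path)
    for raw in data_dirs_value.split(","):
        raw = raw.strip()
        if not raw:
            continue  # ignore empty entries such as a trailing comma
        data_dir = posixpath.normpath(raw)
        # The path is inside data_dir when their common prefix is data_dir itself.
        if posixpath.commonpath([data_dir, target]) == data_dir:
            return True
    return False


data_dirs = "/hadoop/hdfs/data,/grid/1/hdfs/data"
print(is_under_data_dir("/hadoop/hdfs/data/current/BP-1/blk_1", data_dirs))
print(is_under_data_dir("/hadoop/hdfs/other/file.log", data_dirs))
```

A path that lives on the same mount but fails this check would be a candidate for non-DFS usage.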
06-13-2018
03:36 PM
Hi @Marc Vázquez! Could you check which command the Confluent stack runs for ZK?
[root@node2 ~]# ps -ef | grep -i zookeeper
1001 3802 1 0 Jun12 ? 00:00:56 /usr/jdk64/jdk1.8.0_112/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.log.file=zookeeper-zookeeper-server-node2.log -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp #Thousands of libs... -Xmx1024m -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.local.only=false org.apache.zookeeper.server.quorum.QuorumPeerMain /usr/hdf/current/zookeeper-server/conf/zoo.cfg
PS: did you set up this cluster using confluent.sh? If so, I looked into their code and there should be a directory for the ZK logs 😞
https://github.com/confluentinc/confluent-cli/blob/master/src/oss/confluent.sh#L414
Or, if you prefer, you can try to set it manually. Here's my example of log4j.properties for ZK:
[root@node2 ~]# cat /etc/zookeeper/conf/log4j.properties
#
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#
#
#
#
# ZooKeeper Logging Configuration
#
# DEFAULT: console appender only
log4j.rootLogger=INFO, CONSOLE, ROLLINGFILE
# Example with rolling log file
#log4j.rootLogger=DEBUG, CONSOLE, ROLLINGFILE
# Example with rolling log file and tracing
#log4j.rootLogger=TRACE, CONSOLE, ROLLINGFILE, TRACEFILE
#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=INFO
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
#
# Add ROLLINGFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=DEBUG
log4j.appender.ROLLINGFILE.File=/var/log/zookeeper/zookeeper.log
# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# uncomment the next line to limit number of backup files
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
#
# Add TRACEFILE to rootLogger to get log file output
# Log DEBUG level and above messages to a log file
log4j.appender.TRACEFILE=org.apache.log4j.FileAppender
log4j.appender.TRACEFILE.Threshold=TRACE
log4j.appender.TRACEFILE.File=zookeeper_trace.log
log4j.appender.TRACEFILE.layout=org.apache.log4j.PatternLayout
### Notice we are including log4j's NDC here (%x)
log4j.appender.TRACEFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L][%x] - %m%n
And don't forget to fill in zookeeper-env.sh:
[root@node2 ~]# cat /etc/zookeeper/conf/zookeeper-env.sh
export JAVA_HOME=/usr/jdk64/jdk1.8.0_112
export ZOOKEEPER_HOME=/usr/hdf/current/zookeeper-server
export ZOO_LOG_DIR=/var/log/zookeeper
export ZOOPIDFILE=/var/run/zookeeper/zookeeper_server.pid
export SERVER_JVMFLAGS=-Xmx1024m
export JAVA=$JAVA_HOME/bin/java
export CLASSPATH=$CLASSPATH:/usr/share/zookeeper/*
Hope this helps!
06-12-2018
05:21 PM
Hey @Karthik Chandrashekhar! I'm not sure I understood you correctly, but my advice would be not to delete these files. They belong to the HDFS DataNode: the blk_* entries hold blocks plus their metadata (.meta), i.e. the data stored in HDFS. If you want to know which file a block belongs to, you can use the following commands:
[hdfs@node2 ~]$ cd /hadoop/hdfs/data/current/BP-686380642-172.25.33.129-1527546468579/current/finalized/subdir0/subdir0/
[hdfs@node2 subdir0]$ ls | head -2
blk_1073741825
blk_1073741825_1001.meta
[hdfs@node2 ~]$ hdfs fsck / -files -locations -blocks -blockId blk_1073741825
Connecting to namenode via http://node3:50070/fsck?ugi=hdfs&files=1&locations=1&blocks=1&blockId=blk_1073741825+&path=%2F
FSCK started by hdfs (auth:SIMPLE) from /MYIP at Tue Jun 12 14:54:08 UTC 2018
Block Id: blk_1073741825
Block belongs to: /hdp/apps/2.6.4.0-91/mapreduce/mapreduce.tar.gz
No. of Expected Replica: 3
No. of live Replica: 3
No. of excess Replica: 0
No. of stale Replica: 0
No. of decommissioned Replica: 0
No. of decommissioning Replica: 0
No. of corrupted Replica: 0
Block replica on datanode/rack: node2/default-rack is HEALTHY
Block replica on datanode/rack: node3/default-rack is HEALTHY
Block replica on datanode/rack: node4/default-rack is HEALTHY
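If you need to run this check for many blocks, the fsck report can be parsed by a small script. Below is a minimal Python sketch that extracts the block ID, the owning file and the replica health from text shaped like the output above. The function name is hypothetical, and the line prefixes are taken only from this sample; other Hadoop versions may format the report differently.

```python
import re

# Matches lines such as:
#   Block replica on datanode/rack: node2/default-rack is HEALTHY
REPLICA_LINE = re.compile(r"Block replica on datanode/rack: (\S+) is (\S+)")


def parse_fsck_block_report(text):
    """Extract block id, owning file and replica states from fsck -blockId output."""
    report = {"block_id": None, "file": None, "replicas": {}}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Block Id:"):
            report["block_id"] = line.split(":", 1)[1].strip()
        elif line.startswith("Block belongs to:"):
            report["file"] = line.split(":", 1)[1].strip()
        else:
            match = REPLICA_LINE.match(line)
            if match:
                report["replicas"][match.group(1)] = match.group(2)
    return report


sample = """\
Block Id: blk_1073741825
Block belongs to: /hdp/apps/2.6.4.0-91/mapreduce/mapreduce.tar.gz
No. of Expected Replica: 3
Block replica on datanode/rack: node2/default-rack is HEALTHY
Block replica on datanode/rack: node3/default-rack is HEALTHY
Block replica on datanode/rack: node4/default-rack is HEALTHY
"""
print(parse_fsck_block_report(sample))
```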
Hope this helps! 🙂
06-12-2018
03:14 PM
Hey @pradeep arumalla! I'm not a specialist in coding or Spark, but did you try replacing groupByKey with reduceByKey (on the last line)? And regarding --num-executors, how are you launching your job? Is it via spark-submit? Could you share the command with us? BTW, here are some links about shuffling 🙂 https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-rdd-shuffle.html https://0x0fff.com/spark-architecture-shuffle/ Hope this helps!
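To illustrate why the groupByKey to reduceByKey swap can matter, here is a rough simulation in plain Python. It is not Spark code and every name in it is made up for illustration. It only mimics the idea that groupByKey sends every record through the shuffle, while reduceByKey combines values inside each partition first and shuffles only the partial results.

```python
from collections import defaultdict


def shuffle_group_by_key(partitions):
    """groupByKey-style: every (key, value) record crosses the shuffle."""
    shuffled = [record for part in partitions for record in part]
    grouped = defaultdict(list)
    for key, value in shuffled:
        grouped[key].append(value)
    totals = {key: sum(values) for key, values in grouped.items()}
    return totals, len(shuffled)


def shuffle_reduce_by_key(partitions):
    """reduceByKey-style: combine per partition, then shuffle the partial sums."""
    shuffled = []
    for part in partitions:
        partial = defaultdict(int)
        for key, value in part:
            partial[key] += value
        shuffled.extend(partial.items())
    totals = defaultdict(int)
    for key, value in shuffled:
        totals[key] += value
    return dict(totals), len(shuffled)


# Two hypothetical input partitions of (key, 1) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1)],
    [("a", 1), ("a", 1), ("b", 1)],
]
print(shuffle_group_by_key(partitions))
print(shuffle_reduce_by_key(partitions))
```

Both functions produce the same totals, but the second one moves fewer records between partitions, and the gap grows with the number of repeated keys per partition.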
06-12-2018
02:36 PM
Hey @Marc Vázquez! Usually you will find a file named log4j.properties, and it should contain a parameter like log4j.appender.ROLLINGFILE.File=<path>/zookeeper.log.
[root@node1 conf]# cat /etc/zookeeper/conf/log4j.properties | grep -i zookeeper.log
# ZooKeeper Logging Configuration
log4j.appender.ROLLINGFILE.File=/var/log/zookeeper/zookeeper.log
In the Confluent stack I'm not sure about the path, but it should be something like confluent_dss_version/etc/zookeeper/conf/log4j.properties. Or you can try to search for it:
find / -name "zookeeper" -type d -exec ls -ltrah {} \;
Hope this helps! 🙂
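Once you have located a log4j.properties file, a few lines of Python can list every appender that writes to a file. This is only a sketch: the function name is hypothetical, it handles simple key=value lines, and any ${...} placeholder in a value is returned unexpanded.

```python
def appender_files(properties_text):
    """Map log4j appender names to the value of their .File property."""
    prefix, suffix = "log4j.appender.", ".File"
    files = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if key.startswith(prefix) and key.endswith(suffix):
            name = key[len(prefix):-len(suffix)]
            files[name] = value.strip()
    return files


sample_properties = """\
# ZooKeeper Logging Configuration
log4j.rootLogger=INFO, CONSOLE, ROLLINGFILE
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.File=/var/log/zookeeper/zookeeper.log
#log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.TRACEFILE.File=zookeeper_trace.log
"""
print(appender_files(sample_properties))
```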
06-12-2018
05:50 AM
Hey @Rahul Kumar. Just asking, but after you stopped Kafka/ZooKeeper, did you try to produce and consume messages again? For example, if you only ran kafka-console-consumer after 7 days, you probably won't be able to see those messages on that topic anymore, because Kafka retains messages only for a set period of time: log.retention.hours, which is 168 hours (7 days) by default (you can change it). But if you did the whole process again (create a topic, kafka-console-producer and kafka-console-consumer) after the Kafka cluster was down, then we may need to take a look at the errors in the Kafka/ZK logs and check the consumer groups/offsets. Hope this helps!
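As a quick sanity check for the retention point above, the arithmetic looks like this in Python. The function name and the sample timestamps are made up for illustration; also keep in mind that Kafka applies retention to whole log segments, so the real cut-off is only approximate.

```python
from datetime import datetime, timedelta


def is_past_retention(message_time, now, retention_hours=168):
    """Return True if a message is older than the retention window.

    168 hours is the default value of log.retention.hours.
    """
    return now - message_time > timedelta(hours=retention_hours)


now = datetime(2018, 6, 12, 5, 50)
print(is_past_retention(now - timedelta(days=8), now))  # older than 7 days
print(is_past_retention(now - timedelta(days=6), now))  # still inside the window
```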
06-12-2018
05:19 AM
Hey @JAy PaTel! I see. Could you share the output of the following command? Note that scp takes the port with an uppercase -P; lowercase -p only preserves file times and modes.
C:\InputFileWindows>scp -v -P 2222 datafile.txt root@localhost:/
Thanks!
06-11-2018
05:45 PM
Hey @Rahul Kumar! What is log.retention.hours set to? And could you check whether your kafka-console-consumer is creating a consumer group?
[root@node1 ~]# kafka-consumer-groups.sh --zookeeper $ZKENSEMBLE --list
#Or
[root@node1 ~]# kafka-consumer-groups.sh --bootstrap-server node1:6667 --list
#If it shows something, try to describe it with --group <consumer-group-mynumber> and --describe
#And we can also check the offsets of your topic.
[root@node1 ~]# kafka-run-class.sh kafka.tools.ExportZkOffsets --zkconnect $ZKENSEMBLE --output-file zk_offset_kafka
[root@node1 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list node1:6667 --topic vin-hcc-nifi --time -1 #latest
vin-hcc-nifi:2:1
vin-hcc-nifi:1:1
vin-hcc-nifi:0:1
[root@node1 ~]# kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list node1:6667 --topic vin-hcc-nifi --time -2 #earliest
vin-hcc-nifi:2:0
vin-hcc-nifi:1:0
vin-hcc-nifi:0:0
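The two GetOffsetShell outputs above can be compared to estimate how many messages each partition still holds (latest offset minus earliest offset). Here is a small Python sketch of that subtraction. The function names are hypothetical, and the input format topic:partition:offset is taken from the output shown above.

```python
def parse_offsets(output):
    """Parse 'topic:partition:offset' lines into {(topic, partition): offset}."""
    offsets = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so a topic name containing ':' stays intact.
        topic, partition, offset = line.rsplit(":", 2)
        offsets[(topic, int(partition))] = int(offset)
    return offsets


def retained_messages(latest_output, earliest_output):
    """Latest minus earliest offset for every partition in the latest output."""
    latest = parse_offsets(latest_output)
    earliest = parse_offsets(earliest_output)
    return {tp: latest[tp] - earliest.get(tp, 0) for tp in latest}


latest_out = "vin-hcc-nifi:2:1\nvin-hcc-nifi:1:1\nvin-hcc-nifi:0:1\n"
earliest_out = "vin-hcc-nifi:2:0\nvin-hcc-nifi:1:0\nvin-hcc-nifi:0:0\n"
print(retained_messages(latest_out, earliest_out))
```

If latest and earliest are equal for a partition, there is nothing left to consume from it, which usually points to retention having removed the data or to nothing having been produced.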
Hope this helps!
06-11-2018
05:17 PM
Hey @JAy PaTel! Did you try to connect using port 2222? If that doesn't work, could you add the -v (verbose) parameter to your scp command? Hope this helps!
06-11-2018
03:21 PM
Hey @Simran kaur! Could you check if you're able to connect to all the ZK ports? And what about your dataDir property (inside /etc/zookeeper/conf/zoo.cfg)? Could you check the permissions there?
clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=3000
dataDir=/hadoop/zookeeper
autopurge.snapRetainCount=30
server.1=mynode1:2888:3888
server.2=mynode2:2888:3888
server.3=mynode3:2888:3888
Hope this helps!
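For the port check, here is a minimal Python sketch that reads a zoo.cfg like the one above, builds the list of host/port pairs to try, and attempts a TCP connection to each. The function names are hypothetical and only the simple server.N=host:port:port form is handled. As far as I know, the first server port (2888 here) is only opened by the current leader, so a refused connection on that port of a follower is not necessarily a problem.

```python
import socket


def zk_endpoints(zoo_cfg_text):
    """Return (host, port) pairs for the client, peer and election ports."""
    client_port = None
    servers = []
    for line in zoo_cfg_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "clientPort":
            client_port = int(value)
        elif key.startswith("server."):
            host, peer_port, election_port = value.split(":")[:3]
            servers.append((host, int(peer_port), int(election_port)))
    endpoints = []
    for host, peer_port, election_port in servers:
        if client_port is not None:
            endpoints.append((host, client_port))
        endpoints.append((host, peer_port))
        endpoints.append((host, election_port))
    return endpoints


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


sample_cfg = """\
clientPort=2181
dataDir=/hadoop/zookeeper
server.1=mynode1:2888:3888
server.2=mynode2:2888:3888
server.3=mynode3:2888:3888
"""
for host, port in zk_endpoints(sample_cfg):
    print(host, port)  # call can_connect(host, port) here on a real cluster
```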