Member since: 06-28-2017
Posts: 279
Kudos Received: 43
Solutions: 24
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2020 | 12-24-2018 08:34 AM
 | 5400 | 12-24-2018 08:21 AM
 | 2252 | 08-23-2018 07:09 AM
 | 9813 | 08-21-2018 05:50 PM
 | 5192 | 08-20-2018 10:59 AM
05-14-2018
06:36 PM
There are always pairs of *.log and *.index files; I would delete/move both together.
05-14-2018
10:47 AM
Just to be sure: the logs do contain the messages from the Kafka topics. If you delete them, they are simply gone, which can break the principle of guaranteed delivery.

The logs I would delete (or better, save somewhere else) are the ones from the topic causing the issue, which seems to be /var/kafka/kafka-logs/wctpi.avro.pri.processed-59, and within this topic the log file /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/00000000000124356738.index

So I would first try to delete/move the files /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/00000000000124356738* and then try to restart the broker. If it works fine, there is nothing more to do. If you still have the issue, move all files below /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/ and start again. After this, Kafka should start again.
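The steps above can be sketched as shell commands. The block below demonstrates the move in scratch directories standing in for the real partition directory and a backup location (all paths here are stand-ins; on the real system, stop the broker first and use the actual /var/kafka/kafka-logs paths):

```shell
# Scratch dirs stand in for the real partition directory and a backup location.
LOG_DIR=$(mktemp -d)   # stand-in for /var/kafka/kafka-logs/wctpi.avro.pri.processed-59
BACKUP=$(mktemp -d)    # keep the files somewhere safe instead of deleting them

# A segment always comes as a .log/.index pair sharing the same offset prefix.
touch "$LOG_DIR/00000000000124356738.log" "$LOG_DIR/00000000000124356738.index"

# Step 1: move only the offending segment's files, both halves of the pair together.
mv "$LOG_DIR"/00000000000124356738* "$BACKUP/"

# Step 2 (only if the broker still fails to start): move the whole partition dir.
# mv "$LOG_DIR" "$BACKUP/wctpi.avro.pri.processed-59"
```

Moving instead of deleting means the data can be put back if the broker needs it after all.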
05-14-2018
10:41 AM
First check whether you have a *.index.swap file at all in the logs dir. If so, move it and start the broker.
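A quick way to do that check, sketched below against a scratch directory standing in for the real logs dir (the file name and paths are stand-ins for illustration):

```shell
LOG_DIR=$(mktemp -d)   # stand-in for your Kafka logs dir, e.g. /var/kafka/kafka-logs
touch "$LOG_DIR/00000000000124356738.index.swap"   # simulate a leftover swap file

# List any leftover *.index.swap files...
find "$LOG_DIR" -name "*.index.swap"

# ...and move them out of the way before restarting the broker.
mkdir -p "$LOG_DIR/quarantine"
find "$LOG_DIR" -maxdepth 1 -name "*.index.swap" -exec mv {} "$LOG_DIR/quarantine/" \;
```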
05-14-2018
10:27 AM
It is also mentioned here: https://issues.apache.org/jira/browse/KAFKA-4502 Perhaps this could be helpful for you; it states that a leftover *.index.swap file causes the issue.
05-14-2018
10:21 AM
I am sure the message is the reason why you can't start the broker, as it gets shut down again during startup. The docs state that this exception is "Thrown when the offset for a set of partitions is invalid (either undefined or out of range), and no reset policy has been configured." One solution is to shut down the broker and delete the log, but that causes data loss, so it may not be what you need. Some hints I found on resetting offsets (they always reset consumer offsets, not producer offsets, so maybe not helpful in your case): https://community.hortonworks.com/articles/81357/manually-resetting-offset-for-a-kafka-topic.html https://gist.github.com/marwei/cd40657c481f94ebe273ecc16601674b
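The linked hints use the kafka-consumer-groups tool (available since Kafka 0.11) to reset consumer offsets. A sketch of the invocation is below, with a hypothetical group name; the block only assembles and prints the command so you can review it before running it against your broker:

```shell
GROUP=my-consumer-group          # hypothetical consumer group name
TOPIC=wctpi.avro.pri.processed   # topic from the question
CMD="kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group $GROUP --topic $TOPIC --reset-offsets --to-earliest --execute"
# Print the command for review; drop --execute (or use --dry-run) to preview
# the new offsets without applying them.
echo "$CMD"
```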
05-09-2018
02:30 PM
A join itself is not implemented by HBase, see here: http://hbase.apache.org/0.94/book/joins.html or https://community.hortonworks.com/questions/29295/hbase-for-joins.html You'll have to consider alternatives; Spark with Scala is one of them. Normally you don't store tables in HBase in any normalized form.
05-05-2018
08:27 AM
1 Kudo
Is your cluster directly connected to the internet, so that any internet user can connect to your port 8088? And is your cluster not kerberized? There are regularly running campaigns that scan the internet for unprotected or vulnerable services, so it shouldn't be surprising that the attack hits several clusters almost simultaneously. There are even search engines available that will list all services reachable from the internet, so that one can search for 'give me all unprotected Hadoop machines'. If your cluster is unprotected, the only solution is to protect it: via a firewall, via Kerberos, etc.
05-04-2018
03:03 PM
Hive submits MapReduce jobs to process the files. So the sequence is quite clear: the MapReduce jobs get stopped, and as a result your Hive query fails as well. The error message indicates that the job was waiting for an I/O channel and got interrupted. What I can't say for sure is whether the MapReduce job got interrupted due to the replication taking place in your cluster or for some other reason.
05-04-2018
12:19 PM
1 Kudo
If you decommissioned 2 out of 3 data nodes, you only have one node left. In that case everything must execute on this single data node, which will have a performance impact. If you still have three nodes left and replication is ongoing for almost all your files, you will have massive network load, which for sure also impacts any queries. During this rebalancing you will experience slow responses. The relative impact is lower if you have 100 nodes and 2 get decommissioned, but it is still there.
05-02-2018
06:00 PM
1 Kudo
If you are using the command-line client, you can pass the argument '--from-beginning'; if you have written your own application, you can simply provide the offset as a parameter.
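For the command-line client, a minimal invocation might look like this (topic name and broker address are hypothetical; the block only assembles and prints the command):

```shell
TOPIC=my-topic   # hypothetical topic name
CMD="kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic $TOPIC --from-beginning"
echo "$CMD"   # --from-beginning re-reads the topic from the earliest available offset
```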