Member since: 06-28-2017
Posts: 279
Kudos Received: 43
Solutions: 24
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2020 | 12-24-2018 08:34 AM
 | 5400 | 12-24-2018 08:21 AM
 | 2252 | 08-23-2018 07:09 AM
 | 9813 | 08-21-2018 05:50 PM
 | 5192 | 08-20-2018 10:59 AM
05-14-2018
06:36 PM
There are always pairs of *.log and *.index files; I would delete/move both together.
05-14-2018
10:47 AM
Just to be sure: the logs do contain the messages from the Kafka topics. If you delete them, they are simply gone, which can break the principle of guaranteed delivery.

The logs I would delete (or better, save somewhere else) are the ones from the topic causing the issue, which seems to be /var/kafka/kafka-logs/wctpi.avro.pri.processed-59, and within this topic the log file /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/00000000000124356738.index

So I would first try to delete/move the files /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/00000000000124356738* and then try to restart the broker. If it works fine, there is nothing more to do. If you still have the issue, move all files below /var/kafka/kafka-logs/wctpi.avro.pri.processed-59/ and start again. After this, Kafka should start again.
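The steps above can be sketched as shell commands. The block below demonstrates the move in scratch directories standing in for the real partition directory and a backup location (all paths here are stand-ins; on the real system, stop the broker first and use the actual /var/kafka/kafka-logs paths):

```shell
# Scratch dirs stand in for the real partition directory and a backup location.
LOG_DIR=$(mktemp -d)   # stand-in for /var/kafka/kafka-logs/wctpi.avro.pri.processed-59
BACKUP=$(mktemp -d)    # keep the files somewhere safe instead of deleting them

# A segment always comes as a .log/.index pair sharing the same offset prefix.
touch "$LOG_DIR/00000000000124356738.log" "$LOG_DIR/00000000000124356738.index"

# Step 1: move only the offending segment's files, both halves of the pair together.
mv "$LOG_DIR"/00000000000124356738* "$BACKUP/"

# Step 2 (only if the broker still fails to start): move the whole partition dir.
# mv "$LOG_DIR" "$BACKUP/wctpi.avro.pri.processed-59"
```

Moving instead of deleting means the data can be put back if the broker needs it after all.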
05-14-2018
10:41 AM
First check whether you have a *.index.swap file at all in the logs dir. If so, move it and start the broker.
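A quick way to do that check, sketched below against a scratch directory standing in for the real logs dir (the file name and paths are stand-ins for illustration):

```shell
LOG_DIR=$(mktemp -d)   # stand-in for your Kafka logs dir, e.g. /var/kafka/kafka-logs
touch "$LOG_DIR/00000000000124356738.index.swap"   # simulate a leftover swap file

# List any leftover *.index.swap files...
find "$LOG_DIR" -name "*.index.swap"

# ...and move them out of the way before restarting the broker.
mkdir -p "$LOG_DIR/quarantine"
find "$LOG_DIR" -maxdepth 1 -name "*.index.swap" -exec mv {} "$LOG_DIR/quarantine/" \;
```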
05-14-2018
10:27 AM
It is also mentioned here: https://issues.apache.org/jira/browse/KAFKA-4502 Perhaps this could be helpful for you; it states that a leftover *.index.swap file causes the issue.
05-14-2018
10:21 AM
I am sure the message is the reason why you can't start the broker, as it gets shut down again during startup. The docs state that this exception is "Thrown when the offset for a set of partitions is invalid (either undefined or out of range), and no reset policy has been configured." One solution is to shut down the broker and delete the log, but that causes data loss, so it may not be what you need. Some hints I found on resetting offsets (they always reset consumer offsets, not producer offsets, so maybe not helpful in your case): https://community.hortonworks.com/articles/81357/manually-resetting-offset-for-a-kafka-topic.html https://gist.github.com/marwei/cd40657c481f94ebe273ecc16601674b
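The linked hints use the kafka-consumer-groups tool (available since Kafka 0.11) to reset consumer offsets. A sketch of the invocation is below, with a hypothetical group name; the block only assembles and prints the command so you can review it before running it against your broker:

```shell
GROUP=my-consumer-group          # hypothetical consumer group name
TOPIC=wctpi.avro.pri.processed   # topic from the question
CMD="kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group $GROUP --topic $TOPIC --reset-offsets --to-earliest --execute"
# Print the command for review; drop --execute (or use --dry-run) to preview
# the new offsets without applying them.
echo "$CMD"
```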
05-09-2018
02:30 PM
A join itself is not implemented by HBase, see here: http://hbase.apache.org/0.94/book/joins.html or https://community.hortonworks.com/questions/29295/hbase-for-joins.html You'll have to consider alternatives; Spark with Scala is one of them. Normally you don't store tables in HBase in any normalized form.
05-05-2018
08:27 AM
1 Kudo
Is your cluster directly connected to the internet, so that any internet user can connect to your port 8088? And is your cluster not kerberized? There are regularly running campaigns that scan the internet for unprotected or vulnerable services, so it shouldn't be surprising that the attack hits several clusters almost simultaneously. There are even search engines available that will list all services reachable from the internet, so that one can search for 'give me all unprotected Hadoop machines'. If your cluster is unprotected, the only solution is to protect it: via a firewall, via Kerberos, etc.
05-04-2018
03:03 PM
Hive submits MapReduce jobs to process the files. So the sequence is quite clear: the MapReduce jobs get stopped, and as a result your Hive query fails as well. The error message indicates that the job was waiting for an I/O channel and got interrupted. What I can't say for sure is whether the MapReduce job got interrupted due to the replication taking place in your cluster or for some other reason.
05-04-2018
12:19 PM
1 Kudo
If you decommissioned 2 out of 3 data nodes, you only have one node left. In that case everything must execute on this single data node, which will have a performance impact. If you still have three nodes left and replication is ongoing for almost all your files, you will have massive network load, which for sure also impacts any queries. During this rebalancing you will experience slow responses. The relative impact is lower if you have 100 nodes and 2 get decommissioned, but it is still there.
05-02-2018
06:00 PM
1 Kudo
If you are using the command-line client, you can pass the argument '--from-beginning'; if you have written your own application, you can simply provide the offset as a parameter.
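For the command-line client, a minimal invocation might look like this (topic name and broker address are hypothetical; the block only assembles and prints the command):

```shell
TOPIC=my-topic   # hypothetical topic name
CMD="kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic $TOPIC --from-beginning"
echo "$CMD"   # --from-beginning re-reads the topic from the earliest available offset
```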