Member since
11-19-2015
158
Posts
25
Kudos Received
21
Solutions
My Accepted Solutions
Views | Posted |
---|---|
15411 | 09-01-2018 01:27 AM |
1978 | 09-01-2018 01:18 AM |
5989 | 08-20-2018 09:39 PM |
1017 | 07-20-2018 04:51 PM |
2635 | 07-16-2018 09:41 PM |
07-30-2018
06:53 PM
@Michael Bronson - The terms "master/worker" don't really mean anything in Kafka. 17 Kafka brokers seems like a lot (we have about that many brokers in AWS handling about 2 million messages per day), but yes, a minimum of 5 ZKs is encouraged to account for maintenance and hardware failure, as mentioned.
07-25-2018
06:58 PM
1 Kudo
There is no such rule for Kafka brokers. ZooKeeper needs to maintain a quorum: a majority of floor(n/2) + 1 machines (out of n) that agree on leader-election values and locks. That is why you run an odd total number of servers, to accommodate hardware and network failure scenarios. According to "Kafka: The Definitive Guide", as well as the Apache ZooKeeper site, you will generally see negative side effects from having more than 5 or 7 ZooKeeper servers in total serving the applications that use it. You should have more than 3 ZooKeepers because if one goes down, you are left with only 2. That is still a quorum, but a second failure (say, during maintenance) takes the ensemble down. With 5 servers, two can go down, and you still have 2 servers + 1 available for the "tie breaker" vote. With 7, you can lose up to 3 ZooKeepers and still be good.
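As a rough sketch of the majority arithmetic (the function names are made up for illustration, not part of any ZooKeeper API), this computes the quorum size and how many servers an ensemble of n can lose while keeping a majority:

```python
def quorum_size(n):
    """Smallest strict majority of an ensemble of n servers."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many servers can be down while a majority is still up."""
    return n - quorum_size(n)


if __name__ == "__main__":
    for n in (3, 4, 5, 6, 7):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"can lose={tolerated_failures(n)}")
```

Note that 4 servers tolerate no more failures than 3, and 6 no more than 5, which is why an odd ensemble size is the usual choice.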
07-20-2018
04:52 PM
TaskTracker & JobTracker don't exist with YARN. The default HDFS replication factor is 3.
07-20-2018
04:51 PM
1 Kudo
What component are you asking about? What are you trying to achieve? The components typically call each other over combinations of separate protocols.
- HDFS and YARN interact via RPC/IPC.
- Ambari Server and Agents communicate over HTTP & REST. Ambari also needs JDBC connections to the backing database.
- Hive, HBase, and Spark can use a Thrift Server. The Hive metastore uses JDBC.
- Kafka has its own TCP protocol.
I would suggest starting with a specific component for the use case(s) you want. Hadoop itself only comprises HDFS & YARN + MapReduce.
07-16-2018
09:41 PM
1 Kudo
@Sambasivam Subramanian
By definition, an edge node is just a host with only clients installed and configured. If you install no server services in Ambari for a host, then you will end up with an edge node for the clients that you selected.
05-19-2018
05:13 AM
The configs are on the top line of the output. It will say "Configs: " with nothing after it if none are customized.
$ kafka-topics --describe --topic $TOPIC --zookeeper $ZOOKEEPER
Topic:******** PartitionCount:20 ReplicationFactor:3 Configs:retention.ms=10800000
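If you want to read that top line from a script, here is a minimal sketch. It assumes the line looks exactly like the sample above (whitespace-separated `Key:value` fields, with `Configs` holding comma-separated `name=value` pairs); other Kafka versions may print it differently, and a config value that itself contains a comma would need extra handling. `parse_describe_header` is a made-up helper name.

```python
def parse_describe_header(line):
    """Split the top line of `kafka-topics --describe` output into fields.

    Assumes whitespace-separated Key:value tokens, as in the sample output.
    """
    fields = {}
    for token in line.split():
        key, _, value = token.partition(":")
        fields[key] = value

    # Configs is empty when nothing is customized for the topic.
    configs = {}
    for pair in filter(None, fields.get("Configs", "").split(",")):
        name, _, value = pair.partition("=")
        configs[name] = value
    fields["Configs"] = configs
    return fields


if __name__ == "__main__":
    header = ("Topic:******** PartitionCount:20 ReplicationFactor:3 "
              "Configs:retention.ms=10800000")
    parsed = parse_describe_header(header)
    retention_hours = int(parsed["Configs"]["retention.ms"]) / 3_600_000
    print(parsed["Configs"], f"retention is {retention_hours} hours")
```

For the sample line, the only override is `retention.ms=10800000`, which works out to 3 hours.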
05-19-2018
05:10 AM
There is no such support for renaming: https://issues.apache.org/jira/browse/KAFKA-2333
If you want to clone a topic, then use MirrorMaker: https://community.hortonworks.com/articles/79891/kafka-mirror-maker-best-practices.html
05-14-2018
03:31 AM
@Michael Bronson Kafka stores the latest offsets in memory before they are sent to disk, so the more memory the better, with a max of 8G. I would assume that the heap properties can be set from Ambari rather than individually on each broker, but I don't use Kafka from HDP, so I can't say.
05-11-2018
01:16 AM
1 Kudo
The recommendation here would be to increase the heap space allocated to the Kafka process or reduce the number of other processes running on the same server. For example, in a production environment, the Kafka brokers should be standalone servers -- not on the same hardware as ZooKeeper or other Hadoop processes.
04-10-2018
08:35 PM
Yes, the commands work the same, assuming you have winutils.exe on your PATH as well as HADOOP_HOME and HADOOP_CONF_DIR defined as environment variables. Windows is not as stable or as well supported as Linux, however.
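A quick way to check those prerequisites before running anything is sketched below. This is only an illustration: `missing_hadoop_prereqs` is a made-up helper, and it checks only that the variables are set and that `winutils.exe` can be found on PATH, not that they point at a working Hadoop install.

```python
import os
import shutil


def missing_hadoop_prereqs(env=None, which=shutil.which):
    """Return a list of the prerequisites that are not satisfied."""
    env = os.environ if env is None else env
    # Both variables must be set to a non-empty value.
    missing = [name for name in ("HADOOP_HOME", "HADOOP_CONF_DIR")
               if not env.get(name)]
    # shutil.which returns None when the executable is not on PATH.
    if which("winutils.exe") is None:
        missing.append("winutils.exe on PATH")
    return missing


if __name__ == "__main__":
    problems = missing_hadoop_prereqs()
    print("OK" if not problems else "Missing: " + ", ".join(problems))
```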