What is the role of ZooKeeper in a cluster?

Question

Suppose I have a cluster hosting one topic with three partitions, so the ZooKeeper (ZK) cluster is hosting three broker instances.

As per my understanding:

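  1. The producer interacts with ZooKeeper to publish messages to a broker.
  2. ZK internally decides which partition a message should be published to, based on the load on each broker instance. The broker also interacts with ZK to maintain the offset for each consumer instance.
  3. Similarly, the consumer interacts with ZooKeeper to consume messages from a broker. ZK fetches the message from the right broker based on load.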

But I got confused after reading the bold text below, from the section Workflow of Queue Messaging / Consumer Group in a Kafka tutorial. Is my understanding above wrong? Based on that text, it looks like the producer/consumer does not interact directly with ZooKeeper. Is it the other way around, with ZK interacting with the producer/consumer? If so, who (ZooKeeper or the broker) decides which broker instance a message is published to or consumed from?

Recommended answer

You seem to be mixed up: most of the things you think are done by the Kafka brokers are actually done by the clients, and most of the things you think are done by ZooKeeper are actually done by the brokers.

Kafka is a very scalable system because the clients do a lot of the processing. The parts not done by the clients are done by the brokers (and by the special broker components called the Controller and the Coordinators). ZooKeeper does very little other than store state and some configuration for the brokers (in a very reliable way).

To address your points:

1) Incorrect. The new Producer does not interact directly with ZooKeeper. The Producer talks directly to the brokers, either to publish messages or to make metadata requests that tell it which broker is the leader for the partition it wants to publish to.
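To make point 1 concrete, here is a minimal in-memory sketch of the lookup, not real Kafka client code. The names `Broker`, `ToyProducer`, `metadata` and `append` are hypothetical, and the sketch assumes a fixed leader per partition; the only thing it tries to show is that the producer learns the leader from a broker and then sends to that leader, with no ZooKeeper call anywhere.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy stand-in for a Kafka broker (hypothetical, in-memory only)."""
    broker_id: int
    leaders: dict                             # partition -> leader broker id
    log: dict = field(default_factory=dict)   # partition -> list of messages

    def metadata(self):
        # In this sketch any broker can answer a metadata request.
        return dict(self.leaders)

    def append(self, partition, message):
        # Only the partition leader accepts writes.
        if self.leaders[partition] != self.broker_id:
            raise RuntimeError("not the leader for this partition")
        self.log.setdefault(partition, []).append(message)
        return len(self.log[partition]) - 1   # offset of the new message


class ToyProducer:
    def __init__(self, brokers):
        self.brokers = {b.broker_id: b for b in brokers}
        # Bootstrap: ask one broker for metadata; ZooKeeper is not involved.
        self.leaders = next(iter(self.brokers.values())).metadata()

    def send(self, partition, message):
        leader = self.brokers[self.leaders[partition]]
        return leader.broker_id, leader.append(partition, message)


# Assumed layout: three partitions, each led by a different broker.
leaders = {0: 1, 1: 2, 2: 3}
brokers = [Broker(i, leaders) for i in (1, 2, 3)]
producer = ToyProducer(brokers)
```

Calling `producer.send(1, "hello")` in this toy model routes the message to broker 2, the assumed leader of partition 1, and returns the broker id together with the offset assigned to the message.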

2) Incorrect. ZK does not "decide" anything. ZK is a replicated, fault-tolerant storage system that the brokers use to save information and state for the cluster. The decision about which partition to publish to is made in the Producer and depends on the key of the message being published and on the client-side partitioner algorithm. Partitions are not assigned based on load; they are assigned based on the key or, if the key is null, using a round-robin algorithm. The broker will NOT interact with ZK to maintain an offset per consumer instance. Consumers keep track of their own offsets and store them (occasionally, via offset commits) in the __consumer_offsets topic on the brokers.
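The partitioning rule in point 2 can be sketched as follows. This is an illustration only: `ToyPartitioner` is a hypothetical name, and `zlib.crc32` is used here as a stand-in hash, not the hash Kafka's own partitioner uses, so the partition numbers it produces will not match a real cluster.

```python
import itertools
import zlib


class ToyPartitioner:
    """Client-side partition choice: hash the key, or round-robin if there is none."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()   # drives round-robin for null keys

    def partition(self, key):
        if key is None:
            # No key: spread messages evenly, ignoring broker load.
            return next(self._counter) % self.num_partitions
        # Keyed message: the same key always maps to the same partition.
        # crc32 is an assumption for illustration, not Kafka's hash.
        return zlib.crc32(key) % self.num_partitions
```

With `ToyPartitioner(3)`, messages without a key cycle through partitions 0, 1, 2, 0, and so on, while every message with key `b"user-42"` lands on one fixed partition. Neither branch looks at load, and neither involves ZooKeeper or a broker.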

3) Incorrect. The new Consumer will NOT interact directly with ZooKeeper to consume messages from a broker, and ZK will NOT fetch messages from the right broker based on load. Consumers talk directly to the brokers, and they join and leave consumer groups via RPCs sent to the brokers using the Kafka protocol.
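As a rough picture of point 3, the sketch below models group membership and offset commits as calls on a broker-side object. `ToyGroupCoordinator` and its methods are hypothetical, and the sketch is deliberately simplified: it computes the partition assignment inside the coordinator with a plain round-robin, which is an assumption made for brevity and not a description of how Kafka actually computes assignments.

```python
class ToyGroupCoordinator:
    """Hypothetical broker-side component tracking one consumer group."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.members = []
        self.committed = {}   # partition -> offset, stand-in for __consumer_offsets

    def join(self, member_id):
        # A consumer joins by calling the broker, not ZooKeeper.
        self.members.append(member_id)
        return self.assignment()

    def leave(self, member_id):
        self.members.remove(member_id)
        return self.assignment()

    def assignment(self):
        # Simplified round-robin spread of partitions over current members.
        result = {m: [] for m in self.members}
        for i, p in enumerate(self.partitions):
            if self.members:
                result[self.members[i % len(self.members)]].append(p)
        return result

    def commit(self, partition, offset):
        # The consumer tracks its own position and reports it to the broker.
        self.committed[partition] = offset
```

In this toy model, a first consumer joining a three-partition topic receives all three partitions, a second join splits them, and a leave hands them back to the remaining member. Committed offsets live on the broker side throughout.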


09-11 08:12