This article looks at how to resolve node token collisions when starting Cassandra nodes in a cluster on VMWare; it should be a useful reference for anyone hitting the same problem.

Problem Description

Are there any known issues with initial_token collision when adding nodes to a cluster in a VM environment?

I'm working on a 4 node cluster set up on a VM. We're running into issues when we attempt to add nodes to the cluster.

In the cassandra.yaml file, initial_token is left blank. Since we're running > 1.0 Cassandra, auto_bootstrap should be true by default.

It's my understanding that each of the nodes in the cluster should be assigned an initial token at startup.

This is not what we're currently seeing. We do not want to manually set the value for initial_token for each node (kind of defeats the goal of being dynamic...). We also have set the partitioner to random: partitioner: org.apache.cassandra.dht.RandomPartitioner
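To make the configuration concrete, the relevant lines in each node's cassandra.yaml look roughly like this (a sketch based on the description above, not a complete file):

 # initial_token is intentionally left blank so the node picks its own token
 initial_token:
 # RandomPartitioner distributes rows by MD5 hash over the 0 .. 2**127-1 token space
 partitioner: org.apache.cassandra.dht.RandomPartitioner
 # auto_bootstrap no longer appears in the 1.0+ default file; it defaults to true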

I've outlined the steps we follow and the results we are seeing below. Can someone please advise as to what we're missing here?

Here are the detailed steps we are taking:

1) Kill all cassandra instances and delete data & commit log files on each node.

2) Start the first node. It starts up fine.

3) Run nodetool ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463

4) Start X.X.X.X:

 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 850) Node /X.X.X.X is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:16:02,194 Gossiper.java (line 816) InetAddress /X.X.X.X is now UP
 INFO [GossipStage:1] 2012-11-29 21:16:02,195 StorageService.java (line 1138) Nodes /X.X.X.X and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /X.X.X.X is the new owner
 WARN [GossipStage:1] 2012-11-29 21:16:02,195 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /X.X.X.X

5) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
W.W.W.W         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

6) Start Y.Y.Y.Y:

 INFO [GossipStage:1] 2012-11-29 21:17:36,458 Gossiper.java (line 850) Node /Y.Y.Y.Y is now part of the cluster
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 Gossiper.java (line 816) InetAddress /Y.Y.Y.Y is now UP
 INFO [GossipStage:1] 2012-11-29 21:17:36,459 StorageService.java (line 1138) Nodes /Y.Y.Y.Y and /X.X.X.X have the same token 113436792799830839333714191906879955254.  /Y.Y.Y.Y is the new owner
 WARN [GossipStage:1] 2012-11-29 21:17:36,459 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /X.X.X.X to /Y.Y.Y.Y

7) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Y.Y.Y.Y         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

8) Start Z.Z.Z.Z:

 INFO [GossipStage:1] 2012-11-30 04:52:28,590 Gossiper.java (line 850) Node /Z.Z.Z.Z is now part of the cluster
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 Gossiper.java (line 816) InetAddress /Z.Z.Z.Z is now UP
 INFO [GossipStage:1] 2012-11-30 04:52:28,591 StorageService.java (line 1138) Nodes /Z.Z.Z.Z and /Y.Y.Y.Y have the same token 113436792799830839333714191906879955254.  /Z.Z.Z.Z is the new owner
 WARN [GossipStage:1] 2012-11-30 04:52:28,592 TokenMetadata.java (line 160) Token 113436792799830839333714191906879955254 changing ownership from /Y.Y.Y.Y to /Z.Z.Z.Z

9) Run nodetool -h W.W.W.W ring and see:

Address         DC          Rack        Status State   Load            Effective-Ownership Token
                                                                                           113436792799830839333714191906879955254
W.W.W.W         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
S.S.S.S         datacenter1 rack1       Up     Normal  28.37 GB        100.00%             24360745721352799263907128727168388463
Z.Z.Z.Z         datacenter1 rack1       Up     Normal  123.87 KB       100.00%             113436792799830839333714191906879955254

Thanks in advance.

Answer

Clearly your nodes are holding onto some past cluster information that is being used at startup. Make sure to delete the LocationInfo directories, which contain the data about the cluster. You have a very strange token layout (where's the 0 token, for example?), so you're certainly going to need to reassign them if you want the proper ownership.
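A minimal cleanup sketch, assuming the default Cassandra 1.x directory layout under /var/lib/cassandra (adjust to your data_file_directories and commitlog_directory, and to however you normally stop the service):

 # stop the node before touching its data files (the service name is an assumption)
 sudo service cassandra stop
 # remove the cached ring/token state held in the system keyspace's LocationInfo column family
 sudo rm -rf /var/lib/cassandra/data/system/LocationInfo*
 # or, for a completely fresh node, wipe data, commit log and saved caches entirely
 sudo rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* /var/lib/cassandra/saved_caches/*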

It may help to explain how token assignment works, so let me also address this. In a brand new cluster, the first node will get assigned token 0 by default and will have 100% ownership. If you do not specify a token for your next node, Cassandra will calculate a token such that the original node owns the lower 50% and the new node the higher 50%.

When you add node 3, it will insert the token between the first and second, so you'll actually end up with ownership that looks like 25%, 25%, 50%. This is really important, because the lesson to learn here is that Cassandra will NEVER reassign a token by itself to balance the ring. If you want your ownership balanced properly, you must assign your own tokens. This is not hard to do, and there's actually a utility provided to do this.
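The arithmetic behind that utility is simple: divide the partitioner's token space evenly by the number of nodes. A minimal sketch in Python for RandomPartitioner (the node count of 4 matches the cluster above; everything else is illustrative):

 # compute evenly spaced initial_token values for a RandomPartitioner ring
 num_nodes = 4
 ring_range = 2 ** 127              # RandomPartitioner tokens run from 0 to 2**127 - 1
 tokens = [i * ring_range // num_nodes for i in range(num_nodes)]
 for node, token in enumerate(tokens):
     print("node %d: initial_token: %d" % (node, token))

Each generated value is then set as initial_token in the corresponding node's cassandra.yaml before that node's first start.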

So Cassandra's initial bootstrap process, while dynamic, may not yield the desired ring balance. You can't simply allow new nodes to join willy-nilly without some intervention to make sure you get the desired result. Otherwise you will end up with the scenario you have laid out in your question.
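One way to rebalance a ring that is already running, once you have calculated the target tokens, is to move each node to its token one at a time with nodetool move (not mentioned in the answer above; the address and token here are only illustrative):

 nodetool -h X.X.X.X move 42535295865117307932921825928971026432

Let the ring settle, and check nodetool ring again, after each move before moving the next node.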

