Explaining Merkle Trees for Use in Eventual Consistency

Question

Merkle trees are used as an anti-entropy mechanism in several distributed, replicated key/value stores.

No doubt an anti-entropy mechanism is A Good Thing - transient failures just happen in production. I'm just not sure I understand why Merkle trees are the popular approach.

  • Sending a complete Merkle tree to a peer involves sending the local key space to that peer, along with hashes of each key's value, stored in the lowest level of the tree.

  • Diffing a Merkle tree sent from a peer requires having a Merkle tree of your own.

Since both peers must already have a sorted key / value-hash space on hand, why not do a linear merge to detect discrepancies?
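For concreteness, the linear merge proposed here could look roughly like the following. This is only a minimal sketch: it assumes each peer holds a key-sorted list of (key, value-hash) pairs, and the names `value_hash` and `linear_diff` are made up for illustration.

```python
import hashlib

def value_hash(value: bytes) -> str:
    """Hash of a stored value (SHA-256 is an assumption, not something the question specifies)."""
    return hashlib.sha256(value).hexdigest()

def linear_diff(local, remote):
    """Merge two key-sorted lists of (key, value_hash) pairs.

    Returns the keys that are missing on one side or whose hashes differ.
    """
    diffs = []
    i = j = 0
    while i < len(local) and j < len(remote):
        (lk, lh), (rk, rh) = local[i], remote[j]
        if lk == rk:
            if lh != rh:
                diffs.append(lk)   # same key, different value
            i += 1
            j += 1
        elif lk < rk:
            diffs.append(lk)       # key exists only locally
            i += 1
        else:
            diffs.append(rk)       # key exists only remotely
            j += 1
    # Whatever is left on either side has no counterpart.
    diffs.extend(k for k, _ in local[i:])
    diffs.extend(k for k, _ in remote[j:])
    return diffs
```

Note that this approach always requires one peer to receive the other's full list, so the data transferred grows with the size of the key space even when nothing differs.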

I'm just not convinced that the tree structure provides any kind of savings when you factor in upkeep costs, and the fact that linear passes over the tree leaves are already being done just to serialize the representation over the wire.

To ground this out, a straw-man alternative might be to have nodes exchange arrays of hash digests, which are incrementally updated and bucketed by modulo ring-position.
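One way to picture this straw man is sketched below. Several details are assumptions added for illustration and are not part of the question: the bucket count, the use of SHA-256 of the key as the ring position, and XOR as the way to fold per-key digests into a bucket so that updates stay incremental. The names `BucketedDigests` and `differing_buckets` are hypothetical.

```python
import hashlib

NUM_BUCKETS = 8  # hypothetical; a real deployment would use far more

def _digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

class BucketedDigests:
    """Fixed-size array of digests, one per bucket of ring positions."""

    def __init__(self, num_buckets: int = NUM_BUCKETS):
        self.buckets = [0] * num_buckets
        self.store = {}

    def _bucket(self, key: str) -> int:
        # Ring position modulo the bucket count (assumed scheme).
        return _digest(key.encode()) % len(self.buckets)

    def _entry(self, key: str, value: bytes) -> int:
        return _digest(key.encode() + b"\x00" + value)

    def put(self, key: str, value: bytes) -> None:
        b = self._bucket(key)
        if key in self.store:
            # XOR is its own inverse, so this removes the old entry's digest.
            self.buckets[b] ^= self._entry(key, self.store[key])
        self.buckets[b] ^= self._entry(key, value)
        self.store[key] = value

def differing_buckets(a: BucketedDigests, b: BucketedDigests):
    """Indexes of buckets whose digests disagree between two nodes."""
    return [i for i, (x, y) in enumerate(zip(a.buckets, b.buckets)) if x != y]
```

In this sketch every comparison ships the whole array of bucket digests, and each mismatching bucket still has to be resolved key by key afterwards.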

What am I missing?

Answer

Merkle trees limit the amount of data transferred when synchronizing. The general assumptions are:

  1. Network I/O is more expensive than local I/O + computing the hashes.
  2. Transferring the entire sorted key space is more expensive than progressively limiting the comparison over several steps.
  3. The key spaces have fewer discrepancies than similarities.

A Merkle Tree exchange would look like this:

  1. Start with the root of the tree (a list of one hash value).
  2. The origin sends the list of hashes at the current level.
  3. The destination diffs the list of hashes against its own and then requests subtrees that are different. If there are no differences, the request can terminate.
  4. Repeat steps 2 and 3 until leaf nodes are reached.
  5. The origin sends the values of the keys in the resulting set.
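The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular store implements it: both peers are assumed to build a binary tree of the same shape over a power-of-two number of leaf hashes, the network round trips are simulated by plain function calls, and in step 2 only the hashes of subtrees still in question are sent. The names `h`, `build_tree`, and `exchange` are made up.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Build a binary Merkle tree from leaf hashes (length must be a power of two).

    Returns a list of levels, root level first and leaf level last.
    """
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    levels.reverse()
    return levels

def exchange(origin, destination):
    """Walk both trees top-down.

    Returns (indexes of differing leaves, number of hashes the origin sent).
    """
    suspects = [0]  # step 1: start with the root
    sent = 0
    last = len(origin) - 1
    for depth in range(len(origin)):
        # Step 2: the origin sends hashes for the nodes still in question.
        offered = {i: origin[depth][i] for i in suspects}
        sent += len(offered)
        # Step 3: the destination diffs them against its own tree.
        differing = [i for i, d in offered.items() if destination[depth][i] != d]
        if not differing:
            return [], sent          # nothing differs, terminate early
        if depth == last:
            return differing, sent   # step 4: leaf level reached
        # Request the children of every differing node.
        suspects = [c for i in differing for c in (2 * i, 2 * i + 1)]
    return [], sent
```

With 8 leaves and a single differing leaf, this walk sends 7 hashes (1 at the root, then 2 per level) instead of all 8 leaf hashes; the gap widens quickly as the tree grows. Step 5, sending the actual values for the differing leaves, is left out of the sketch.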

In the typical case, where only a few keys differ, the cost of synchronizing the key spaces will be O(log N) per differing key. Yes, at the extreme, where there are no keys in common, the operation will be equivalent to sending the entire sorted list of hashes, O(N). One could amortize the expense of building Merkle trees by building them dynamically as writes come in and keeping the serialized form on disk.
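As a rough illustration of building the tree dynamically as writes come in, a single write only needs to rehash the path from the affected leaf up to the root. The sketch below assumes a binary tree over a power-of-two number of leaves, stored leaf level first, and it does not cover persisting the serialized form to disk. `build_tree` and `update_leaf` are hypothetical names.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Full O(N) build. Returns levels with the leaf level first and the root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def update_leaf(levels, index, new_leaf_hash):
    """Apply one write in place by rehashing only the leaf-to-root path.

    Returns the number of interior hashes recomputed (log2 of the leaf count).
    """
    levels[0][index] = new_leaf_hash
    recomputed = 0
    for depth in range(1, len(levels)):
        index //= 2  # move to the parent in the next level up
        left = levels[depth - 1][2 * index]
        right = levels[depth - 1][2 * index + 1]
        levels[depth][index] = h(left + right)
        recomputed += 1
    return recomputed
```

For a tree with 8 leaves, one write recomputes 3 interior hashes rather than rebuilding all 7.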

I can't speak to how Dynamo or Cassandra use Merkle trees, but Riak stopped using them for intra-cluster synchronization (hinted handoff and read-repair are sufficient in most cases). We have plans to add them back later after some internal architectural bits have changed.

For more information about Riak, we encourage you to join the mailing list: http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
