This article discusses the question "AWS DynamoDB read-after-write consistency - how does it work theoretically?"; the answer below may be a useful reference for anyone facing the same problem.

Problem description

Most NoSQL solutions only offer eventual consistency, and given that DynamoDB replicates the data into three datacenters, how is read-after-write consistency maintained?

What would be a generic approach to this kind of problem? I think it is interesting, since even in MySQL replication, data is replicated asynchronously.

Recommended answer

I'll use MySQL to illustrate the answer, since you mentioned it, though, obviously, neither of us is implying that DynamoDB runs on MySQL.

In a single network with one MySQL master and any number of slaves, the answer seems extremely straightforward -- for eventual consistency, fetch the answer from a randomly-selected slave; for read-after-write consistency, always fetch the answer from the master.
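That routing rule can be sketched in a few lines. This is a toy model only (the class and method names are invented for illustration; it models neither MySQL nor DynamoDB internals): consistent reads go to the master, eventually-consistent reads go to a randomly-selected replica that may lag behind.

```python
import random

class ReplicatedStore:
    """Toy model: one master plus N asynchronous read replicas."""

    def __init__(self, num_replicas=3):
        self.master = {}                                  # authoritative copy
        self.replicas = [dict() for _ in range(num_replicas)]

    def write(self, key, value):
        self.master[key] = value                          # replicas lag behind

    def replicate(self):
        for replica in self.replicas:                     # async catch-up step
            replica.update(self.master)

    def read(self, key, consistent=False):
        if consistent:
            return self.master.get(key)                   # read-after-write
        return random.choice(self.replicas).get(key)      # eventual consistency
```

Until `replicate()` runs, an eventually-consistent read may return a stale (here, missing) value, while a consistent read always reflects the latest write.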

even in MySQL replication data is replicated asynchronously

There's an important exception to that statement, and I suspect there's a good chance that it's closer to the reality of DynamoDB than any other alternative here: in a MySQL-compatible Galera cluster, replication among the masters is synchronous, because the masters collaborate on each transaction at commit-time, and a transaction that can't be committed to all of the masters will also throw an error on the master where it originated. A cluster like this can technically operate with only two nodes, but should not have fewer than three, because when there is a split in the cluster, any node that finds itself alone, or in a group smaller than half of the original cluster size, will roll itself up into a harmless little ball and refuse to service queries, because it knows it's in an isolated minority and its data can no longer be trusted. So three is something of a magic number in a distributed environment like this, to avoid a catastrophic split-brain condition.
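The quorum rule described above reduces to a one-line strict-majority check, which also shows why three is the magic number and two is not. This is an illustrative sketch of the idea, not Galera's actual implementation:

```python
def has_quorum(visible_nodes, original_cluster_size):
    """A partition keeps serving queries only while it still sees a
    strict majority of the original cluster (a Galera-style
    "primary component" rule)."""
    return visible_nodes > original_cluster_size / 2

# 3-node cluster split 2/1: the 2-node side keeps serving, the lone
# node refuses queries.
# 2-node cluster split 1/1: NEITHER side has a strict majority, so a
# single split strands the whole cluster.
```

With three nodes, any single failure or partition still leaves a serving majority; with two, it never does.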

If we assume the "three geographically-distributed replicas" in DynamoDB are all "master" copies, they might operate with logic along the same lines as the synchronous masters you'd find with Galera, so the solution would be essentially the same, since that setup also allows any or all of the masters to have conventional subtended asynchronous slaves using MySQL native replication. The difference there is that you could fetch from any of the masters currently connected to the cluster if you wanted read-after-write consistency, since all of them are in sync; otherwise, fetch from a slave.

The third scenario I can think of would be analogous to three geographically-dispersed MySQL masters in a circular replication configuration, which, again, supports subtended slaves off of each master, but has the additional problems that the masters are not synchronous and there is no conflict resolution capability -- not at all viable for this application, but for purposes of discussion, the objective could still be achieved if each "object" had some kind of highly-precise timestamp. When read-after-write consistency is needed, the solution here might be for the system serving the response to poll all of the masters to find the newest version, not returning an answer until all masters had been polled, or to read from a slave for eventual consistency.
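The "poll all the masters" strategy for this third scenario can be sketched as follows. Everything here is hypothetical: each master is modeled as a dict mapping a key to a `(timestamp, value)` pair, where the timestamp stands in for the highly-precise write time the answer postulates.

```python
def consistent_read(key, masters):
    """Return the newest version of `key` across all masters.

    Sketch of the polling strategy described above: no answer is
    returned until every master has been consulted, and the version
    with the latest timestamp wins.
    """
    versions = [m[key] for m in masters if key in m]
    if not versions:
        return None                       # key unknown to every master
    timestamp, value = max(versions)      # tuples sort by timestamp first
    return value
```

Note that the caller pays for a round-trip to every master on each consistent read, which is exactly what makes this read path more expensive than reading from a single slave.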

Essentially, if there's more than one "write master" then it would seem like the masters have no choice but to either collaborate at commit-time, or collaborate at consistent-read-time.

Interestingly, I think, in spite of some whining you can find in online opinion pieces about the disparity in pricing between the two read-consistency levels in DynamoDB, this analysis -- even as divorced from the reality of DynamoDB's internals as it is -- does seem to justify that discrepancy.

Eventually-consistent read replicas are essentially infinitely scalable (even with MySQL, where a master can easily serve several slaves, each of which can also easily serve several slaves of its own, each of which can serve several... ad infinitum) but read-after-write is not infinitely scalable, since by definition it would seem to require the involvement of a "more-authoritative" server, whatever that specifically means, thus justifying a higher price for reads where that level of consistency is required.
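The fan-out argument above is easy to quantify: in a replication tree where every server feeds the same number of slaves, eventually-consistent read capacity grows geometrically with depth, while the single master remains the bottleneck for read-after-write. Illustrative arithmetic only:

```python
def replica_count(fanout, depth):
    """Total read replicas in a tree where each server feeds `fanout`
    slaves, carried `depth` levels below the master (master excluded)."""
    return sum(fanout ** level for level in range(1, depth + 1))

# A master feeding 3 slaves, each feeding 3 of its own, one level more:
# 3 + 9 + 27 = 39 eventually-consistent read servers behind one master.
```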

That concludes this article on "AWS DynamoDB read-after-write consistency - how does it work theoretically?"; hopefully the answer above is helpful.