Handling eventual consistency in a fork-join-queue

Problem Description

I'm currently looking to replace a flawed implementation of the fork-join-queue described in the following video:

https://youtu.be/zSDC_TU7rtc?t=33m37s

I realize that this video is nearly eight years old now, and I would be very happy to learn of any potential new and better ways to do such things, but for now I'm focusing on trying to make this work as described by Brett. As of right now, what's in front of me is a bit of a mess.

One of the things the original developer did differently from Brett is that he puts all work items for a particular sum_name into a single entity group.

I'm still relatively new to Datastore, but to me it seems like that defeats the entire purpose, since adding a new entity to the entity group several times a second will cause contention, which is the very thing we're trying to avoid by batching changes.
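
For context, the single-entity-group layout amounts to something like the following (a minimal sketch in Python with ndb; the WorkItem model and key names are mine, not the original developer's):

```python
from google.appengine.ext import ndb


class WorkItem(ndb.Model):
    payload = ndb.JsonProperty()


# Every work item shares one ancestor key, so every put() below lands in
# the same entity group and serializes against its ~1 write/sec limit.
parent = ndb.Key('Sum', 'some-sum-name')
WorkItem(parent=parent, payload={'delta': 1}).put()
```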

As for why someone would try to put all the work in a single entity group, the original developer's comments are clear: he's trying to prevent work items from getting skipped due to eventual consistency. This led me to really dig into Brett's implementation, and I'm very puzzled, because it does seem like a problem Brett is not considering.

Put simply, when Brett's task queries for work items, the index it is using may not be fully up to date. Sure, the lock he's doing with memcache should make this unlikely, since the start of the task will prevent more work items from being added to that index. However, what if the index update time is long enough such that something is written before the lock decrements, but still doesn't come back in the query results? Won't such a work item wind up just hanging out in the datastore, never to be consumed?
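
To make the race concrete, here is a stripped-down sketch of the query side of the fan-in task; this is my own reconstruction from the talk, not Brett's actual code, and the model, memcache key name, and apply_batch() stub are all assumed:

```python
from google.appengine.api import memcache
from google.appengine.ext import ndb


class WorkItem(ndb.Model):
    sum_name = ndb.StringProperty()
    batch_index = ndb.IntegerProperty()


def apply_batch(items):
    """Placeholder for whatever batched mutation the pipeline performs."""
    pass


def fan_in(sum_name, batch_index):
    # Bump the index so writers start a fresh batch from here on.
    memcache.incr('index-' + sum_name, initial_value=0)
    # The race: this is a global, eventually consistent query. A work item
    # whose put() committed just before the bump may not be visible in the
    # index yet, and no later task will query this batch_index again.
    items = WorkItem.query(WorkItem.sum_name == sum_name,
                           WorkItem.batch_index == batch_index).fetch()
    apply_batch(items)
    ndb.delete_multi([item.key for item in items])
```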

Is there some facet of Brett's implementation that deals with this that I'm not seeing? Obviously Brett knows what he is doing and was very confident in this, so I feel like I must be missing something.

If not, though, how might one go about handling this?

Recommended Answer

Based on its date, the talk assumed the master/slave Datastore. The talk is from 2010, but the High Replication Datastore (https://googleappengine.blogspot.com/2011/01/announcing-high-replication-datastore.html) wasn't released until six months later.

One way around the entity group contention would be to manually create the work item keys with something like task-name-INDEX, and then in the task do a get on all the keys from task-name-0 to task-name-TOP_INDEX; the top index can probably be stored in memcache.
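
A minimal sketch of that idea, assuming Python with ndb (the WorkItem model and memcache key names are illustrative, not part of the answer):

```python
from google.appengine.api import memcache
from google.appengine.ext import ndb


class WorkItem(ndb.Model):
    payload = ndb.JsonProperty()


def add_work(sum_name, payload):
    # Atomically reserve the next slot; initial_value=0 creates the counter
    # on first use, so the first work item gets index 1.
    index = memcache.incr('top-index-' + sum_name, initial_value=0)
    WorkItem(key=ndb.Key(WorkItem, '%s-%d' % (sum_name, index)),
             payload=payload).put()


def fetch_work(sum_name):
    # Gets by key are strongly consistent, so no committed work item can be
    # hidden by index lag the way a query result can.
    top = int(memcache.get('top-index-' + sum_name) or 0)
    keys = [ndb.Key(WorkItem, '%s-%d' % (sum_name, i))
            for i in range(1, top + 1)]
    return [item for item in ndb.get_multi(keys) if item is not None]
```

The trade-off is that the counter lives in memcache, which can be evicted, so in practice you would want some fallback for recovering TOP_INDEX.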
