Firestore in Datastore mode: index hotspotting with enum property values vs. a bad index?

Problem description

I'm experiencing symptoms which suggest that Cloud Firestore in Datastore mode can be slow when querying for properties that are shared by many other entities. It seems this may be related to an inefficient index-less query (e.g. I need a composite index for this search), or an index hotspot (though I can only find documentation recommending against monotonically increasing values, not a small number of enum values).

My situation (simplified) is as follows:

  • I have 1M entities written to a database (with only the built-in indexes)
  • All entities have the property: prop1 = 'all'
  • All entities have a unique property, id in ['000000' - '999999'], and another property, id2 = id
  • 1/10th of all entities (so 100k entities) have the property first_dig = '0'
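
To make the cardinalities above concrete, here is a minimal, self-contained Java sketch that regenerates the described layout in memory and counts how many entities each equality filter matches. It does not talk to Datastore. The `Selectivity` class, the `Entity` holder, and the rule that first_dig is the first character of id are assumptions made for illustration; the question only states that 1/10th of the entities have first_dig = '0'.

```java
import java.util.function.Predicate;

public class Selectivity {
    /** Hypothetical stand-in for one entity; field names follow the question. */
    public static final class Entity {
        public final String id, id2, firstDig, prop1;

        Entity(String id, String id2, String firstDig, String prop1) {
            this.id = id;
            this.id2 = id2;
            this.firstDig = firstDig;
            this.prop1 = prop1;
        }
    }

    /** Builds entity number n (0..999999) following the layout described above. */
    public static Entity entity(int n) {
        String digits = Integer.toString(n);
        String id = "000000".substring(digits.length()) + digits; // zero-pad to 6 characters
        // Assumption: first_dig is the first character of id.
        return new Entity(id, id, id.substring(0, 1), "all");
    }

    /** Counts how many of the entities 0..total-1 satisfy the given filter. */
    public static int countMatching(int total, Predicate<Entity> filter) {
        int count = 0;
        for (int n = 0; n < total; n++) {
            if (filter.test(entity(n))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int total = 1_000_000;
        System.out.println(countMatching(total, e -> e.id.equals("000000")));  // prints 1
        System.out.println(countMatching(total, e -> e.firstDig.equals("0"))); // prints 100000
        System.out.println(countMatching(total, e -> e.prop1.equals("all")));  // prints 1000000
    }
}
```

The three filters differ by orders of magnitude in how many index entries they match, which is the property the queries below end up exercising.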

So, there are a couple of ways I can query for the same entity (either using GQL in the cloud console or via the Java API):

  1. SELECT * FROM kind WHERE id = '000000'
  2. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0'
  3. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0' AND id2 = '000000'
  4. SELECT * FROM kind WHERE id = '000000' AND first_dig = '0' AND prop1 = 'all'

I find that query #1 takes 5 seconds, #2 takes 15 seconds, #3 takes 15 seconds, and #4 takes ~50 seconds. The fact that #4 is much slower than #2, but #3 is not slower than #2 makes me think that there is index hotspotting when searching for prop1='all' (for which all index entries might be on the same tablet) but not for id2='000000'.

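My questions are:

  1. What is causing the slowdown here? Is there something I'm missing?
  2. Is there a recommended practice for querying indexed properties with low uniqueness?

Thanks!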

Note, this was cross-posted to https://groups.google.com/forum/#!topic/google-appengine/91jCVQXZ6tI, but this seems like a more appropriate place.

Recommended answer

Without a composite index, this query is executed as a zig-zag merge join over the built-in single-property indexes. That means there is more work to do for each AND clause, and the more entities that share a specific property value, the more entities need to be filtered.
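
As a rough illustration of the mechanism, the following Java sketch intersects sorted lists of entity keys, one list per equality filter, by repeatedly seeking each list forward to the current candidate key. This is a simplified textbook model, not Datastore's actual implementation: the `ZigZagJoin` class, the integer keys, and the `seeks` counter are all assumptions made for illustration, and the counter says nothing about real query latency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;

public class ZigZagJoin {
    /** Number of seeks performed by the most recent join call (illustration only). */
    public static long seeks = 0;

    /** Returns the index of the first element >= target in sorted[from..], or sorted.length. */
    static int seek(int[] sorted, int from, int target) {
        seeks++;
        int pos = Arrays.binarySearch(sorted, from, sorted.length, target);
        return pos >= 0 ? pos : -pos - 1;
    }

    /** Intersects sorted, duplicate-free key lists (one per equality filter). */
    public static List<Integer> join(int[]... postings) {
        seeks = 0;
        List<Integer> result = new ArrayList<>();
        if (postings.length == 0) {
            return result;
        }
        int[] pos = new int[postings.length];
        int candidate = Integer.MIN_VALUE;
        int agreed = 0; // consecutive scans currently positioned on the candidate
        int i = 0;
        while (true) {
            pos[i] = seek(postings[i], pos[i], candidate);
            if (pos[i] == postings[i].length) {
                return result; // one scan is exhausted, so no further matches exist
            }
            int key = postings[i][pos[i]];
            if (key == candidate) {
                agreed++;
            } else {
                candidate = key; // this scan jumped ahead: its key is the new candidate
                agreed = 1;
            }
            if (agreed == postings.length) {
                result.add(candidate); // every scan sits on the same key
                if (candidate == Integer.MAX_VALUE) {
                    return result;
                }
                candidate++;
                agreed = 0;
            }
            i = (i + 1) % postings.length;
        }
    }

    /** Helper: the sorted keys from (inclusive) to to (exclusive). */
    public static int[] range(int from, int to) {
        return IntStream.range(from, to).toArray();
    }

    public static void main(String[] args) {
        int[] idIndex = {0};                     // keys matching id = '000000'
        int[] firstDigIndex = range(0, 100_000); // keys matching first_dig = '0'
        int[] prop1Index = range(0, 1_000_000);  // keys matching prop1 = 'all'
        System.out.println(join(idIndex, firstDigIndex, prop1Index)); // prints [0]
        System.out.println("seeks: " + seeks);
    }
}
```

Each extra AND clause adds another list that must be consulted for every candidate key, and lists with many entries offer many more candidates to step through when they are not cut short by a highly selective list. A composite index covering the filtered properties avoids the join altogether, because the matching entries sit in one contiguous index range.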

See also "Why is my Cloud Firestore query slow?".

As for hotspotting, that shows up as slower writes, not slower queries.

10-31 05:30