本文介绍了如何在postgresql 8.3.7中的分区表上使用索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有这种情况,运行一个按分区表中的索引列过滤的查询,执行全表扫描。

I have situation, where running a query that filters by an indexed column in a partitioned table, performs a full table scan.

显然,这是一个已知问题在postgresql中,详细解释了。

Apparently , this is a known issue in postgresql, and it's explained in detail here.

除了对每个分区执行查询,然后对所有结果执行UNION之外,还有更优雅的方法吗?

Is there a more elegant way around this other than performing a query on each partition, and then performing a UNION on all of the results?

推荐答案

索引工作得很好,只能对PostgreSQL中的相关分区进行扫描。但是,你必须正确设置一切才能正常工作,很容易错过

Indexes work just fine to do a scan only of the relevant partitions in PostgreSQL. But, you have to set everything up properly for it to work, and it's easy to miss a step in the long list of things documented at http://www.postgresql.org/docs/current/static/ddl-partitioning.html

要实现的主要事情是,为了避免顺序扫描,您必须向PostgreSQL提供足够的信息,以便它可以证明某些分区不能拥有您正在寻找的数据;然后它们被跳过作为查询结果的潜在来源。您链接的文章指出这是seq扫描问题的解决方案:如果您将范围约束添加到每个分区的日期字段,此查询可以优化为循环,您首先查询最新分区并工作向后,直到找到一个高于所有剩余分区范围的值。 - 但不会显示您在更改后看到的改进计划。

The main thing to realize is that in order to avoid a sequential scan, you have to provide enough information to PostgreSQL so it can prove some partitions cannot have the data you're looking for; then they are skipped as potential sources for the query results. The article you link to points this out as a solution to the seq scan problem: "If you add range constraints to the date field of each partition, this query can be optimized into a loop where you query the "latest" partition first and work backwards until you find a single value that is higher than the range of all the remaining partitions."--but doesn't show the improved plan you'd see after that change.

您可能犯过的一些常见错误:

Some common mistakes you might have made:

- 默认情况下,postgresql.conf文件中的constraint_exclusion参数处于关闭状态。使用该默认设置,您将得不到预期的结果。

-The constraint_exclusion parameter in the postgresql.conf file is off by default. With that default, you won't get what you expect.

- 不能使用CHECK创建非重叠分区,这使得规划人员无法知道每个内部的内容他们。可能会错过这一步,但仍然可以将数据正确地分配到正确的分区中,计划员就不会知道这一点。

-Didn't create non-overlapping partitions using CHECK, which keeps the planner from knowing what's inside each of them. It's possible to miss this step but still get your data into the right partitions properly, the planner just won't know that.

- 不要在每个分区上放置索引,只在主表上创建一个。这将使您在相关​​分区上进行顺序扫描,因此不会像上面那样糟糕但也不好。

-Did not put an index on each partition, only created one on the master table. This will give you a sequential scan just on the relevant partition, so not as bad as the above but not good either.

有一些工作可以让这一切变得更加容易PostgreSQL版本(在8.4中设置constraint_partition是相当自动的,并且正在进行某种分区设置自动化)。现在,如果您仔细按照说明操作并避免所有这些问题,它应该可以正常工作。

There's some work to make this all easier in upcoming PostgreSQL releases (setting constraint_partition is fairly automatic in 8.4 and some sort of partition setup automation is being worked in). Right now, if you follow the instructions carefully and avoid all these problems, it should work.

这篇关于如何在postgresql 8.3.7中的分区表上使用索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-23 07:01