本文介绍了当where语句中的日期更改时,MySQL EXPLAIN的“类型"从“范围"更改为“参考"?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在测试各种想法,以优化我们工作中的系统中的某些表.今天,我遇到了一张桌子,该桌子跟踪我们系统中每辆车的每个视图.在下面创建表格.

I've been testing out different ideas for optimizing some of the tables we have in our system at work. Today I came across a table that tracks every view on each vehicle in our system. Create table below.

SHOW CREATE TABLE vehicle_view_tracking;

CREATE TABLE `vehicle_view_tracking` (
  `vehicle_view_tracking_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `public_key` varchar(45) NOT NULL,
  `vehicle_id` int(10) unsigned NOT NULL,
  `landing_url` longtext NOT NULL,
  `landing_port` int(11) NOT NULL,
  `http_referrer` longtext,
  `created_on` datetime NOT NULL,
  `created_on_date` date NOT NULL,
  `server_host` longtext,
  `server_uri` longtext,
  `referrer_host` longtext,
  `referrer_uri` longtext,
  PRIMARY KEY (`vehicle_view_tracking_id`),
  KEY `vehicleViewTrackingKeyCreatedIndex` (`public_key`,`created_on_date`),
  KEY `vehicleViewTrackingKeyIndex` (`public_key`)
) ENGINE=InnoDB AUTO_INCREMENT=363439 DEFAULT CHARSET=latin1;

我正在研究多列和单列索引.我运行了以下查询:

I was playing around with multi-column and single column indexes. I ran the following query:

EXPLAIN EXTENDED SELECT dealership_vehicles.vehicle_make, dealership_vehicles.vehicle_model, vehicle_view_tracking.referrer_host, count(*) AS count
FROM vehicle_view_tracking
LEFT JOIN dealership_vehicles
ON dealership_vehicles.dealership_vehicle_id = vehicle_view_tracking.vehicle_id
WHERE vehicle_view_tracking.created_on_date >= '2011-09-07' AND vehicle_view_tracking.public_key IN ('ab12c3')
GROUP BY (dealership_vehicles.vehicle_make) ASC , dealership_vehicles.vehicle_model, referrer_host

+----+-------------+-----------------------+--------+----------------------------------------------------------------+------------------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table                 | type   | possible_keys                                                  | key                                | key_len | ref                                          | rows  | filtered | Extra                                        |
+----+-------------+-----------------------+--------+----------------------------------------------------------------+------------------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+
|  1 | SIMPLE      | vehicle_view_tracking | range  | vehicleViewTrackingKeyCreatedIndex,vehicleViewTrackingKeyIndex | vehicleViewTrackingKeyCreatedIndex | 50      | NULL                                         | 23086 |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | dealership_vehicles   | eq_ref | PRIMARY                                                        | PRIMARY                            | 8       | vehicle_view_tracking.vehicle_id |     1 |   100.00 |                                              |
+----+-------------+-----------------------+--------+----------------------------------------------------------------+------------------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+

(实际选择查询的执行时间为.309秒)

(Execution time for actual select query was .309 seconds)

然后我将where子句中的日期从'2011-09-07'更改为'2011-07-07',并获得以下解释结果

then I change the date in the where clause from '2011-09-07' to '2011-07-07' and got the following explain results

EXPLAIN EXTENDED SELECT dealership_vehicles.vehicle_make, dealership_vehicles.vehicle_model, vehicle_view_tracking.referrer_host, count(*) AS count
FROM vehicle_view_tracking
LEFT JOIN dealership_vehicles
ON dealership_vehicles.dealership_vehicle_id = vehicle_view_tracking.vehicle_id
WHERE vehicle_view_tracking.created_on_date >= '2011-07-07' AND vehicle_view_tracking.public_key IN ('ab12c3')
GROUP BY (dealership_vehicles.vehicle_make) ASC , dealership_vehicles.vehicle_model, referrer_host


+----+-------------+-----------------------+--------+----------------------------------------------------------------+-----------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+
| id | select_type | table                 | type   | possible_keys                                                  | key                         | key_len | ref                                          | rows  | filtered | Extra                                        |
+----+-------------+-----------------------+--------+----------------------------------------------------------------+-----------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+
|  1 | SIMPLE      | vehicle_view_tracking | ref    | vehicleViewTrackingKeyCreatedIndex,vehicleViewTrackingKeyIndex | vehicleViewTrackingKeyIndex | 47      | const                                        | 53676 |   100.00 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | dealership_vehicles   | eq_ref | PRIMARY                                                        | PRIMARY                     | 8       | vehicle_view_tracking.vehicle_id |     1 |   100.00 |                                              |
+----+-------------+-----------------------+--------+----------------------------------------------------------------+-----------------------------+---------+----------------------------------------------+-------+----------+----------------------------------------------+

(实际选择查询的执行时间为.670秒)

(Execution time for actual select query was .670 seconds)

我看到4个主要变化:

  1. 类型从范围更改为参考
  2. 键从vehicleViewTrackingKeyCreatedIndex更改为vehicleViewTrackingKeyIndex
  3. key_len从50更改为47(由密钥更改引起)
  4. 行数从23086更改为53676(由于密钥更改而导致)

这时,慢速查询的执行时间仅为0.6秒,但是数据库中只有大约10%的车辆.

At this point, the execution time is only .6 seconds for the slow query however we only have about 10% of our vehicles in our database.

天色已经晚了,我可能已经忽略了mysql文档中的某些内容,但是我似乎无法找到为什么在where子句中更改日期时键(以及类型和行)也会改变的原因.

It's getting late and I may have overlooked something in the mysql docs but I can't seem to find why the key (and in turn the type and rows) are changing when the date is changed in the where clause.

我们非常感谢您的帮助.我搜索了一个具有相同/相似问题的人,并且其日期导致了此更改,但找不到任何东西.如果我错过了上一篇文章,请给我链接:-)

The help is greatly appreciated. I searched for someone having the same/similar issue with a date causing this change and was not able to find anything. If I missed a previous post, please link me :-)

推荐答案

不同的搜索策略对不同的数据有意义.特别是,索引扫描(例如范围)通常必须进行一次搜索才能实际读取该行.在某些时候,进行所有这些搜索比根本不使用索引要慢.

Different search strategies make sense for different data. In particular, index scans (such as range) often have to do a seek to actually read the row. At some point, doing all those seeks is slower than not using the index at all.

以一个简单的示例为例,该表具有三列:id(主键),名称(索引),生日.说它有很多数据.如果您要求MySQL查找Bob的生日,它可以很快完成:首先,它在名称索引中找到Bob(这需要几次查找,log(n),其中n是行数),然后再进行一次查找读取数据文件中的实际行,并从中读取生日.这非常快,而且比扫描整个表格要快得多.

Take a trivial example, a table with three columns: id (primary key), name (indexed), birthday. Say it has a lot of data. If you ask MySQL to look for Bob's birthday, it can do that fairly quickly: first, it finds Bob in the name index (this takes a few seeks, log(n) where n is the row count), then one additional seek to read the actual row in the data file and read the birthday from it. That's very quick, and far quicker than scanning the entire table.

接下来,考虑做一个name like 'Z%'.那可能只是表格的一小部分.因此,找到Z在名称索引中的起始位置还是更快,然后对于每个Z来说,都寻求数据文件以读取该行. (这是一个范围扫描).

Next, consider doing a name like 'Z%'. That is probably a fairly small portion of the table. So its still faster to find where the Zs start in the name index, then for each one seek the data file to read the row. (This is a range scan).

最后,考虑询问以M-Z开头的所有名称.那大概是数据的一半.它可以先进行范围扫描,然后进行一次 lot 次搜索,但是以最终目标是读取一半行的方式在数据文件中随机搜索并不是最佳选择:只需执行一次大顺序读取数据文件.因此,在这种情况下,索引将被忽略.

Finally, consider asking for all names starting with M-Z. That's probably around half the data. It could do a range scan, and then a lot of seeks, but seeking randomly over the datafile with the ultimate goal of reading half the rows isn't optimal: it'd be faster to just do a big sequential read over the data file. So, in this case, the index will be ignored.

这就是您所看到的-除了您的情况以外,还有另一个键可以依靠. (如果没有日期索引,也可能实际上使用了日期索引,它应该选择最快的那个索引.请注意,MySQL的优化器经常在此出错.)

This is what you're seeing—except in your case, there is another key it can fall back on. (Its also possible that it might actually use the date index if it didn't have the other, it should pick whichever index will be quickest. Beware that MySQL's optimizer often makes errors in this.)

因此,简而言之,这是可以预期的.查询不会说如何检索数据,而是说检索什么数据.数据库的优化器应该找到最快的方法来检索它.

So, in short, this is expected. A query doesn't say how to retrieve the data, rather it says what data to retrieve. The database's optimizer is supposed to find the quickest way to retrieve it.

在两种情况下,您都可以在两列上找到索引,在这两种情况下,按优先顺序(public_key,created_on_date),可以加快查询速度.这是因为MySQL只能对每个表(每个查询)使用一个索引.另外,日期是在结尾处,因为只能在索引的最后一列上有效地进行范围扫描.

You may find an index on both columns, in the order (public_key,created_on_date) is preferred in both cases, and speeds up your query. This is because MySQL can only ever use one index per table (per query). Also, the date goes at the end because a range scan can only be done efficiently on the last column in an index.

[我相信,InnoDB实际上还有另一层间接性,但这只是混淆了这一点.这与说明没有区别.]

[InnoDB actually has another layer of indirection, I believe, but it'd just confuse the point. It doesn't make a difference to the explanation.]

这篇关于当where语句中的日期更改时,MySQL EXPLAIN的“类型"从“范围"更改为“参考"?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

11-02 14:04