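Description
According to the EXPLAIN command, a range condition is causing the query to perform a full table scan (160k rows). How can I keep the range condition and reduce the scanning? I believe the culprit is: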

Y.YEAR BETWEEN 1900 AND 2009 AND

Code
Here is the code that has the range condition (the STATION_DISTRICT table is likely superfluous).
SELECT
  COUNT(1) as MEASUREMENTS,
  AVG(D.AMOUNT) as AMOUNT,
  Y.YEAR as YEAR,
  MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
FROM
  CITY C,
  STATION S,
  STATION_DISTRICT SD,
  YEAR_REF Y FORCE INDEX(YEAR_IDX),
  MONTH_REF M,
  DAILY D
WHERE
  -- For a specific city ...
  --
  C.ID = 10663 AND

  -- Find all the stations within a specific unit radius ...
  --
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
     POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND

  -- Get the station district identification for the matching station.
  --
  S.STATION_DISTRICT_ID = SD.ID AND

  -- Gather all known years for that station ...
  --
  Y.STATION_DISTRICT_ID = SD.ID AND

  -- The data before 1900 is shaky; insufficient after 2009.
  --
  Y.YEAR BETWEEN 1900 AND 2009 AND

  -- Filtered by all known months ...
  --
  M.YEAR_REF_ID = Y.ID AND

  -- Whittled down by category ...
  --
  M.CATEGORY_ID = '003' AND

  -- Into the valid daily climate data.
  --
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR

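Update
The SQL is performing a full table scan, which results in MySQL doing a "Copying to tmp table" step. The EXPLAIN output is shown here: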
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
| id | select_type | table | type   | possible_keys                     | key          | key_len | ref                           | rows   | Extra       |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
|  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY      | 4       | const                         |      1 |             |
|  1 | SIMPLE      | Y     | range  | YEAR_IDX                          | YEAR_IDX     | 4       | NULL                          | 160422 | Using where |
|  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.Y.STATION_DISTRICT_ID |      1 | Using index |
|  1 | SIMPLE      | S     | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.SD.ID                 |      1 | Using where |
|  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8       | climate.Y.ID                  |     54 | Using where |
|  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX        | 8       | climate.M.ID                  |     11 | Using where |
+----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
Answer
After using STRAIGHT_JOIN:
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
| id | select_type | table | type   | possible_keys                     | key           | key_len | ref                           | rows | Extra                           |
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
|  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY       | 4       | const                         |    1 | Using temporary; Using filesort |
|  1 | SIMPLE      | S     | ALL    | PRIMARY                           | NULL          | NULL    | NULL                          | 7795 | Using where                     |
|  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY       | 4       | climate.S.STATION_DISTRICT_ID |    1 | Using index                     |
|  1 | SIMPLE      | Y     | ref    | PRIMARY,STAT_YEAR_IDX             | STAT_YEAR_IDX | 4       | climate.S.STATION_DISTRICT_ID | 1650 | Using where                     |
|  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX  | 8       | climate.Y.ID                  |   54 | Using where                     |
|  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX         | 8       | climate.M.ID                  |   11 | Using where                     |
+----+-------------+-------+--------+-----------------------------------+---------------+---------+-------------------------------+------+---------------------------------+
Related
http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html
http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html
Optimize SQL that uses between clause
Thank you!

Best answer

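One request... it looks like you know your data. Add the keyword STRAIGHT_JOIN and see the results:
SELECT STRAIGHT_JOIN ... rest of query ...
STRAIGHT_JOIN tells MySQL to join the tables in the order you have listed them. Your CITY table is first in the FROM list, which indicates you expect it to be the primary table. In addition, the WHERE clause on CITY is the immediate filter. Given that, MySQL will probably fly through the rest of the query.
Hope it helps. It has for me with government data, where millions of records were being queried and joined to 10+ lookup tables and MySQL was trying to think for me.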
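
To make the suggestion concrete, here is a minimal sketch of the question's query with the STRAIGHT_JOIN modifier added. It reuses the tables, columns, and filters from the question unchanged. One assumption not stated anywhere above: the FORCE INDEX(YEAR_IDX) hint is dropped, because the plan shown after STRAIGHT_JOIN reads YEAR_REF through STAT_YEAR_IDX rather than YEAR_IDX. Treat it as an illustration, not as the exact statement the asker ran.

```sql
-- Sketch only: same query as in the question, with STRAIGHT_JOIN added.
-- STRAIGHT_JOIN makes MySQL join the tables in the order listed in FROM.
SELECT STRAIGHT_JOIN
  COUNT(1) AS MEASUREMENTS,
  AVG(D.AMOUNT) AS AMOUNT,
  Y.YEAR AS YEAR,
  MAKEDATE(Y.YEAR, 1) AS AMOUNT_DATE
FROM
  CITY C,               -- read first: C.ID = 10663 matches one row
  STATION S,            -- then the stations, filtered by distance to the city
  STATION_DISTRICT SD,
  YEAR_REF Y,           -- assumption: FORCE INDEX(YEAR_IDX) removed here
  MONTH_REF M,
  DAILY D
WHERE
  C.ID = 10663 AND
  -- Stations within 50 units of the city.
  6371.009 *
  SQRT(
    POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
    (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
     POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND
  S.STATION_DISTRICT_ID = SD.ID AND
  Y.STATION_DISTRICT_ID = SD.ID AND
  -- The range condition is kept; it now filters rows already narrowed by station.
  Y.YEAR BETWEEN 1900 AND 2009 AND
  M.YEAR_REF_ID = Y.ID AND
  M.CATEGORY_ID = '003' AND
  M.ID = D.MONTH_REF_ID AND
  D.DAILY_FLAG_ID <> 'M'
GROUP BY
  Y.YEAR;
```

Prefixing the statement with EXPLAIN shows whether the join order now follows the FROM list (C, S, SD, Y, M, D) and whether YEAR_REF is still read with a range scan.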

Regarding "sql - Eliminate full table scan due to BETWEEN (and GROUP BY)", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2815669/
