问题描述
为了简单起见,假设我有一个包含 2 个字段的表 - PERSON_NAME 和 COMPANY_ID 与这些对:
To make it simple, let say I have a table with 2 fields - PERSON_NAME and COMPANY_ID with those pairs:
('a',1) ('b',1) ('c',2) ('d',2) ('e',3)
PERSON_NAME a 和 b 在 COMPANY_ID 1 等工作.
PERSON_NAME a and b are working in COMPANY_ID 1 and so on.
我想选择前 2 个(或 N 个)公司的所有行 -
I would like to select all rows for the first 2 (or N) COMPANIES -
('a',1) ('b',1) ('c',2) ('d',2)
我不知道每个 COMPANY_ID 有多少行.
I don't know how many rows there are for each COMPANY_ID.
GROUP BY 在这里不起作用,因为我需要每个 COMPANY_ID 中的所有行.
这个有效-
GROUP BY won't work here as I need all rows within each COMPANY_ID.
This one works -
SELECT * FROM T
WHERE COMPANY_ID in (SELECT DISTINCT COMPANY_ID
FROM T
ORDER BY 1 DESC
LIMIT N)
但由于性能是一个问题,这还不够(大约有 25k 家公司,每家公司大约有 5-15 人).N 通常为 500.
But since performance is an issue, it won't suffice (there are about 25k companies and about 5-15 persons in each company). N is usually 500.
推荐答案
改为以连接方式执行此操作:
Do this as a join instead:
SELECT T.*
FROM T join
(SELECT DISTINCT COMPANY_ID
FROM T
ORDER BY 1 DESC
LIMIT N
) TN
on TN.COMPANY_ID = T.COMPANY_ID;
显式连接通常比 where
子句中的 in
表现更好.
Explicit joins often perform better than in
in the where
clause.
这篇关于使用 mysql 选择前 N 个 *groups*的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!