本文介绍了即使整个数据都在缓冲池中,为什么 InnoDB 在全表扫描时如此缓慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

  • AMD 锐龙 9 5950X,128GB 3200MHz 双通道
  • 具有 3GB/s+ 读写速度的数据中心 NVMe SSD
  • MariaDB 10.6.3 x64
  • Windows Server 2019(Debian 上同样的问题)
  • 专用机器,没有其他任务在运行
[mysqld]
default-storage-engine=INNODB
log-output=NONE
general-log=0
general_log_file="mariadb.log"
slow-query-log=0
query_cache_type=OFF
query_cache_size=0
innodb_buffer_pool_size=64G

DDL

CREATE TABLE testinnodb
(
    a INTEGER NOT NULL, b INTEGER NOT NULL, c INTEGER NOT NULL,
    i FLOAT NOT NULL, j FLOAT NOT NULL, k FLOAT NOT NULL,
    x CHAR(20) NOT NULL, y CHAR(20) NOT NULL, z CHAR(20) NOT NULL
) ENGINE=InnoDB;

MyISAM 和 Memory 的架构相同.

Same schema for MyISAM and Memory.

表中填充了 10M 行随机数据,产生的数据大小:
InnoDB: 1.0 GB
MyISAM: 810 MB
内存: 867 MB

Tables are filled with 10M rows of random data, resulting data sizes:
InnoDB: 1.0 GB
MyISAM: 810 MB
Memory: 867 MB

SELECT * FROM testinnodb WHERE c=1;
SELECT * FROM testmyisam WHERE c=1;
SELECT * FROM testmemory WHERE c=1;

InnoDB: 2.4s !!!
MyISAM: 0.3 秒
内存: 0.2s

InnoDB: 2.4s !!!
MyISAM: 0.3s
Memory: 0.2s

查询运行多次,但性能保持不变.EXPLAIN 为所有三个查询(简单、使用 WHERE)提供相同的输出.

The queries are run multiple times but performance stays the same. EXPLAIN gives the same output for all three queries (SIMPLE, USING WHERE).

这显然不是 I/O 问题,考虑到 MyISAM 和 Memory 的硬件和性能对比.

This is clearly not an I/O issue, given the hardware and the performance of MyISAM and Memory in comparison.

64GB 的缓冲池也足以在内存中保存所有表.
数据必须在缓冲池中,因为禁用 innodb_buffer_pool_load_at_startup,第一次运行查询需要 4.2s,后续运行需要 2.4s.
innodb_buffer_pool_bytes_data 在第一次运行后会增长超过 1GB,所以看起来整个数据实际上都在缓冲池中.
innodb_buffer_pool_read_requests 在每次执行时确实增加了大约 10M.

64GB for the buffer pool is also more than enough to hold all of that table in memory.
The data must be in the buffer pool because disabling innodb_buffer_pool_load_at_startup, the query will take 4.2s on first run, and then 2.4s in subsequent runs.
innodb_buffer_pool_bytes_data will have grown by over 1GB after the first run, so it looks like the entire data is in fact in the buffer pool.
innodb_buffer_pool_read_requests does increase by about 10M on each execution.

我需要帮助了解发生了什么.这肯定不对吧?我尝试过使用数据库配置(例如 innodb_old_blocks_time=0innodb_read_io_threads=32innodb_write_io_threads=32)但这实际上并没有改变.

I need help understanding what is going on. Surely this can't be right? I've tried playing around with the DB config (e.g. innodb_old_blocks_time=0, innodb_read_io_threads=32 and innodb_write_io_threads=32) but this literally changed nothing.

我知道使用 INDEX 会有所改善,但这不是重点.

如果您需要一些用于调试的状态变量,请告诉我,我是 InnoDB 的新手,所以我不确定在这里发布什么相关.

Let me know if you need some status variables for debugging, I'm new to InnoDB so I'm not sure what was relevant to post here.

=====================================
Per second averages calculated from the last 50 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 0 srv_active, 0 srv_shutdown, 50 srv_idle
srv_master_thread log flush and writes: 50
----------
SEMAPHORES
----------
------------
TRANSACTIONS
------------
Trx id counter 20001586
Purge done for trx's n:o < 20001583 undo n:o < 0 state: running
History list length 14
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION (000002727BD64108), not started
0 lock struct(s), heap size 1128, 0 row lock(s)
--------
FILE I/O
--------
Pending flushes (fsync) log: 0; buffer pool: 0
71726 OS file reads, 2 OS file writes, 2 OS fsyncs
0.00 reads/s, 16413 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 0, seg size 2, 0 merges
merged operations:
 insert 0, delete mark 0, delete 0
discarded operations:
 insert 0, delete mark 0, delete 0
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 8881303100
Log flushed up to   8881303100
Pages flushed up to 8881303100
Last checkpoint at  8881303088
0 pending log flushes, 0 pending chkp writes
4 log i/o's done, 0.08 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 68753031168
Dictionary memory allocated 424846000
Buffer pool size   4147712
Free buffers       4075870
Database pages     71842
Old database pages 26539
Modified db pages  0
Percent of dirty pages(LRU & free pages): 0.000
Max dirty pages percent: 90.000
Pending reads 0
Pending writes: LRU 0, flush list 0
Pages made young 0, not young 0
0.00 youngs/s, 0.00 non-youngs/s
Pages read 71711, created 131, written 0
1434.19 reads/s, 2.62 creates/s, 0.00 writes/s
Buffer pool hit rate 996 / 1000, young-making rate 0 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 71842, unzip_LRU len: 0
I/O sum[0]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 read views open inside InnoDB
Process ID=0, Main thread ID=0, state: sleeping
Number of rows inserted 0, updated 0, deleted 0, read 20000000
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 399992.00 reads/s
Number of system rows inserted 0, updated 0, deleted 0, read 0
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================

推荐答案

InnoDB 和 MySQL 或 MariaDB 服务器的 SQL 解释器之间的接口有相当多的开销.

There is quite a bit of overhead in the interface between InnoDB and the SQL interpreter of the MySQL or MariaDB server.

在 InnoDB 中,每次访问都必须由缓冲池页锁存器保护.小型交易对象将跟踪获取的锁存器.基本上,对于每个获取的行,InnoDB 都会启动一个小事务,在缓冲池中查找 B-tree 叶页,获取页锁存器,复制数据,最后提交小事务并释放页锁存器.

In InnoDB, each access must be protected by a buffer pool page latch. A mini-transaction object will keep track of the acquired latches. Basically, for every fetched row, InnoDB will start a mini-transaction, look up the B-tree leaf page in the buffer pool, acquire the page latch, copy the data, and finally commit the mini-transaction and release the page latch.

在此之上还有一些优化,但这还不够,最好实现 MDEV-16232 允许小事务在整个范围扫描中持续存在.这样,我们只有在前进到下一页时才会获取和释放页面锁存器.

There are a couple of optimizations on top of this, but this is insufficient, and it would be better to implement MDEV-16232 to allow a mini-transaction to persist across the entire range scan. In that way, we would only acquire and release page latches when advancing to the next page.

在范围扫描中,永久光标 (btr_pcur_t) 将存储当前位置.当光标位置在下一个小事务开始时恢复(以获取下一条记录)时,将尝试 乐观恢复,假设指向缓冲池页面的旧指针是仍然有效.

In range scans, a persistent cursor (btr_pcur_t) will store the current position. When the cursor position is restored at the start of the next mini-transaction (to fetch the next record), an optimistic restore will be attempted, with the assumption that the old pointer to the buffer pool page is still valid.

InnoDB 还实现了一个预取缓冲区.在 4 次下一条记录读取操作之后,InnoDB 将在单个小型事务中一次将 8 条记录复制到缓冲区.随后将从此缓冲区满足后续请求.MDEV-16232 将使该机制变得多余,并应作为实施的一部分删除它.

InnoDB also implements a prefetch buffer. After 4 next-record read operations, InnoDB will copy 8 records at a time to the buffer, within a single mini-transaction. Subsequent requests will then be satisfied from this buffer. This mechanism would be made redundant by MDEV-16232 and should be removed as part of implementing it.

实施 MDEV-16232 还可以通过删除需要获取显式记录锁.如果我们在删除或更新一行的整个过程中持续持有页锁存器,那么只要不存在冲突,我们就可以依赖隐式锁定,就像我们在 INSERT 情况下所做的那样.

Implementing MDEV-16232 would also speed up UPDATE and DELETE operations, by removing the need to acquire explicit record locks. If we continuously hold the page latch for the whole duration of deleting or updating a row, we can rely on implicit locking whenever no conflicts exist, just like we do in the INSERT case.

这篇关于即使整个数据都在缓冲池中,为什么 InnoDB 在全表扫描时如此缓慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

06-19 04:56