本文介绍了在MySQL中使用NHibernate guid.comb主键时如何避免碎片化?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在将数据库从Microsoft SQL Server迁移到MySQL/MariaDB.在MSSQL上,数据库对所有主键都使用uniqueidentifier(GUID)数据类型. NHibernate用于在数据库和应用程序之间映射数据,并且guid.comb策略用于GUID生成,以避免聚集索引的碎片化.

I am in the progress of migrating a database from Microsoft SQL Server to MySQL/MariaDB. On MSSQL, the database uses the uniqueidentifier (GUID) datatype for all primary keys. NHibernate is used to map data between database and application, and the guid.comb strategy is employed for GUID generation to avoid fragmentation of the clustered indexes.

MySQL没有专用的GUID数据类型,新的数据库模式将BINARY(16)用于所有标识符.在不对NHibernate映射进行任何更改的情况下,我可以启动我们的应用程序,保留新实体并将其从MySQL数据库加载回去.伟大的!但是,事实证明,在BINARY(16)列中,按顺序生成的GUID非常不按顺序排列,从而产生了不可接受的索引碎片.

MySQL not having a dedicated GUID data type, the new database schema uses BINARY(16) for all identifiers. Without making any changes to the NHibernate mappings, I can start our application, persist new entities and load them back up from the MySQL database. Great! However, it turns out that the sequantially generated GUIDs are ordered very much non-sequantially in the BINARY(16) column, yielding unacceptable index fragmentation.

仔细阅读该问题,结果发现 MSSQL具有一种非常特殊的方法排序GUID .这16个字节的顺序是最后6个字节,然后按相反的顺序排列,而我朴素的MySQL实现首先在第一个字节上排序,然后在下一个字节上排序,以此类推.

Reading up on the issue, it turns out that MSSQL has a quite special method for sorting GUIDs. The 16 bytes are ordered first by the last six bytes, then by the precending in reversed groups, whereas my naïve MySQL implementation sorts on the first byte first, then the next and so forward.

这导致了我的问题:如何在保留现有GUID和guid.comb策略的同时,避免在MySQL数据库中出现这种碎片?我自己有一个解决方案的想法(在下面发布),但我不禁感到自己可能错过了一些东西.当然,其他人一定已经解决了这个问题,也许有一种简单的方法可以解决它.

And this leads to my question: How to avoid this fragmentation in the MySQL database, while keeping the existing GUIDs and the guid.comb strategy? I have an idea for a solution myself (posted below) but I cannot help feeling I might have missed something. Surely, others must have dealt with this issue before, and maybe there is a simple way to get around it.

推荐答案

艾伯特·法拉利(Alberto Ferrari)和上讨论的算法为什么,Microsoft SQL Server通过按特定顺序比较字节来对GUID进行排序.由于MySQL将对BINARY(16)进行直接排序",因此我们要做的就是在读写数据库时对字节进行重新排序.

As observed by Alberto Ferrari and discussed here on StackOverflow, Microsoft SQL Server sorts GUIDs by comparing the bytes in a specific order. As MySQL will sort a BINARY(16) "straight-foward", all we need to do, is to reorder the bytes when reading/writing to the database.

NHibernate允许我们定义自定义数据类型,这些数据类型可用于数据库和对象之间的映射.我已经实现了BinaryGuidType,能够对 Guid.ToByteArray() 根据MSSQL排序GUID并将它们重新排序为 Guid(byte[]) 构造函数.

NHibernate allows us to define custom data types, which can be used in mappings between database and objects. I have implemented a BinaryGuidType, capable of reordering the bytes produced by Guid.ToByteArray() according to the way MSSQL sorts GUIDs and reordering them back into the format accepted by the Guid(byte[]) constructor.

字节顺序如下:

int[] ByteOrder = new[] { 10,11,12,13,14,15,8,9,6,7,4,5,0,1,2,3 };

System.Guid保存到BINARY(16)的过程如下:

var bytes = ((Guid) value).ToByteArray();
var reorderedBytes = new byte[16];

for (var i = 0; i < 16; i++)
{
    reorderedBytes[i] = bytes[ByteOrder[i]];
}

NHibernateUtil.Binary.NullSafeSet(cmd, reorderedBytes, index);

将字节读回到System.Guid就像这样:

Reading the bytes back into a System.Guid goes like this:

var bytes = (byte[]) NHibernateUtil.Binary.NullSafeGet(rs, names[0]);
if (bytes == null || bytes.Length == 0) return null;

var reorderedBytes = new byte[16];

for (var i = 0 ; i < 16; i++)
{
    reorderedBytes[ByteOrder[i]] = bytes[i];
}

此处BinaryGuidType的完整源代码.

Full source code for the BinaryGuidType here.

这似乎运作良好.在一个表中创建并持久存储10.000个新对象,它们完全按顺序存储,没有索引碎片的迹象.

This seems to work well. Creating and persisting 10.000 new objects in a table, they are stored completely sequentially, with no signs of index fragmentation.

这篇关于在MySQL中使用NHibernate guid.comb主键时如何避免碎片化?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-22 12:22