Full-text search with InnoDB

Question

I'm developing a high-volume web application, where part of it is a MySQL database of discussion posts that will need to grow to 20M+ rows, smoothly.

I was originally planning on using MyISAM for the tables (for the built-in fulltext search capabilities), but the thought of the entire table being locked due to a single write operation makes me shudder. Row-level locks make so much more sense (not to mention InnoDB's other speed advantages when dealing with huge tables). So, for this reason, I'm pretty determined to use InnoDB.

The problem is... InnoDB doesn't have built-in fulltext search capabilities.

Should I go with a third-party search system, like Lucene (C++) / Sphinx? Do any of you database ninjas have any suggestions or guidance? ... having been built around realtime capabilities (which is pretty critical for my application). I'm a little hesitant to commit yet without some insight...

(FYI: going to be on EC2 with high-memory rigs, using PHP to serve the frontend)

Recommended answer

I can vouch for MyISAM fulltext being a bad option - even leaving aside the various problems with MyISAM tables in general, I've seen the fulltext stuff go off the rails and start corrupting itself and crashing MySQL regularly.

A dedicated search engine is definitely going to be the most flexible option here - store the post data in MySQL/InnoDB, and then export the text to your search engine. You can set up a periodic full index build/publish pretty easily, and add real-time index updates if you feel the need and want to spend the time.
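
To make the build/publish plus real-time update idea concrete, here is a rough sketch in Python. It is only an illustration: a tiny in-memory inverted index stands in for a real engine such as Sphinx, Lucene or Xapian, and the names `PostSearch`, `publish` and `add_post` are made up for this example rather than taken from any of those libraries. The `(post_id, body)` rows are assumed to come from a query against the InnoDB posts table.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase alphanumeric runs only.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase terms."""
    return TOKEN_RE.findall(text.lower())


def build_index(posts):
    """Full rebuild from (post_id, body) rows exported from the database."""
    index = defaultdict(set)
    for post_id, body in posts:
        for term in tokenize(body):
            index[term].add(post_id)
    return index


class PostSearch:
    """Toy stand-in for a dedicated search engine (illustrative only)."""

    def __init__(self):
        self._index = defaultdict(set)

    def publish(self, posts):
        # Periodic full build: construct the new index off to the side,
        # then swap it in with a single assignment.
        self._index = build_index(posts)

    def add_post(self, post_id, body):
        # Optional real-time update: index one new post between rebuilds.
        for term in tokenize(body):
            self._index[term].add(post_id)

    def search(self, query):
        """Return sorted ids of posts containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return []
        result = None
        for term in terms:
            ids = self._index.get(term, set())
            result = set(ids) if result is None else result & ids
        return sorted(result)


# Rows as they might come back from "SELECT id, body FROM posts" (assumed schema).
posts = [(1, "InnoDB row-level locking"), (2, "MyISAM table locking")]
search = PostSearch()
search.publish(posts)
search.add_post(3, "Sphinx realtime index and InnoDB")
print(search.search("innodb"))   # [1, 3]
print(search.search("locking"))  # [1, 2]
```

The point of the split is that MySQL stays the source of truth: if the index is ever lost or corrupted, another `publish` from the table rebuilds it, and the real-time path only has to cover posts written since the last rebuild.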

Lucene and Sphinx are good options, as is Xapian, which is nice and lightweight. If you go the Lucene route, don't assume that CLucene will be better, even if you'd prefer not to wrestle with Java, although I'm not really qualified to discuss the pros and cons of either.

09-19 06:20