This article covers how to handle a full-text search database with 200M+ records.

Problem description

I am about to create a huge database with at least 200 million entries. The database needs to be searchable using full text and should be fast.

My database gets data from many different data sources, and I need to import new or updated data regularly.

Is it a good idea to store all my data in a relational database like MySQL and then create a NoSQL document database (e.g. MongoDB or ElasticSearch) just for the purpose of searching, or does that not provide any benefit in terms of reliability and the prevention of redundant information?

Solution

I believe that keeping primary records in a SQL database and duplicating them to a NoSQL database is a very common approach.

ElasticSearch has an ongoing status page about their resiliency. Even in the newest version, ElasticSearch can lose data in a number of different situations. A major change in the structure of an ElasticSearch index (such as adding analyzers) requires that you re-index all of the documents. This process is safer if you have another source for the documents. At the end of the day, ElasticSearch isn't designed to consistently store documents - I would only ever choose to use ElasticSearch as the primary store in situations where occasional data loss isn't a disaster.
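To illustrate why a second source makes re-indexing safer, here is a minimal sketch of a full rebuild driven from the SQL side. It is not taken from the answer: the `documents` table, its columns, the index names `docs_v1`/`docs_v2` and the alias `docs` are all hypothetical, and `sqlite3` stands in for MySQL so the example runs on its own. It reads rows in primary-key order with keyset pagination (cheaper than `OFFSET` on a table with hundreds of millions of rows) and builds the request body for ElasticSearch's `_aliases` endpoint, which is one common way to switch searches to a freshly built index.

```python
import sqlite3
from typing import Iterator


def iter_batches(conn: sqlite3.Connection, batch_size: int) -> Iterator[list]:
    """Yield every row of the (hypothetical) documents table in id order.

    Uses keyset pagination: each query resumes after the last id seen.
    Assumes ids are positive integers.
    """
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id, title, body FROM documents "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that repoints an alias in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Demo with an in-memory SQLite database standing in for the primary store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO documents (id, title, body) VALUES (?, ?, ?)",
    [(i, f"title {i}", f"body {i}") for i in range(1, 8)],
)

# Each batch would be sent to the new index (docs_v2) before the alias is swapped.
batches = list(iter_batches(conn, batch_size=3))
swap = alias_swap_actions("docs", "docs_v1", "docs_v2")
```

In this arrangement the old index keeps serving searches until the new one is fully populated, and a failed rebuild can simply be discarded because the SQL rows are untouched.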

Unlike ElasticSearch, MongoDB is designed to be resilient. You should be able to safely store documents in MongoDB. I've found trying to do full text searches in MongoDB can be a little painful, at least compared to ElasticSearch. In my opinion, for text search, the only advantage MongoDB has over MySQL's FULLTEXT is that it is distributed.

We are running ElasticSearch and MySQL right now - and the benefits greatly outweigh the hassles of extra infrastructure and dealing with replication between the two. We had previously attempted to use a NoSQL solution as the primary datastore, with disastrous results. Running ES in conjunction with MySQL gets you the best of both worlds - consistency & safety of data in SQL, with the scalable, effective full text search in ES.
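The answer does not say how the replication between the two stores is done. One simple possibility, sketched here under assumptions, is a periodic job that picks up rows changed since the last run and turns them into a body for ElasticSearch's `_bulk` endpoint (newline-delimited JSON, one action line followed by one document line). The `documents` table, its `updated_at` column, and the index name `docs` are hypothetical, and `sqlite3` again stands in for MySQL so the sketch is runnable.

```python
import json
import sqlite3


def changed_rows(conn: sqlite3.Connection, since: int) -> list:
    """Return rows modified after the given watermark (hypothetical updated_at column)."""
    return conn.execute(
        "SELECT id, title, body, updated_at FROM documents "
        "WHERE updated_at > ? ORDER BY updated_at, id",
        (since,),
    ).fetchall()


def to_bulk_body(rows: list, index: str) -> str:
    """Serialize rows as NDJSON for POST /_bulk: an action line, then the document."""
    lines = []
    for doc_id, title, body, _updated_at in rows:
        # Reusing the SQL primary key as _id makes re-sending a row an overwrite.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps({"title": title, "body": body}))
    # The bulk format requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


# Demo with an in-memory SQLite database standing in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents "
    "(id INTEGER PRIMARY KEY, title TEXT, body TEXT, updated_at INTEGER)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [(1, "a", "x", 100), (2, "b", "y", 200), (3, "c", "z", 300)],
)

rows = changed_rows(conn, since=150)
bulk_body = to_bulk_body(rows, index="docs")
# The next run would start from the newest timestamp seen in this one.
watermark = rows[-1][3] if rows else 150
```

A timestamp watermark like this does not see deleted rows and can miss rows that share the watermark's exact timestamp, so a real setup would likely need soft deletes or a change log in addition.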


10-19 22:47