Reliability of Elasticsearch as a primary data store: write loss, data availability, and other factors

Question

    I am working on a project that requires a generic dashboard where users can group, filter, and drill down on different fields. For this we are looking for a search store that supports slicing and dicing the data.
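As a rough illustration of the grouping, filtering, and drill-down described above, here is a minimal sketch of how such a request could be expressed in the Elasticsearch query DSL: a `bool` filter narrows the documents, and nested `terms` aggregations provide one drill-down level per grouping field. The function name, the field names (`region`, `product`, `status`, `timestamp`), and the default time window are assumptions made for the example, not part of the question.

```python
import json


def build_dashboard_query(group_fields, filters, since="now-6w", time_field="timestamp"):
    """Build a search body: filter the documents, then group them level by level."""
    # Filter context: exact-match term filters plus a lower bound on the time field.
    filter_clauses = [{"term": {field: value}} for field, value in filters.items()]
    filter_clauses.append({"range": {time_field: {"gte": since}}})

    # Nest one terms aggregation per grouping field, building from the innermost
    # level outwards so that each level drills down into the previous one.
    aggs = None
    for field in reversed(group_fields):
        level = {"terms": {"field": field}}
        if aggs is not None:
            level["aggs"] = aggs
        aggs = {f"by_{field}": level}

    # size 0: return only aggregation buckets, no individual hits.
    body = {"size": 0, "query": {"bool": {"filter": filter_clauses}}}
    if aggs is not None:
        body["aggs"] = aggs
    return body


if __name__ == "__main__":
    query = build_dashboard_query(["region", "product"], {"status": "active"})
    print(json.dumps(query, indent=2))
```

The resulting dictionary is the kind of body that would be sent to an index's `_search` endpoint. Terms aggregations generally need keyword-typed (not analyzed text) fields, so the mapping would have to be planned with the dashboard's grouping fields in mind.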

    There will be multiple data sources, all feeding into the search store. Some pre-computation may be required on the source data, which an intermediate component can handle.

    I have read several blogs to understand whether ES can also be used reliably as a primary data store. It mostly depends on the use case. Some information about ours:

    • Around 300 million records each year, 1-2 KB each.
    • Assuming 1 year of retention, we are at about 300 GB today, but this could reach 400-500 GB as the data grows.
    • We are not yet sure how we will push data, but roughly, it could reach ~2-3 million records per 5 minutes.
    • Search request volume is low, but the queries are complex and cover the last 6 weeks to 6 months of data.
    • Almost all fields in each document will be indexed.
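The figures in the list above can be sanity-checked with a little arithmetic. The sketch below uses only the numbers from the question and assumes decimal units (1 GB = 1,000,000 KB); the function names are made up for the example. It covers raw source size only, so replicas and the overhead of indexing almost every field would come on top.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600
RECORDS_PER_YEAR = 300_000_000
RECORD_SIZE_KB = (1, 2)                            # per-record size range
PEAK_RECORDS_PER_5_MIN = (2_000_000, 3_000_000)    # peak push rate range


def raw_size_gb(records, kb_per_record):
    """Raw source volume in GB, assuming 1 GB = 1,000,000 KB."""
    return records * kb_per_record / 1_000_000


def docs_per_second(records_per_5_min):
    """Indexing rate needed to absorb one 5-minute batch in 5 minutes."""
    return records_per_5_min / 300


def average_docs_per_second(records_per_year):
    """Indexing rate if the yearly volume arrived perfectly evenly."""
    return records_per_year / SECONDS_PER_YEAR


if __name__ == "__main__":
    low, high = (raw_size_gb(RECORDS_PER_YEAR, kb) for kb in RECORD_SIZE_KB)
    print(f"raw data per year: {low:.0f}-{high:.0f} GB")
    peak_low, peak_high = (docs_per_second(n) for n in PEAK_RECORDS_PER_5_MIN)
    print(f"peak indexing rate: {peak_low:.0f}-{peak_high:.0f} docs/s")
    print(f"average indexing rate: {average_docs_per_second(RECORDS_PER_YEAR):.1f} docs/s")
```

One year of records works out to roughly 300-600 GB of raw data, which agrees with the 300 GB quoted at the low end. The peak rate of several thousand documents per second is far above the yearly average of about ten per second, so the cluster would have to be sized for bursts rather than for a steady load.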

    Some blogs say that it is reliable enough to use as a primary data store -

    And some blogs say that ES has a few limitations -

    Has anyone used Elasticsearch as the sole source of truth, without a primary store such as PostgreSQL, DynamoDB, or RDS? I have read that ES has issues such as split brain and index corruption that can lead to data loss. So I would like to know whether anyone has used ES this way and run into trouble with their data.

    Thanks.

    Answer

    Short answer: it depends on your use case, but you probably don't want to use it as a primary store.

    Longer answer: You should really understand all of the possible issues that can come up around resiliency and data loss. Elastic has some great documentation of these issues, which you should understand before using it as a primary data store. In addition, Aphyr's post on the topic is a good resource.

    If you understand the risks you are taking and you believe that those risks are acceptable (e.g. because small data loss is not a problem for your application) then you should feel free to go ahead and try it.
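If you do go ahead, one way to keep write loss visible is to inspect every bulk response instead of assuming the whole batch was accepted. The `_bulk` API reports a per-item `status` alongside a top-level `errors` flag, so rejected documents can be collected and retried or logged. The sketch below is a minimal illustration using only the standard library; the helper names and the shape of the input documents (dicts with an `_id` key) are assumptions made for the example.

```python
import json


def to_bulk_body(index, docs):
    """Serialize docs (dicts carrying an "_id" key) as newline-delimited _bulk JSON."""
    lines = []
    for doc in docs:
        source = {key: value for key, value in doc.items() if key != "_id"}
        # Each document is an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["_id"]}}))
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def failed_items(bulk_response):
    """Return (id, status) for every item whose status is not a 2xx success."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is keyed by its action name, e.g. {"index": {...}}.
        for result in item.values():
            if result.get("status", 0) >= 300:
                failures.append((result.get("_id"), result["status"]))
    return failures


if __name__ == "__main__":
    # Hand-written sample in the shape of a bulk response, not real cluster output.
    sample = {
        "errors": True,
        "items": [
            {"index": {"_id": "1", "status": 201}},
            {"index": {"_id": "2", "status": 429,
                       "error": {"type": "es_rejected_execution_exception"}}},
        ],
    }
    print(failed_items(sample))
```

This only catches writes the cluster rejected outright. It does not protect against acknowledged writes that are lost later, which is the class of problem the resiliency documentation discusses, so being able to replay data from its original source remains the safer fallback.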


08-04 10:45