
Problem description

We are revamping our existing system, which uses a MySQL database to handle the following types of data:


  • Transaction and order data
  • Customer information
  • Product information

We need to query this data and pull statistics from it, and also to filter, facet, and segment lists and KPIs.

We tried ClickHouse, Druid, and DGraph, running a few tests on sample data to benchmark them and to check which database fits our needs.

A few things I liked about Druid are:


  • Druid search queries, which list all matches along with their dimensions (column names) and the number of occurrences of each. Link: http://druid.io/docs/latest/querying/searchquery.html
  • utf8mb4 support
  • Full-text search
  • Case-insensitive search

We found ClickHouse to be faster than both MySQL and Druid, but we ran into the following problems:


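  • We cannot run Druid-like search queries (which return the dimensions and the occurrence counts). Is there a workaround?
  • Case-insensitive search. How should we handle it? ClickHouse is case-sensitive, right?
  • utf8mb4 support. How do we store special characters or the few emojis that utf8 does not support? We hit a similar problem in MySQL, and changing the collation to utf8mb4 resolved it. What can we do in ClickHouse?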

Your suggestions would help us overcome these challenges and make a better decision.

Thanks in advance.

Recommended answer

The Druid search feature sounds like it works roughly like this:

SELECT interval, dim1, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim1
UNION ALL
SELECT interval, dim2, COUNT(*) FROM my_table WHERE condition GROUP BY interval, dim2
UNION ALL
...
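
As a more concrete sketch of that pattern, the query below emulates a Druid-style search over two dimensions. The table `orders` and the String columns `product_name` and `customer_city` are hypothetical names chosen for illustration, and `'phone'` stands in for the search term. Each branch tags its rows with the dimension name, so the result has the shape (dimension, value, occurrences).

```sql
-- Hypothetical table and columns; replace them with your own schema.
-- One branch per dimension that should be searched.
SELECT
    'product_name' AS dimension,
    product_name   AS value,
    count()        AS occurrences
FROM orders
WHERE positionCaseInsensitiveUTF8(product_name, 'phone') > 0
GROUP BY product_name

UNION ALL

SELECT
    'customer_city' AS dimension,
    customer_city   AS value,
    count()         AS occurrences
FROM orders
WHERE positionCaseInsensitiveUTF8(customer_city, 'phone') > 0
GROUP BY customer_city
```

Every branch must return the same number of columns with compatible types, which is why both searched columns are assumed to be String here.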



For case-insensitive search there are multiple options, for example the positionCaseInsensitiveUTF8(haystack, needle) function, or match with regular expressions: https://clickhouse.yandex/docs/en/query_language/functions/string_search_functions/#match-haystack-pattern
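
A minimal sketch of both options, using string literals so that it does not depend on any table. The `(?i)` flag is standard re2 syntax for a case-insensitive pattern; the literals are only illustrative.

```sql
-- positionCaseInsensitiveUTF8 returns the 1-based position of the first
-- match, or 0 when the needle does not occur, so "> 0" means "found".
SELECT positionCaseInsensitiveUTF8('ClickHouse', 'HOUSE') > 0 AS found_by_position;

-- match() takes an re2 regular expression; the (?i) prefix makes the
-- whole pattern case-insensitive.
SELECT match('ClickHouse', '(?i)house') AS found_by_regex;
```

In a real query either expression would go into the WHERE clause, with a column in place of the first literal.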

As for utf8mb4: strings in ClickHouse are arbitrary byte sequences, so you can store whatever you want in them, 4-byte UTF-8 characters such as emoji included. You should still check whether the available functions match your use case.
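
To illustrate the byte-sequence point, the sketch below compares the byte length and the code-point length of a single emoji. It assumes the client sends the query text as UTF-8, in which case the emoji should occupy 4 bytes and count as 1 code point.

```sql
-- length() counts bytes; lengthUTF8() counts Unicode code points,
-- assuming the string holds valid UTF-8.
SELECT
    '😀'          AS s,
    length(s)     AS bytes,
    lengthUTF8(s) AS code_points;
```

The same distinction applies to other string functions: variants with the UTF8 suffix treat the value as UTF-8 text, while the plain variants operate on raw bytes.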


06-19 03:15