本文介绍了ClickHouse JSON解析异常:无法解析输入:预期的','之前的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从Kafka将JSON数据添加到ClickHouse.这是简化的JSON:

I'm trying to add JSON data to ClickHouse from Kafka. Here's simplified JSON:

{
  ...
   "sendAddress":{
      "sendCommChannelTypeId":4,
      "sendCommChannelTypeCode":"SMS",
      "sendAddress":"789345345945"},
   ...
}

以下是在ClickHouse中创建表,使用Kafka Engine创建另一个表以及创建MATERIALIZED VIEW来连接这两个表以及将CH与Kafka连接的步骤.

Here's the steps for creating table in ClickHouse, create another table using Kafka Engine and creating MATERIALIZED VIEW to connect these two tables, and also connect CH with Kafka.

创建第一个表

CREATE TABLE tab
(
    ...

    sendAddress Tuple (sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
     ...

)Engine = MergeTree()
PARTITION BY applicationId
ORDER BY (applicationId);

使用Kafka Engine设置创建第二张表:

Creating a second table with Kafka Engine SETTINGS:

CREATE TABLE tab_kfk
(
    ...
    sendAddress Tuple (sendCommChannelTypeId Int32, sendCommChannelTypeCode String, sendAddress String),
    ...
)ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
       kafka_topic_list = 'topk2',
       kafka_group_name = 'group1',
       kafka_format = 'JSONEachRow',
       kafka_row_delimiter = '\n';

创建材料视图

CREATE MATERIALIZED VIEW tab_mv TO tab AS
SELECT ... sendAddress, ...
FROM tab_kfk;

然后,我尝试从第一个表-选项卡中选择所有或特定项目,但一无​​所获.日志遵循

Then I try to SELECT all or specific items from the first table - tab and get nothing. Logs is following

好.只需在sendAddress中的花括号前添加"[]",如下所示:

OK. Just add '[]' before curly braces in the sendAddress like this:

"authkey":"some_value",
   "sendAddress":[{
      "sendCommChannelTypeId":4,
      "sendCommChannelTypeCode":"SMS",
      "sendAddress":"789345345945"
   }]

我仍然犯了一个错误,但略有不同:我该怎么办才能解决此问题,谢谢!

And I still get a mistake, but slightly different:What should I do to fix this problem, thanks!

推荐答案

有3种解决方法:

  1. 在插入Kafka主题之前,请勿使用嵌套对象和拼合消息.例如这样的方式:
{
    ..
    "authkey":"key",
    "sendAddress_CommChannelTypeId":4,
    "sendAddress_CommChannelTypeCode":"SMS",
    "sendAddress":"789345345945",
    ..
}
  1. 使用嵌套的数据结构更改JSON消息模式和表模式所需的内容:
  1. Use Nested data structure that required to change the JSON-message schema and table schema:
{
    ..
    "authkey":"key",
    "sendAddress.sendCommChannelTypeId":[4],
    "sendAddress.sendCommChannelTypeCode":["SMS"],
    "sendAddress.sendAddress":["789345345945"],
    ..
}
CREATE TABLE tab_kfk
(
    applicationId Int32,
    ..
    sendAddress Nested(
        sendCommChannelTypeId Int32,
        sendCommChannelTypeCode String,
        sendAddress String),
    ..
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
       kafka_topic_list = 'topk2',
       kafka_group_name = 'group1',
       kafka_format = 'JSONEachRow',
       kafka_row_delimiter = '\n',
       input_format_import_nested_json = 1 /* <--- */

请考虑设置 input_format_import_nested_json ./p>

  1. 将输入的JSON消息解释为字符串&手动解析(请参见 github问题#16969 ):
CREATE TABLE tab_kfk
(
    message String
)
ENGINE = Kafka
SETTINGS
    ..
    kafka_format = 'JSONAsString', /* <--- */
    ..

CREATE MATERIALIZED VIEW tab_mv TO tab
AS
SELECT
    ..
    JSONExtractString(message, 'authkey') AS authkey,
    JSONExtract(message, 'sendAddress', 'Tuple(Int32,String,String)') AS sendAddress,
    ..
FROM tab_kfk;

这篇关于ClickHouse JSON解析异常:无法解析输入:预期的','之前的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

06-19 04:18