This article describes how to deal with Kafka HDFS Connect (during a Camus migration) not starting from the offset that was set. It should be a useful reference for anyone facing the same problem.

Problem description

I am currently replacing Camus with the Confluent HDFS Sink Connector (v4.0.0). We are dealing with sensitive data, so we need to maintain offset consistency during the cutover to the connector.

Transition plan:

  1. We created the HDFS sink connector and subscribed to a topic, writing to a temporary HDFS file. This creates a consumer group with the name connect-
  2. Stopped the connector using a DELETE request.
  3. Using the /usr/bin/kafka-consumer-groups script, I am able to set the current offset of the connector consumer group for the Kafka topic partition to the desired value (i.e. the last offset Camus wrote + 1); a sketch of steps 1-3 follows this list.
  4. When I restart the HDFS sink connector, it continues reading from the last committed connector offset and ignores the value I set. I expected the HDFS file name to look like: hdfs_kafka_topic_name+kafkapartition+Camus_offset+Camus_offset_plus_flush_size.format
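For concreteness, here is a minimal sketch of steps 1-3 using the Connect REST API and the stock Kafka tooling. The connector name (hdfs-sink), hosts, topic (my_topic), partition, and target offset are all hypothetical placeholders; the consumer group name follows Kafka Connect's connect-<connector name> convention for sink connectors.

    # Step 1: create the sink connector (all config values are placeholders)
    curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
      "name": "hdfs-sink",
      "config": {
        "connector.class": "io.confluent.connect.hdfs.HdfsSinkConnector",
        "topics": "my_topic",
        "hdfs.url": "hdfs://namenode:8020",
        "flush.size": "1000"
      }
    }'

    # Step 2: stop the connector
    curl -X DELETE http://localhost:8083/connectors/hdfs-sink

    # Step 3: with the group now inactive, rewind it to the first offset
    # Camus did not write (the offset 1000 is a placeholder)
    /usr/bin/kafka-consumer-groups --bootstrap-server localhost:9092 \
      --group connect-hdfs-sink \
      --topic my_topic:0 \
      --reset-offsets --to-offset 1000 --execute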

Is my expectation of the Confluent connector behavior correct?

Recommended answer

When you restart this connector, it will use the offset embedded in the file name of the last file written to HDFS. It will not use the consumer group offset. It does this because it uses a write-ahead log to achieve exactly-once delivery to HDFS.
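To see where the connector will actually resume, list the files it has already committed; the resume point is parsed from their names, not from the consumer group. A minimal sketch, assuming the default topics.dir of topics, the default partitioner's partition=<n> directories, and the <topic>+<kafkaPartition>+<startOffset>+<endOffset>.<format> naming scheme; all paths and offsets are illustrative.

    # List committed files for partition 0 of the topic
    hdfs dfs -ls /topics/my_topic/partition=0
    # e.g. /topics/my_topic/partition=0/my_topic+0+0000001000+0000001999.avro

    # On restart the connector takes the highest end offset found here (1999)
    # and resumes consuming at 2000, ignoring the consumer group offset that
    # was set in step 3 of the transition plan.

This is why the reset in step 3 appears to have no effect: once at least one committed file (or its write-ahead log entry) exists for a topic partition, the offsets embedded in it win over anything set with kafka-consumer-groups.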

That concludes this piece on the Camus migration and Kafka HDFS Connect not starting from the set offset. We hope the recommended answer above is helpful.
