This article looks at how to check the encoding of CSV files in Data Factory; hopefully it is a useful reference for anyone facing the same problem.

Problem Description

I am implementing a pipeline to move CSV files from one folder to another in a data lake, with the condition that the CSV file is encoded in UTF-8.

Is it possible to check the encoding of a CSV file directly in Data Factory / Data Flow?

Currently, the encoding is set in the connection settings of the dataset. What happens in this case if the actual encoding of the CSV file is different?

What happens at the database level if the CSV file is staged with the wrong encoding?

Thanks.

Recommended Answer

For now, we can't check the file encoding in Data Factory/Data Flow directly. We must pre-set the encoding type used to read/write the text files:

Reference: https://docs.microsoft.com/zh-CN/azure/data-factory/format-delimited-text#dataset-properties

The Data Factory default file encoding is UTF-8.
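
To make the mismatch question above concrete, here is a small plain-Python sketch (nothing Data Factory specific; the file name is only illustrative) of the two failure modes when the declared encoding differs from the file's actual encoding: a hard decode error, or silently corrupted text that would then be staged into the database.

```python
data = "Müller;Straße;42\n"

# Write the row in Latin-1 (ISO-8859-1), a common non-UTF-8 source encoding.
with open("sample.csv", "w", encoding="latin-1") as f:
    f.write(data)

# Reading it back as UTF-8 (the Data Factory default) fails outright, because
# the Latin-1 bytes for "ü" and "ß" are not valid UTF-8 sequences.
try:
    with open("sample.csv", encoding="utf-8") as f:
        print(f.read())
except UnicodeDecodeError as e:
    print("hard failure:", e)

# The reverse mismatch is silent corruption: UTF-8 bytes decoded as Latin-1
# "succeed" but yield mojibake ("Müller" becomes "MÃ¼ller").
with open("sample.csv", "w", encoding="utf-8") as f:
    f.write(data)
with open("sample.csv", encoding="latin-1") as f:
    print("silent corruption:", f.read())
```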

As @wBob said, you need to implement the encoding check at the code level, for example in an Azure Function or a notebook, and call those activities from the pipeline.
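
As a minimal sketch of such a code-level check, assuming the third-party chardet package (pip install chardet); the function name and path are hypothetical, and in a real Azure Function you would read the blob's bytes via the storage SDK instead of a local file:

```python
import chardet  # third-party encoding detector: pip install chardet


def is_utf8_csv(path: str, sample_size: int = 64 * 1024) -> bool:
    """Heuristically decide whether a CSV file is UTF-8 encoded."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)  # a prefix is usually enough for a guess

    guess = chardet.detect(sample)  # e.g. {'encoding': 'utf-8', 'confidence': 0.99, ...}
    encoding = (guess["encoding"] or "").lower()

    # Pure ASCII is a strict subset of UTF-8, so count it as a pass too.
    return encoding in ("utf-8", "utf-8-sig", "ascii")


if __name__ == "__main__":
    print(is_utf8_csv("input/sample.csv"))  # hypothetical local path
```

The boolean result can then drive an If Condition activity in the pipeline, so the file is only moved when the check passes.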

HTH.

That concludes this article on checking the encoding of CSV files in Data Factory; hopefully the answer above is a helpful reference.