schedule_interval and other gotchas with SubDagOperator

Problem description

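The Airflow documentation clearly states that a SubDAG must have a schedule: if its schedule_interval is set to None or @once, the SubDAG is said to succeed without having done anything.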

Although we must stick to the documentation, I've found that subdags work without a hiccup even with schedule_interval set to None or @once. Here's my working example.

My current understanding of SubDagOperators (or subdags), having first heard about Airflow only 2 weeks ago, is:


  • Airflow treats a subdag as just another task
  • They can cause deadlock but easy workarounds exist

My questions are:


  • Why does my example work when it shouldn't?
  • Why shouldn't my example work (as per the docs) in the first place?
  • Any subtle differences between behaviour of SubDagOperator and other operators?
  • When solutions of known problems exist, why is there so much uproar against SubDagOperators?

I'm using:


  • Airflow 1.9.0-4
  • Python 3.6-slim
  • CeleryExecutor with redis:3.2.7

Recommended answer

If you are just running your DAG once, then you probably won't have any issues with SubDags (as in your example) - especially if you have a bunch of worker slots available. Try letting a few DagRuns of your example accumulate and see if everything runs smoothly if you try to delete and re-run some.
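
A commonly cited reason worker slots matter is that a running SubDagOperator holds a slot while the tasks inside its subdag wait for slots of their own. The sketch below is a toy model of that assumption in plain Python. It makes no Airflow calls and is not how Airflow's scheduler is implemented; `simulate` and its parameters are hypothetical names chosen for illustration.

```python
from collections import deque


def simulate(worker_slots, dag_runs, children_per_subdag, max_ticks=1000):
    """Toy model (an assumption, not Airflow code): return True if every
    DagRun finishes, False if the worker pool deadlocks."""
    waiting_parents = deque(range(dag_runs))  # SubDagOperator tasks not started yet
    remaining = {}                            # run id -> child tasks still to finish
    queued_children = deque()                 # child tasks waiting for a slot
    free = worker_slots
    finished = 0

    for _ in range(max_ticks):
        if finished == dag_runs:
            return True
        progressed = False

        # Parent tasks are queued first, so they take free slots before children.
        while free and waiting_parents:
            run = waiting_parents.popleft()
            remaining[run] = children_per_subdag
            queued_children.extend([run] * children_per_subdag)
            free -= 1  # the parent keeps this slot until all its children finish
            progressed = True

        # Whatever slots are left run child tasks; each child takes one tick.
        started = []
        while free and queued_children:
            started.append(queued_children.popleft())
            free -= 1
            progressed = True

        for run in started:
            free += 1  # the child finishes and releases its slot
            remaining[run] -= 1
            if remaining[run] == 0:
                del remaining[run]
                free += 1  # the parent can now finish and release its slot
                finished += 1

        if not progressed:
            return False  # every slot is held by a parent waiting on its children

    return finished == dag_runs


print(simulate(worker_slots=4, dag_runs=1, children_per_subdag=3))
print(simulate(worker_slots=2, dag_runs=2, children_per_subdag=3))
```

In this model a single run with spare slots completes, while two accumulated runs on two slots stall because both slots are held by parent tasks. That is consistent with the advice above to let several DagRuns pile up before concluding that a subdag setup is safe.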

The community has advised moving away from SubDags because unexpected behavior starts happening when you need to re-run old DagRuns or run bigger backfills.

It is not so much that the DAG won't work, but rather that unexpected things can happen that may affect your workflows, and that risk isn't worth taking when all you get in return is a nicer-looking DAG.

Even though known solutions exist, implementing them may not be worth the effort.

09-26 21:48