How to capture logs from workers in a Dask-Yarn job?

Problem description

I tried using the following in ~/.config/dask/distributed.yaml and ~/.config/dask/yarn.yaml:

logging-file-config: "/path/to/config.ini"

logging:
  version: 1
  disable_existing_loggers: false

  root:
    level: INFO
    handlers: [consoleHandler]

  handlers:
    consoleHandler:
      class: logging.StreamHandler
      level: INFO
      formatter: sample_formatter
      stream: ext://sys.stderr

  formatters:
    sample_formatter:
      format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'

and then in my function that gets evaluated at the worker:

import logging
from distributed.worker import logger
import dask
from dask.distributed import Client
from dask_yarn import YarnCluster

log = logging.getLogger(__name__)

@dask.delayed
def worker_func(args):
    logger.info("This will show up in the worker logs")
    log.info("This does not show up in worker logs")
    return

if __name__ == "__main__":
    dag_1 = {'worker_func': (worker_func, arg_1)}
    tasks = dask.get(dag_1, 'load-1')

    log.info("This also shows up in logs, and custom formatted")
    cluster = YarnCluster()
    client = Client(cluster)
    dask.compute(tasks)

When I try to view the yarn logs using:

yarn logs -applicationId {application_id}

I do not see the output of log.info from inside worker_func, but I do see the logs from distributed.worker.logger, as well as the logs emitted outside that function on the console. I also tried using [...].

Recommended answer

Logs for running dask-yarn applications can be retrieved using client.get_worker_logs(). Note that these logs will only contain logs written to the distributed.worker logger. You cannot write to your own logger and have them appear in the output of client.get_worker_logs(). To write to this logger, get it via

import logging
logger = logging.getLogger("distributed.worker")
logger.info("Writing with the worker logger")
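As a minimal sketch of how the function from the question might use this logger, the example below looks the logger up by name inside the task. The `ListHandler` class is a hypothetical helper added only so the snippet can be checked locally without a YARN cluster; on a real worker, the handlers attached to this logger are whatever distributed has configured.

```python
import logging


# Hypothetical helper: collects records in a list so the example can be
# checked locally, without a running cluster.
class ListHandler(logging.Handler):
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(
            f"{record.name}:{record.levelname}:{record.getMessage()}"
        )


def worker_func(args):
    # Look the logger up by name inside the task, so the lookup happens
    # in the process that actually runs the function.
    logger = logging.getLogger("distributed.worker")
    logger.info("Processing %s", args)
    return args


# Local stand-in for the worker's logging setup (an assumption for this demo).
worker_logger = logging.getLogger("distributed.worker")
worker_logger.setLevel(logging.INFO)
handler = ListHandler()
worker_logger.addHandler(handler)

result = worker_func("load-1")
print(handler.messages)
```

Because the logger is fetched by name, the task does not need to import anything from distributed itself.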

Any logger appropriately configured to log to stdout or stderr will appear in the logs accessed via the yarn CLI, but only the distributed.worker logger output will also be available to get_worker_logs().
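One possible way to get a custom logger "appropriately configured" is to attach a stderr handler from inside the task, so the setup runs in the worker process rather than only on the client. The sketch below is written under that assumption. The helper name `get_task_logger`, the logger name `my_app.tasks`, and the `_task_handler` marker attribute are all hypothetical; the format string is the one from the question's config.

```python
import io
import logging
import sys

FORMAT = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'


def get_task_logger(name="my_app.tasks", stream=None):
    """Return a logger that writes to stderr (or to `stream`, if given).

    Safe to call on every task: the handler is attached only once.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Avoid duplicate lines if the root logger also has a handler.
    logger.propagate = False
    if not any(getattr(h, "_task_handler", False) for h in logger.handlers):
        target = stream if stream is not None else sys.stderr
        handler = logging.StreamHandler(target)
        handler.setFormatter(logging.Formatter(FORMAT))
        handler._task_handler = True  # marker used by the check above
        logger.addHandler(handler)
    return logger


def worker_func(args):
    # Configure (or reuse) the logger inside the task itself.
    log = get_task_logger()
    log.info("Handling %s", args)
    return args


# Local check: write to an in-memory buffer instead of stderr.
buffer = io.StringIO()
demo_log = get_task_logger("demo.tasks", stream=buffer)
same_log = get_task_logger("demo.tasks", stream=buffer)
demo_log.info("hello from the task")
```

Output written this way goes to the worker's stderr only; per the answer above, it would not be returned by get_worker_logs().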

Side note

The names of the config files don't matter: dask loads all yaml files in all config directories and merges their contents. For more information, please read the configuration docs.
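To illustrate what merging nested configuration means, the sketch below combines two dictionaries standing in for the parsed contents of two yaml files. The `merge_nested` helper and the sample values are hypothetical and written only for illustration. This is not dask's own implementation, and it does not show which file takes precedence in dask.

```python
def merge_nested(*dicts):
    """Recursively merge dictionaries; later values win on conflicts."""
    result = {}
    for d in dicts:
        for key, value in d.items():
            if isinstance(value, dict) and isinstance(result.get(key), dict):
                # Both sides are mappings: merge them key by key.
                result[key] = merge_nested(result[key], value)
            else:
                result[key] = value
    return result


# Stand-ins for the parsed contents of two config files (made-up split).
from_first_file = {"logging": {"version": 1, "root": {"level": "INFO"}}}
from_second_file = {
    "logging": {
        "disable_existing_loggers": False,
        "root": {"handlers": ["consoleHandler"]},
    }
}

merged = merge_nested(from_first_file, from_second_file)
print(merged)
```

Under this model, splitting the logging section across distributed.yaml and yarn.yaml gives the same result as keeping it in one file.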
