Can someone suggest an alternative to HdfsSensor for Airflow on Python 3?

Problem description

I am trying to use HdfsSensor under Python 3 to watch for changes in HDFS and trigger my ETL pipeline in Airflow. It fails with the following error, because snakebite does not support Python 3:

ImportError: This HDFSHook implementation requires snakebite, but snakebite is not compatible with Python 3

Recommended answer

Thanks to the suggestion by @AyushGoyal, I solved the same problem using WebHdfsSensor. It is used much like HdfsSensor, so for the most part you can just swap the class name. Just make sure that:

  • you pass the connection ID via the webhdfs_conn_id parameter (in HdfsSensor the parameter was named hdfs_conn_id)
  • the connection points at the NameNode's WebHDFS (HTTP) port, 50070 by default on Hadoop 2.x, and not at the RPC port 8020
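The port matters because WebHDFS is a REST API that the NameNode serves over HTTP, whereas 8020 is the RPC port used by native HDFS clients. As a rough sketch of how to confirm the HTTP port is the right one, the snippet below builds a GETFILESTATUS request using only the standard library. The function names, host, path, and user are hypothetical, and it assumes an unsecured cluster (no Kerberos or TLS); it is not how the Airflow sensor itself is implemented.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_status_url(host, port, path, user=None):
    """Build the WebHDFS GETFILESTATUS URL for an absolute HDFS path."""
    params = {"op": "GETFILESTATUS"}
    if user:
        # Simple (non-Kerberos) authentication passes the user as a query parameter.
        params["user.name"] = user
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(path), urlencode(params)
    )


def path_exists(host, port, path, user=None, timeout=10):
    """Return True if the NameNode reports the path, False on HTTP 404."""
    url = webhdfs_status_url(host, port, path, user)
    try:
        with urlopen(url, timeout=timeout) as resp:
            return "FileStatus" in json.load(resp)
    except HTTPError as err:
        if err.code == 404:  # WebHDFS answers 404 when the path does not exist
            return False
        raise


if __name__ == "__main__":
    # Hypothetical host and path; 50070 is the Hadoop 2.x default (Hadoop 3.x uses 9870).
    print(webhdfs_status_url("namenode.example.com", 50070, "/data/in/_SUCCESS"))
```

If `path_exists` cannot connect at all, the port is probably the RPC port or WebHDFS is disabled on the cluster.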

The rest is the same! For example:

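from airflow.sensors.web_hdfs_sensor import WebHdfsSensor

# Assumes `dag` is a DAG object defined earlier in the same file.
file_sensor = WebHdfsSensor(
    task_id='check_if_data_is_ready',
    filepath="some_file_path",        # HDFS path to wait for
    webhdfs_conn_id='hdfs_conn_id',   # ID of the Airflow connection to the NameNode
    poke_interval=10,                 # seconds between checks
    timeout=5,                        # seconds before the sensor gives up
    dag=dag,
    # Not a WebHdfsSensor parameter; passed through to BaseOperator.
    env={
        'JAVA_HOME': '/usr/java/latest'
    }
)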
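The sensor only works if a connection with the given ID exists. Besides the Airflow UI, one way to define it is an environment variable named AIRFLOW_CONN_<CONN_ID> holding a connection URI. The helper below is a minimal sketch of that naming and URI shape. The function name, the host, and the `webhdfs` scheme are assumptions for illustration, and the variable has to be set in the environment of the scheduler and workers, not just in the DAG file.

```python
def webhdfs_conn_env(conn_id, host, port, login=None):
    """Return the (variable name, URI) pair describing an Airflow connection."""
    userinfo = "{}@".format(login) if login else ""
    # Assumption: the scheme is used only as the connection type label.
    uri = "webhdfs://{}{}:{}".format(userinfo, host, port)
    return "AIRFLOW_CONN_{}".format(conn_id.upper()), uri


if __name__ == "__main__":
    # Hypothetical host and login; the ID matches webhdfs_conn_id in the sensor.
    name, uri = webhdfs_conn_env(
        "hdfs_conn_id", "namenode.example.com", 50070, login="airflow"
    )
    print("export {}='{}'".format(name, uri))
```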


10-27 23:20