This article covers a problem encountered when configuring MySQL with Apache Airflow on Hadoop, along with a recommended answer.

Problem description

I am trying to install and configure Apache Airflow on a three-node dev Hadoop cluster with the following configuration and versions:

Operating System: Red Hat Enterprise Linux Server 7.7
python 3.7.3
anaconda 2
spark 2.45

a)sudo yum install gcc gcc-c++ -y
b)sudo yum install libffi-devel mariadb-devel cyrus-sasl-devel -y
c)pip install 'apache-airflow[all]'
d)airflow initdb  -- airflow.cfg file was created with SQLite

Then I ran the following commands to configure it with MySQL:

a) rpm -Uvh https://repo.mysql.com/mysql80-community-release-el7-3.noarch.rpm
b) sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/mysql-community.repo
c) yum --enablerepo=mysql80-community install mysql-community-server
d) systemctl start mysqld.service

In MySQL I did the following:

a) CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
b) create user 'airflow'@'localhost' identified by 'Airflow123';
c) grant all privileges on * . * to 'airflow'@'localhost';

Here are some details from my airflow.cfg file:

broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
result_backend = db+mysql://airflow:airflow@localhost:3306/airflow
sql_alchemy_conn = mysql://airflow:Airflow123@localhost:3306/airflow
executor = CeleryExecutor
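
One detail worth checking in these values: the MySQL user was created with the password Airflow123, while broker_url and result_backend embed the password airflow. This is unrelated to the ImportError, but it would probably cause an authentication failure once the import problem is out of the way. The following is a minimal shell sketch for comparing the passwords embedded in the three URLs. The helper name url_password is made up for illustration, and it assumes the simple scheme://user:password@host form used here.

```shell
#!/bin/sh
# Hypothetical helper: print the password part of a scheme://user:password@host URL.
url_password() {
    _rest="${1#*://}"          # drop the scheme, e.g. "mysql://"
    _userinfo="${_rest%%@*}"   # keep only "user:password"
    printf '%s\n' "${_userinfo#*:}"
}

# Values copied from the airflow.cfg excerpt in the question.
broker_url='sqla+mysql://airflow:airflow@localhost:3306/airflow'
result_backend='db+mysql://airflow:airflow@localhost:3306/airflow'
sql_alchemy_conn='mysql://airflow:Airflow123@localhost:3306/airflow'

# Print each URL next to the password it carries so a mismatch is easy to spot.
for url in "$broker_url" "$result_backend" "$sql_alchemy_conn"; do
    printf '%s -> %s\n' "$url" "$(url_password "$url")"
done
```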

I get the following error when running airflow initdb:

ImportError: /home/xyz/anaconda2/envs/python3.7.2/lib/python3.7/site-packages/_mysql.cpython-37m-x86_64-linux-gnu.so: symbol mysql_real_escape_string_quote,
version libmysqlclient_18 not defined in file libmysqlclient.so.18 with link time reference

I have set up the .bashrc file as:

export AIRFLOW_HOME=~/airflow

This is the directory I created:

[xyz@innolx5984 airflow]$ pwd
/home/xyz/airflow

When I search for the file "libmysqlclient", I find several instances:

[xyz@innolx5984 airflow]$ find /home/xyz/ -name "*libmysqlclient*"
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.a
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so.18
/home/xyz/anaconda2/pkgs/mysql-connector-c-6.1.11-h597af5e_1/lib/libmysqlclient.so.18.4.0
/home/xyz/anaconda2/lib/libmysqlclient.a
/home/xyz/anaconda2/lib/libmysqlclient.so
/home/xyz/anaconda2/lib/libmysqlclient.so.18
/home/xyz/anaconda2/lib/libmysqlclient.so.18.4.0
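
The find command only searched /home/xyz, so a system-wide copy of libmysqlclient.so.18 would not appear in this list. The error says that the copy loaded at run time does not export mysql_real_escape_string_quote, so one way to narrow things down is to check each candidate library for that symbol. The following is a minimal sketch, assuming nm from binutils is installed. The helper name has_symbol and the /usr/lib64/mysql path are assumptions, not taken from the post.

```shell
#!/bin/sh
# Hypothetical helper: succeed if shared library $1 exports a symbol matching $2.
has_symbol() {
    nm -D --defined-only "$1" 2>/dev/null | grep -q "$2"
}

sym=mysql_real_escape_string_quote

# First path is from the find output; the second is an assumed system location.
for lib in /home/xyz/anaconda2/lib/libmysqlclient.so.18 /usr/lib64/mysql/libmysqlclient.so.18; do
    if [ ! -e "$lib" ]; then
        echo "$lib: not present"
    elif has_symbol "$lib" "$sym"; then
        echo "$lib: exports $sym"
    else
        echo "$lib: does NOT export $sym"
    fi
done
```

Running ldd on the _mysql shared object named in the ImportError should then show which copy the dynamic loader actually picks.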

Adding a few more details in case they help:

[xyz@innolx5984 airflow]$ mysql_config
Usage: /home/xyz/anaconda2/bin/mysql_config [OPTIONS]
Options:
        --cflags         [-I/home/xyz/anaconda2/include ]
        --cxxflags       [-I/home/xyz/anaconda2/include ]
        --include        [-I/home/xyz/anaconda2/include]
        --libs           [-L/home/xyz/anaconda2/lib -lmysqlclient ]
        --libs_r         [-L/home/xyz/anaconda2/lib -lmysqlclient ]
        --plugindir      [/home/xyz/anaconda2/lib/plugin]
        --socket         [/tmp/mysql.sock]
        --port           [0]
        --version        [6.1.11]
        --variable=VAR   VAR is one of:
                pkgincludedir [/home/xyz/anaconda2/include]
                pkglibdir     [/home/xyz/anaconda2/lib]
                plugindir     [/home/xyz/anaconda2/lib/plugin]

I am looking for help and suggestions to resolve this issue. I am not sure whether I am heading in the right direction.

Recommended answer

Follow these steps to install Apache Airflow with MySQL using Anaconda3:

1) Install prerequisites

yum install gcc gcc-c++ -y
yum install libffi-devel mariadb-devel cyrus-sasl-devel -y
dnf install redhat-rpm-config

2) Install Anaconda3 (comes with Python 3.7.6)

yum install libXcomposite libXcursor libXi libXtst libXrandr alsa-lib mesa-libEGL libXdamage mesa-libGL libXScrnSaver
wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
chmod +x Anaconda3-2020.02-Linux-x86_64.sh
./Anaconda3-2020.02-Linux-x86_64.sh

Make sure you let the installer run conda init when prompted during installation. This ensures that the correct versions of python and pip are used in the subsequent steps.

3) Install Apache Airflow

pip install 'apache-airflow[mysql,celery]'

You can add other subpackages as required. I have included only the ones Airflow needs to use a MySQL database as its backend.

4) Initialize Airflow

export AIRFLOW_HOME=~/airflow
airflow initdb

From here, I have mirrored the steps you followed to configure MySQL Server.

5) Install MySQL Server

rpm -Uvh https://repo.mysql.com/mysql80-community-release-el7-3.noarch.rpm
sed -i 's/enabled=1/enabled=0/' /etc/yum.repos.d/mysql-community.repo
yum --enablerepo=mysql80-community install mysql-server
systemctl start mysqld.service

6) Log in to MySQL and configure the database for Airflow

mysql> CREATE DATABASE airflow CHARACTER SET utf8 COLLATE utf8_unicode_ci;
mysql> CREATE user 'airflow'@'localhost' identified by 'Airflow123';
mysql> GRANT ALL privileges on *.* to 'airflow'@'localhost';

7) Update the Airflow configuration file (~/airflow/airflow.cfg)

sql_alchemy_conn = mysql://airflow:Airflow123@localhost:3306/airflow
executor = CeleryExecutor
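
With CeleryExecutor, Airflow also reads broker_url and result_backend, which appear in the question's airflow.cfg excerpt. One way to keep those from drifting away from sql_alchemy_conn is to build all three from a single set of credentials. The following is a minimal sketch, assuming Airflow 1.10's AIRFLOW__SECTION__KEY environment-variable overrides and reusing the URL schemes from the question as-is. The helper name mysql_url is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build scheme://user:password@host:port/database.
# Note: a password containing characters such as "@" or ":" would need URL-encoding first.
mysql_url() {
    printf '%s://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}

# Credentials from step 6; change them in one place only.
db_user=airflow
db_pass=Airflow123
db_host=localhost
db_port=3306
db_name=airflow

# Environment variables take precedence over the matching airflow.cfg entries.
AIRFLOW__CORE__SQL_ALCHEMY_CONN="$(mysql_url mysql "$db_user" "$db_pass" "$db_host" "$db_port" "$db_name")"
AIRFLOW__CELERY__BROKER_URL="$(mysql_url sqla+mysql "$db_user" "$db_pass" "$db_host" "$db_port" "$db_name")"
AIRFLOW__CELERY__RESULT_BACKEND="$(mysql_url db+mysql "$db_user" "$db_pass" "$db_host" "$db_port" "$db_name")"
export AIRFLOW__CORE__SQL_ALCHEMY_CONN AIRFLOW__CELERY__BROKER_URL AIRFLOW__CELERY__RESULT_BACKEND
```

The same three values can instead be pasted into airflow.cfg; the point of the sketch is only that they share one password.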

8) Initialize Airflow

airflow initdb

