Installing Airflow, step by step


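Airflow is a workflow (data pipeline) management tool open-sourced by Airbnb. For usage, see the official documentation at http://pythonhosted.org/airflow/

This post shares the installation process and the pitfalls I ran into along the way.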

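Installing Airflow

(To avoid affecting other programs, the existing python2.6.6 is not replaced. The goal is for 2.6 and 2.7 to coexist, with pip, virtualenv, and similar tools installed only under python27.)

To install a standalone python2.7, pass a different directory as --prefix to configure. make install then installs into that prefix instead of the default /usr/local (with binaries in /usr/local/bin).

1. Download the python2.7.11 source from https://www.python.org/downloads/source/

2. Build and install from source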

su - root
cd /usr/local/
tar -zxvf Python-2.7.11.tgz
mv Python-2.7.11 python27
cd python27
./configure --prefix=/usr/local/python27  # change to your own path
make
make install
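
To confirm that the freshly built interpreter is the one being run, it can report its own version and install prefix. This is a minimal sketch, not part of the original procedure; the helper name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter():
    """Return the (major, minor) version and the install prefix of the running Python."""
    return tuple(sys.version_info[:2]), sys.prefix


if __name__ == "__main__":
    version, prefix = describe_interpreter()
    # With the build above, the prefix is expected to be the --prefix directory.
    print("Python %d.%d installed under %s" % (version[0], version[1], prefix))
```

Saved as a file and run with the new interpreter (for example /usr/local/python27/bin/python), it should report 2.7 and the prefix passed to configure.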

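3. Install setuptools (it must be installed under python27; the server has no internet access, so the source archive is downloaded separately)

tar zxvf setuptools-23.1.0.tar.gz
cd setuptools-23.1.0/
/usr/local/python27/bin/python setup.py install

4. Install pip (it must be installed under python27; the server has no internet access, so the source archive is downloaded separately; the PyPI index can be pointed at the Douban mirror)

tar zxvf pip-8.1.2.tar.gz
cd pip-8.1.2/
/usr/local/python27/bin/python setup.py install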

5. Install virtualenv; for other installation methods see the official site https://virtualenv.pypa.io/en/latest/index.html

tar zxvf virtualenv-15.0.2.tar.gz
cd virtualenv-15.0.2/
/usr/local/python27/bin/python setup.py install

virtualenv also needs to be installed once under python2.6; otherwise, creating a python2.7 virtualenv from python2.6 fails.

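6. Running the virtualenv command requires network access, so a proxy is still needed; CCProxy is used here

Download: http://www.ccproxy.com/

Set these environment variables on the Linux host: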

export https_proxy=xxx.xxx.xxx.xxx:808
export http_proxy=xxx.xxx.xxx.xxx:808
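
Before running virtualenv or pip, it can help to check that the proxy variables are actually visible to the process. The sketch below only inspects the environment; the helper name configured_proxies and the address in the usage comment are made up for illustration.

```python
import os


def configured_proxies(environ=os.environ):
    """Return the proxy variables that are set and non-empty, keyed by name."""
    names = ("http_proxy", "https_proxy")
    return {name: environ[name] for name in names if environ.get(name)}


if __name__ == "__main__":
    proxies = configured_proxies()
    if proxies:
        for name, value in sorted(proxies.items()):
            print("%s=%s" % (name, value))
    else:
        print("no proxy variables set; export http_proxy and https_proxy first")
```

For example, configured_proxies({"http_proxy": "10.0.0.1:808"}) returns only the http_proxy entry, which shows at a glance that https_proxy was forgotten.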

7. Create an isolated environment with virtualenv

virtualenv --python=/usr/local/python27/bin/python airflowenv

After running source airflowenv/bin/activate, the shell uses python2.7.

8. Install MySQL (not covered here)

9. As root, install mysql-devel: yum install mysql-devel

10. Install MySQL-python: download MySQL-python-1.2.5.zip from the Python package index and unzip it

source airflowenv/bin/activate
cd MySQL-python-1.2.5
python setup.py install

11. Install gevent

source airflowenv/bin/activate
pip install gevent

12. Install Airflow

source airflowenv/bin/activate
export AIRFLOW_HOME=~/airflow  # change to your own path
pip install airflow
# initialize the database
airflow initdb

13. Edit $AIRFLOW_HOME/airflow.cfg with vi

This includes adding the MySQL connection and setting the executor; adjust other parameters as needed.

executor = LocalExecutor
sql_alchemy_conn = mysql://username:password@ip:port/dbname
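
After editing, the two settings can be read back to make sure the file still parses and the values are what was intended. This is a rough sketch that assumes both keys live in the [core] section of airflow.cfg; the function name read_core_settings is hypothetical.

```python
try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2


def read_core_settings(cfg_path):
    """Read executor and sql_alchemy_conn from the [core] section of an airflow.cfg."""
    # RawConfigParser is used so that '%' characters in values are not interpolated.
    parser = configparser.RawConfigParser()
    if not parser.read(cfg_path):
        raise IOError("cannot read %s" % cfg_path)
    return {
        "executor": parser.get("core", "executor"),
        "sql_alchemy_conn": parser.get("core", "sql_alchemy_conn"),
    }

# Example call (path is an assumption based on AIRFLOW_HOME=~/airflow):
# read_core_settings(os.path.expanduser("~/airflow/airflow.cfg"))
```

If the connection string still starts with sqlite, the edit did not take effect and airflow initdb will keep using the default SQLite database.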

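14. Run airflow initdb again; this time the tables are created in MySQL

15. Install supervisor and use it to start Airflow; if Airflow dies, supervisor restarts it automatically

source airflowenv/bin/activate
pip install supervisor

Edit supervisord.conf to specify the program to start and the log output path: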
[program:airflow_scheduler]
command=/xxx/airflowenv/bin/airflow scheduler
stdout_logfile=/tmp/airflow_scheduler.log

Start it with the following command:

supervisord -c /xxx/xxx/airflow/supervisord.conf
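
When more than one Airflow process is supervised, writing each program section by hand gets repetitive. The sketch below renders a section from a name, a command, and a log path. The function name program_section is hypothetical, and the autostart, autorestart, and redirect_stderr lines are additions that the original configuration does not contain; they make the restart behaviour explicit.

```python
def program_section(name, command, log_path):
    """Render one [program:x] section for supervisord.conf as a string."""
    lines = [
        "[program:%s]" % name,
        "command=%s" % command,
        "autostart=true",        # start with supervisord (added, not in the original)
        "autorestart=true",      # restart whenever the process exits (added)
        "stdout_logfile=%s" % log_path,
        "redirect_stderr=true",  # send stderr to the same log file (added)
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Paths are the placeholders used in this post; replace them with real ones.
    print(program_section(
        "airflow_scheduler",
        "/xxx/airflowenv/bin/airflow scheduler",
        "/tmp/airflow_scheduler.log",
    ))
```

The printed text can be appended to supervisord.conf; a second call with a different name and command would cover the webserver in the same way.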


Problems encountered during installation

1. airflow initdb fails

(airflowenv)root@127.0.0.1:/xxx/xxx/airflowenv/bin$ airflow initdb
Traceback (most recent call last):
  File "/xxx/xxx/airflowenv/bin/airflow", line 4, in <module>
    from airflow import configuration
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/__init__.py", line 31, in <module>
    from airflow.models import DAG
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/models.py", line 56, in <module>
    from airflow import settings, utils
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/airflow/settings.py", line 76, in <module>
    engine = create_engine(SQL_ALCHEMY_CONN, **engine_args)
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/engine/__init__.py", line 386, in create_engine
    return strategy.create(*args, **kwargs)
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/engine/strategies.py", line 75, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 92, in dbapi
    return __import__('MySQLdb')
ImportError: No module named MySQLdb

The MySQL-python module is missing. Download MySQL-python-1.2.5.zip from the Python package index, unzip it, and install it inside the virtualenv:

cd MySQL-python-1.2.5
python setup.py install

2. Installing MySQL-python fails with compile errors, so airflow initdb still cannot run

_mysql.c:36:23: error: my_config.h: No such file or directory
_mysql.c:38:19: error: mysql.h: No such file or directory
_mysql.c:39:26: error: mysqld_error.h: No such file or directory
_mysql.c:40:20: error: errmsg.h: No such file or directory

The Linux host is missing the mysql-devel package, which provides these headers. Install it with yum install mysql-devel, or download the mysql-devel rpm manually and install it yourself.

3. Starting the webserver with airflow webserver -p 8080 fails

Error: class uri 'gevent' invalid or not found:

[Traceback (most recent call last):
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/gunicorn/util.py", line 140, in load_class
    mod = import_module('.'.join(components))
  File "/xxx/xxx/software/python27/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/xxx/xxx/airflowenv/lib/python2.7/site-packages/gunicorn/workers/ggevent.py", line 22, in <module>
    raise RuntimeError("You need gevent installed to use this worker.")
RuntimeError: You need gevent installed to use this worker.
]

Install gevent with pip: pip install gevent
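
Problems 1 and 3 are both missing-module errors that only surface when Airflow starts. A small pre-flight check run inside the virtualenv can catch them earlier. This is a sketch under the assumption that MySQLdb and gevent are the only two modules of interest; the helper name missing_modules is hypothetical.

```python
import importlib


def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # MySQLdb is provided by MySQL-python; gevent by the gevent package.
    absent = missing_modules(["MySQLdb", "gevent"])
    if absent:
        print("missing modules: %s" % ", ".join(absent))
    else:
        print("all required modules can be imported")
```

Run it with the virtualenv activated so that it checks the same site-packages directory Airflow uses.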
