idealista/airflow-role

Apache Airflow Ansible role

This Ansible role installs an Apache Airflow server in a Debian/Ubuntu environment.

Getting Started

These instructions will get you a copy of the role for your Ansible playbook. Once launched, it will install Apache Airflow on a Debian or Ubuntu system.

Prerequisites ☑️

Ansible 2.9.9 installed. The inventory target should be a Debian (preferably Debian 10 Buster) or Ubuntu environment.

ℹ️ This role should work with older versions of Debian, but because of Airflow's minimum requirements you should first check that 🐍 Python 3.6 (or higher) is installed (👉 See: Airflow prerequisites).

ℹ️ By default this role uses the predefined Python installation that comes with the distro.

For testing purposes, Molecule with Docker as the driver.
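
The Python requirement above can be checked before the role runs. The following is only a sketch, not part of the role: it assumes the interpreter Ansible discovers on the target is the same distro Python the role uses by default, and it reuses the `someserver` host and `airflow` role name from the examples in this README.

```yaml
---
# Sketch: fail early when the target's Python is older than 3.6.
# Assumption: the Python that Ansible finds on the host is the one
# the role will install Airflow with (the role's default behaviour).
- hosts: someserver
  pre_tasks:
    - name: Check that Python 3.6 or higher is available
      assert:
        that:
          - ansible_facts['python_version'] is version('3.6', '>=')
        fail_msg: "Airflow needs Python 3.6+, found {{ ansible_facts['python_version'] }}"
  roles:
    - { role: airflow }
```

If you point the role at a different Python installation, this check says nothing about that interpreter and would need to be adapted.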

Installing 📥

Create or add to your roles dependency file (e.g. requirements.yml) from GitHub:

- src: http://github.com/idealista/airflow-role.git
  scm: git
  version: 2.0.0
  name: airflow

or using Ansible Galaxy as origin if you prefer:

- src: idealista.airflow-role
  version: 2.0.0
  name: airflow

Install the role with the ansible-galaxy command:

ansible-galaxy install -p roles -r requirements.yml -f

Use in a playbook:

---
- hosts: someserver
  roles:
    - { role: airflow }

Usage 🏃

See the defaults properties files for the possible configuration properties.

👉 Don't forget:

  • 🦸 To set your Admin user.
  • 🔑 To set Fernet key.
  • 🔑 To set webserver secret key.
  • 📝 To set your AIRFLOW_HOME and AIRFLOW_CONFIG at your own discretion.
  • 📝 To set your installation and config skeleton paths at your own discretion.
    • 👉 See airflow_skeleton_paths in main.yml
  • 🐍 Python and pip version.
  • 📦 Extra packages if you need additional operators, hooks, sensors...
  • 📦 Required Python packages pinned to a specific version, such as SQLAlchemy (to avoid known Airflow bugs❗️) as shown below, or because they are otherwise necessary.
  • ⚠️ With Airflow v1.10.0, PyPI package pyasn1 v0.4.4 is needed. See the examples below.

📦 Required Python packages

airflow_required_python_packages should be a list following this format:

airflow_required_python_packages:
  - { name: SQLAlchemy, version: 1.3.23 }
  - { name: psycopg2 }
  - { name: pyasn1, version: 0.4.4 }

📦 Extra packages

airflow_extra_packages should be a list following this format:

airflow_extra_packages:
  - apache.atlas
  - celery
  - ssh

👉 For more info about these extra packages, see: Airflow extra packages
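
As an illustration, both lists can be passed to the role directly from a playbook. This is a minimal sketch: the package choices are only examples taken from this README, and the variables could equally live in group_vars or host_vars.

```yaml
---
# Sketch: set both package lists as role variables.
# The packages listed here are illustrative; adjust them to your needs.
- hosts: someserver
  roles:
    - role: airflow
      vars:
        airflow_required_python_packages:
          - { name: SQLAlchemy, version: 1.3.23 }
          - { name: psycopg2 }
        airflow_extra_packages:
          - celery
          - ssh
```

Entries without a `version` key, such as `psycopg2` here, follow the unpinned form shown in the format above.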

Testing 🧪

pipenv install -r test-requirements.txt --python 3.7

# Optional
pipenv shell  # if in shell just use `molecule COMMAND`

pipenv run molecule test  # To run role test
# or
pipenv run molecule converge  # To run play with the role

Built With 🏗️

Ansible

Versioning 🗃️

For the versions available, see the tags on this repository.

Additionally, you can see what changed in each version in the CHANGELOG.md file.

Authors 🦸

See also the list of contributors who participated in this project.

License 🗒️

This project is licensed under the Apache 2.0 license - see the LICENSE file for details.

Contributing 👷

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.