xebis/infrastructure-template

Infrastructure Template

Template to automate GitOps and IaC in a cloud. GitLab CI manages static and dynamic environments, which are created, updated, and destroyed by Terraform, then set up by cloud-init and Ansible.

The project is under active development. The project is a fork of xebis/repository-template.

The Goal

The goal is a GitOps repository that automatically handles the environment life cycle: creation, update, configuration, and eventually destruction.

GitOps = IaC + MRs + CI/CD

GitLab: What is GitOps?

Features

Optimized for GitHub flow, easily adjustable to GitLab flow or any other workflow.

Example of the full workflow

Automatically checks conventional commits, validates Markdown, YAML, shell scripts, and Terraform (HCL), and runs static security analysis, terraform-docs, tests, deployments, releases, and so on. See xebis/repository-template (a well-manageable and well-maintainable repository template) and Notes and References for the full feature list.

Environments are managed in stages:

  • Deploy: overarching name for the provision and install stages
    • Provision: the environment is provisioned by Terraform in Hetzner Cloud and pre-configured by cloud-init
    • Install: the environment is installed by Ansible over SSH
  • Destroy (only dynamic environments): the environment is removed from Hetzner Cloud by Terraform

Deploy and destroy in more detail

Automatically managed environments:

  • A release tag runs the production environment stages
  • A commit on the main branch runs the staging environment stages
    • It also releases and creates a release tag when a commit starting with feat or fix is present in the history since the previous release
  • A pre-release tag runs the testing/tag environment stages and schedules automatic destruction after 1 week (earlier manual destruction is possible)
  • A commit on a non-main branch runs the development/branch environment stages under certain conditions and schedules automatic destruction after 1 day (earlier manual destruction is possible):
    • It runs when the environment already exists or existed in the past (when the Terraform backend returns HTTP status code 200 OK for the environment state file)
    • It runs when the pipeline is run by the pipelines API or GitLab ChatOps, or is created by using a trigger token, the Run pipeline button in the GitLab UI, or the GitLab WebIDE
    • It runs when the pipeline is created by a git push event or is a scheduled pipeline, but only if the environment variable ENV_CREATE or CREATE_ENV is non-empty
    • It doesn't run when the environment variable ENV_SKIP or SKIP_ENV is present, or the commit message contains [env skip] or [skip env], using any capitalization
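
The mapping from Git ref to environment listed above can be sketched as a small shell function. This is only an illustration, not the template's actual .gitlab-ci.yml rules: the function name environment_for_ref is hypothetical, the main branch is assumed to be called main, release tags are assumed to carry the same v prefix as pre-release tags, and the pre-release check is deliberately simplified.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: prints the environment whose
# stages would run for a given Git ref.
# Usage: environment_for_ref <tag|branch> <ref-name>
environment_for_ref() {
    local kind="$1" name="$2"
    # MAJOR.MINOR.PATCH core with an assumed "v" prefix
    local core='v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)'

    if [ "$kind" = "tag" ]; then
        if printf '%s\n' "$name" | grep -Eq "^${core}(\+[0-9A-Za-z.-]+)?\$"; then
            echo production   # release tag
        elif printf '%s\n' "$name" | grep -Eq "^${core}-[0-9A-Za-z.-]+(\+[0-9A-Za-z.-]+)?\$"; then
            echo testing      # pre-release tag (simplified check)
        else
            return 1          # any other tag maps to no environment
        fi
    elif [ "$name" = "main" ]; then
        echo staging
    else
        echo development
    fi
}
```

For example, environment_for_ref tag v1.2.3-rc.1 prints testing, while environment_for_ref branch main prints staging.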

Decision flow for creating a development/branch environment (figure: Development environment create or not).
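
The create-or-not decision for a development/branch environment can be approximated in shell. This is a minimal sketch under stated assumptions, not the pipeline's real implementation: the function name should_deploy_dev and its arguments are hypothetical, the skip rules are assumed to take precedence over the create rules, and the pipeline source values are assumed to be those of GitLab's CI_PIPELINE_SOURCE variable.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template.
# Usage: should_deploy_dev <pipeline-source> <state-http-status> <commit-message>
# Returns 0 when the development/branch environment should be deployed.
should_deploy_dev() {
    local source="$1" state_status="$2" commit_message="$3" lowered
    lowered="$(printf '%s' "$commit_message" | tr '[:upper:]' '[:lower:]')"

    # Skip rules (assumed to win over every create rule)
    if [ -n "${ENV_SKIP+set}" ] || [ -n "${SKIP_ENV+set}" ]; then
        return 1
    fi
    case "$lowered" in
        *'[env skip]'* | *'[skip env]'*) return 1 ;;
    esac

    # The environment exists or existed: the backend returned 200 for its state
    if [ "$state_status" = "200" ]; then
        return 0
    fi

    case "$source" in
        # API, ChatOps, trigger token, Run pipeline button, WebIDE
        api | chat | trigger | web | webide) return 0 ;;
        # Push or schedule only with a non-empty create variable
        push | schedule)
            if [ -n "${ENV_CREATE:-}" ] || [ -n "${CREATE_ENV:-}" ]; then
                return 0
            fi
            ;;
    esac
    return 1
}
```

Inside a job, a call could look like should_deploy_dev "$CI_PIPELINE_SOURCE" "$status" "$CI_COMMIT_MESSAGE", where status is a hypothetical variable holding the HTTP status code returned for the state file.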

Manually managed environments:

  • All stages must be run manually and locally

Environment

Creates one machine for development and testing environments, or intentionally zero machines for staging and production environments. Each machine uses Xebis Ansible Collection roles:

  • xebis.ansible.system: Well maintained operating system - updates and upgrades deb packages including autoremove and autoclean, reboots the system (when necessary), provides Reboot machine handler
  • xebis.ansible.firewall: Extensible nftables firewall - installs nftables and sets up basic extensible nftables chains and rules, provides Reload nftables handler, see GitHub: xebis/xebis-ansible-collection/README.md for usage, configuration, and examples
  • xebis.ansible.fail2ban: Fail2ban service - installs fail2ban and sets it up as a systemd service
  • xebis.ansible.iam: IAM - creates user groups and users as regular users or admins, their public SSH keys, disables password remote logins, provides Restart sshd handler, see GitHub: xebis/xebis-ansible-collection/README.md for usage, configuration, and examples
  • xebis.ansible.bash: Extensible Bash - installs ~/.bash_aliases and sets up basic extensible Bash aliases, see GitHub: xebis/xebis-ansible-collection/README.md for usage, configuration, and examples
  • xebis.ansible.starship: Starship CLI prompt - installs starship and sets up improved PowerLine configuration
  • xebis.ansible.admin: Administration essentials - installs and sets up at, curl, htop, mc, screen

Each machine uses LabLabs RKE2 Ansible Role:

  • lablabs.rke2: Ansible Role to install RKE2 Kubernetes.

Caveats

One Hetzner Cloud project is used for all environments, which brings a few caveats to keep in mind:

  • To distinguish machines between environments and to separate them from manually created machines, Terraform names them with the prefix env-slug- and labels them env=env-slug
  • To use Ansible, replace env-slug in the inventory file hcloud.yml with the environment slug before any local manual use
  • Use the Ansible group env instead of the groups all or hcloud, as these groups contain all machines from all environments and possibly manually created machines as well
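
One way to avoid editing the committed hcloud.yml in place is to render a copy with the slug filled in. The sketch below assumes only that the placeholder is the literal string env-slug, as described above. The function name render_inventory is hypothetical, and the restriction of slugs to lowercase letters, digits, and hyphens is an assumption made here to keep the sed expression safe.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: writes a copy of an inventory
# file with every "env-slug" placeholder replaced by the given slug.
# Usage: render_inventory <slug> <source-file> <destination-file>
render_inventory() {
    local slug="$1" src="$2" dest="$3"
    # Assumption: slugs consist of lowercase letters, digits, and hyphens only
    case "$slug" in
        '' | *[!a-z0-9-]*) return 1 ;;
    esac
    sed "s/env-slug/$slug/g" "$src" > "$dest"
}
```

The rendered file can then be passed to Ansible with -i, targeting the env group rather than all or hcloud.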

Images

Installation and Configuration

Prepare Hetzner Cloud API token and GitLab CI SSH keys:

  • Hetzner Cloud - referral link with €20 credit
    • Hetzner Cloud Console -> Projects -> Your Project -> Security -> API Tokens -> Generate API Token Read & Write
  • Generate GitLab CI SSH keys with ssh-keygen -t rsa (no passphrase; write the key to your secret file and do not commit it). A file with the .pub extension is generated automatically. Put the *.pub file contents into cloud-config.yml under section users:name=gitlab-ci as the first element of ssh_authorized_keys, and commit it
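
The key generation step can be run non-interactively. The following is a sketch only: the function name generate_ci_key, the file name gitlab-ci, the comment gitlab-ci, and the 4096-bit key size are assumptions added here, while the source only prescribes ssh-keygen -t rsa without a passphrase.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: generates a passphrase-less
# RSA key pair for GitLab CI in the given directory.
# Usage: generate_ci_key <directory>
generate_ci_key() {
    local dir="$1"
    # -N "" sets an empty passphrase, -f sets the private key path;
    # the public key is written next to it with the .pub extension
    ssh-keygen -q -t rsa -b 4096 -N "" -C "gitlab-ci" -f "$dir/gitlab-ci"
}
```

Choose a directory outside the repository, so the private key cannot be committed by accident.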

Set up GitLab CI

  • GitLab -> Settings
    • General > Visibility, project features, permissions > Operations: on
    • CI/CD > Variables:
      • Add variable: Key HCLOUD_TOKEN, Value <token>
      • Add variable: Key GL_CI_SSH_KEY, Value contents of your secret file created by ssh-keygen -t rsa above

Set up Local Usage

Make sure GL_TOKEN (a GitLab Personal Access Token with the api scope) is present, otherwise gitlab-ci-linter is skipped. To provision Hetzner Cloud with Terraform, set TF_HTTP_PASSWORD, HCLOUD_TOKEN, TF_VAR_ENV_NAME, and TF_VAR_ENV_SLUG, which are required by the Terraform configuration. To load secrets you can use a shell extension like direnv, encryption like SOPS, or a secrets manager like HashiCorp Vault. Make sure you do not commit your secrets.

export GL_TOKEN="<token>" # Your GitLab's personal access token with the api scope
export TF_HTTP_PASSWORD="$GL_TOKEN" # Set password for Terraform HTTP backend
export HCLOUD_TOKEN="<token>" # Your Hetzner API token
export TF_VAR_ENV_NAME="<environment>" # Replace with the environment name
export TF_VAR_ENV_SLUG="<env>" # Replace with the environment slug
  • Install repository dependencies with sudo scripts/bootstrap, set up the repository with scripts/setup, and update it with scripts/update.
  • Set up all admins and users, including their public SSH keys, in ansible/group_vars/all.yml under section users, see the documentation there. Do not forget to commit it 😀
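
Before running Terraform locally, it can help to confirm that all the variables listed above are set. This is a minimal sketch; the function name check_required_env is hypothetical and the script is not part of the repository's scripts/ directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: reports which of the
# variables needed for local usage are unset or empty.
check_required_env() {
    local name value missing=""
    for name in GL_TOKEN TF_HTTP_PASSWORD HCLOUD_TOKEN TF_VAR_ENV_NAME TF_VAR_ENV_SLUG; do
        # Indirect lookup that also works in plain POSIX sh
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "Missing variables:$missing" >&2
        return 1
    fi
}
```

The function prints only variable names, never their values, so it is safe to run in a shared terminal.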

Usage

GitLab CI

  • Commit and push to run validations
  • Push a non-main branch
    • To create a development/branch environment you have to create a new pipeline for the branch using API, GitLab ChatOps, trigger token, or by using the Run pipeline button in the GitLab UI
    • Alternatively, you can create a development/branch environment directly by pushing or scheduling with ENV_CREATE or CREATE_ENV environment variable present, for example by running git push -o ci.variable="CREATE_ENV=true"
    • Once created, the environment will be updated (or recreated if it was destroyed) with each subsequent pipeline on the branch
    • Environment deploy is skipped when the environment variable ENV_SKIP or SKIP_ENV is present, or commit message contains [env skip] or [skip env], using any capitalization, or when CI pipeline is skipped altogether, for example using git push -o ci.skip
    • Destroy the development/branch environment manually, or wait until auto-stop (1 day from the last commit in the branch in GitLab, can be overridden in the GitLab UI)
  • Create a pre-release tag to create a testing/tag environment
    • Pre-release tag must match regex ^v(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$, see https://regex101.com/r/G1OFXH/1
    • Destroy the testing/tag environment manually, or wait until auto-stop (1 week, can be overridden in the GitLab UI)
  • Merge to the main branch to create or update the staging environment
    • The production environment is created or updated when a commit starting with feat or fix is present in the history since the previous release

Release and pre-release tags must be valid SemVer strings, see Semantic Versioning 2.0.0: Is there a suggested regular expression (RegEx) to check a SemVer string?
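
A tag name can be checked locally before pushing. The sketch below translates the pre-release regex above into POSIX extended regular expression syntax for grep -E (\d becomes [0-9], non-capturing groups become plain groups). The function name is_prerelease_tag is hypothetical, and the translation is an approximation written for this example, not taken from the pipeline.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: succeeds when the argument
# looks like a pre-release tag such as v1.2.3-rc.1.
is_prerelease_tag() {
    local num='(0|[1-9][0-9]*)'
    # One dot-separated pre-release identifier
    local id='(0|[1-9][0-9]*|[0-9]*[a-zA-Z-][0-9a-zA-Z-]*)'
    # Optional build metadata such as +build.5
    local build='(\+[0-9a-zA-Z-]+(\.[0-9a-zA-Z-]+)*)?'
    printf '%s\n' "$1" |
        grep -Eq "^v${num}\.${num}\.${num}-${id}(\.${id})*${build}\$"
}
```

Because the pre-release part is mandatory in this pattern, a plain release tag such as v1.2.3 is rejected.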

Local Usage

Initialize local workspace if not yet initialized:

# Init local workspace
pushd terraform
terraform init -reconfigure \
    -backend-config="address=https://gitlab.com/api/v4/projects/31099306/terraform/state/$TF_VAR_ENV_SLUG" \
    -backend-config="lock_address=https://gitlab.com/api/v4/projects/31099306/terraform/state/$TF_VAR_ENV_SLUG/lock" \
    -backend-config="unlock_address=https://gitlab.com/api/v4/projects/31099306/terraform/state/$TF_VAR_ENV_SLUG/lock"
  • Create or update environment by terraform apply or terraform apply -auto-approve
  • Get nodes IP addresses by terraform output nodes_ipv4_addresses
  • Direct SSH by ssh user@$(terraform output -raw nodes_ipv4_addresses)
  • Ansible:
    • Change to Ansible configuration directory pushd ../ansible
    • First replace hcloud.yml string env-slug with $TF_VAR_ENV_SLUG: sed -i "s/env-slug/$TF_VAR_ENV_SLUG/" hcloud.yml
    • List or graph inventory: ansible-inventory -i hcloud.yml --list # or --graph
    • Ping: ansible -u user -i hcloud.yml env -m ansible.builtin.ping
    • Get all facts: ansible -u user -i hcloud.yml env -m ansible.builtin.setup
    • Configure with playbook: ansible-playbook -u user -i hcloud.yml playbook.yml
    • Change back to Terraform configuration directory popd
  • Destroy the environment by terraform destroy or terraform destroy -auto-approve, and go back to the repository root directory by popd
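
The three backend addresses passed to terraform init above follow one pattern, so they can be derived from a project ID and an environment slug. This is an illustrative sketch; the function name backend_config_args is hypothetical, and the gitlab.com host is taken from the init command above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the template: prints the -backend-config
# arguments for the GitLab-managed Terraform state of one environment.
# Usage: backend_config_args <gitlab-project-id> <environment-slug>
backend_config_args() {
    local project_id="$1" slug="$2"
    local base="https://gitlab.com/api/v4/projects/${project_id}/terraform/state/${slug}"
    # Lock and unlock share the same URL
    printf '%s\n' \
        "-backend-config=address=${base}" \
        "-backend-config=lock_address=${base}/lock" \
        "-backend-config=unlock_address=${base}/lock"
}
```

Since the output contains no spaces, it could be used with word splitting, for example terraform init -reconfigure $(backend_config_args 31099306 "$TF_VAR_ENV_SLUG").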

Uninitialize local workspace if you wish:

rm -rf terraform/.terraform # Uninit local workspace, this step is required if you would like to work with another environment

Commit and push to run validations.

Terraform Configuration Documentation

Contributing

Please read CONTRIBUTING for details on our code of conduct, and the process for submitting merge requests to us.

Testing

  • Git hooks check a lot of things for you, including running the automated tests with scripts/test full
  • Make sure all scripts/*, git hooks, and GitLab pipelines work as expected. Testing checklist:
  • scripts/* scripts - covered by unit tests tests/*
  • Local working directory
    • git commit runs pre-commit hook-type commit-msg and scripts/pre-commit
    • git merge
      • Fast-forward shouldn't run any hooks or scripts
      • Automatically resolved merge commit runs pre-commit hook-type commit-msg and scripts/pre-commit
      • Manually resolved merge commit runs pre-commit hook-type commit-msg and scripts/pre-commit
    • git push runs scripts/pre-push
    • Terraform and Ansible
      • terraform init
      • terraform plan
      • terraform apply
      • ansible ... ping
      • ansible-playbook
      • terraform destroy
  • GitLab CI
    • Commit on a new non-main branch runs validate:lint and validate:test-full
      • Without any environment variables, does not run provision:provision-dev, install:install-dev, or destroy:destroy-dev
      • With non-empty environment variable ENV_CREATE or CREATE_ENV, runs provision:provision-dev, install:install-dev, and prepares destroy:destroy-dev
    • Commit on an existing non-main branch within 24 hours runs provision:provision-dev, install:install-dev, and prepares destroy:destroy-dev
    • Absence of commit on an existing non-main branch within 24 hours auto-stops development/branch environment
    • Pre-release tag on a non-main branch commit runs validate:lint, validate:test-full, provision:provision-test, install:install-test, and prepares destroy:destroy-test
      • After a week auto-stops testing/tag environment
    • Merge to the main branch runs validate:lint, validate:test-full, provision:provision-stage, install:install-stage, and release:release
      • With a new feat or fix commit, releases a new version
      • Release tag on the main branch commit runs validate:lint, validate:test-full, provision:provision-prod, and install:install-prod
      • Without a new feat or fix commit, does not release a new version
    • Scheduled (nightly) pipeline runs validate:lint and validate:test-nightly

Test in a Docker Container

To test your changes in a different environment, you might try to run a Docker container and test it from there.

Run a disposable Docker container:

  • sudo docker run -it --rm -v "$(pwd)":/infrastructure-template alpine:latest
  • sudo docker run -it --rm -v "$(pwd)":/infrastructure-template --entrypoint sh hashicorp/terraform:light
  • sudo docker run -it --rm -v "$(pwd)":/infrastructure-template --entrypoint sh gableroux/ansible:latest
  • sudo docker run -it --rm -v "$(pwd)":/infrastructure-template --entrypoint sh node:alpine

In the container:

cd infrastructure-template
# Set variables GL_TOKEN and GH_TOKEN when needed
# Put here commands from .gitlab-ci.yml job:before_script and job:script
# For example job test-full:
apk -U upgrade
apk add bats
bats tests
# Result is similar to:
# 1..1
# ok 1 dummy test

To-Do list

  • Fix workaround for pre-commit jumanjihouse/pre-commit-hooks hook script-must-have-extension - *.bats shouldn't be excluded
  • Fix workaround for pre-commit local hook shellcheck - shellcheck has duplicated parameters from .shellcheckrc, because these are not taken into account

Roadmap

Credits and Acknowledgments

Copyright and Licensing

Changelog and News

Notes and References

Dependencies

Recommendations

Suggestions

Further Reading