2.2 Automated deployment and initialization of new CKAN instances on a PaaS (e.g. AWS) #15

Open
rossjones opened this issue Dec 10, 2014 · 5 comments

Comments

@rossjones

As a Cloud Admin I want to create (and manage) a CKAN instance (site) in my cloud 
farm so that it is live and available online at a URL

This is a very high-level user story. It breaks down into many smaller
ones, each corresponding to a key desired activity, such as:

* Create and remove a farm “environment” (e.g. DB server, VPN etc.)
* Create an instance within that environment (installed and ready but not live online)
* Activate an instance (make it live online)
    a. Set up any associated monitoring
* Deactivate (take it offline but don’t destroy it)
* Purge (destroy it plus all data, perhaps with a backup)
* Plugin install, activate, deactivate, uninstall (per instance (?))
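
One way to read the activities above is as a small state machine for a single instance. The sketch below is only an illustration: the state names and the set of allowed transitions are assumptions, not something this issue specifies.

```python
from enum import Enum


class State(Enum):
    CREATED = "created"    # installed and ready, but not live online
    ACTIVE = "active"      # live online
    INACTIVE = "inactive"  # taken offline, data kept
    PURGED = "purged"      # destroyed together with its data


# Allowed moves between states. This table is an assumption made for
# illustration; the issue does not define it.
TRANSITIONS = {
    State.CREATED: {State.ACTIVE, State.PURGED},
    State.ACTIVE: {State.INACTIVE},
    State.INACTIVE: {State.ACTIVE, State.PURGED},
    State.PURGED: set(),
}


def transition(current: State, target: State) -> State:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

In this sketch a live instance has to be deactivated before it can be purged, which is one possible safeguard rather than a stated requirement.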

Implementation notes:
    * Support must be provided for one or more major PaaS platforms such as AWS
       or OpenStack
    * This process should be fully automated, so that e.g. booting a new instance is
       one command on the command line or one click of a button
    * This functionality should be wrapped in a Python library so that it can be used
       to power a web application or similar (see later user stories)
    * Relevant information arising from all these operations (e.g. details of the
       farm, details of instances) must be persisted
       a. Details are not fully determined and are left to the implementor (and will intersect
             with other later user stories, e.g. re creating UIs). The suggestion is that config
             either be simple JSON or a basic DB
    * Bonus: nice UI for launching and monitoring (see next item)
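
As a rough illustration of the "simple JSON" option in the notes above, a Python library could keep farm and instance details in one JSON file. The class name `FarmRegistry`, its methods, and the file layout are all hypothetical; the notes deliberately leave these details to the implementor.

```python
import json
from pathlib import Path


class FarmRegistry:
    """Persist farm and instance details in a single JSON file (hypothetical layout)."""

    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.data = json.loads(self.path.read_text())
        else:
            self.data = {"farm": {}, "instances": {}}

    def save(self):
        # Rewrite the whole file on every change; fine for a small farm.
        self.path.write_text(json.dumps(self.data, indent=2, sort_keys=True))

    def add_instance(self, name, url):
        if name in self.data["instances"]:
            raise KeyError(f"instance {name!r} already exists")
        # New instances start out installed but not live online.
        self.data["instances"][name] = {"url": url, "live": False}
        self.save()

    def set_live(self, name, live):
        self.data["instances"][name]["live"] = bool(live)
        self.save()

    def purge(self, name):
        # Only removes the record; destroying the actual site is out of scope here.
        del self.data["instances"][name]
        self.save()
```

A command-line tool or web application could then share the same file, for example `FarmRegistry("farm.json").add_instance("demo", "http://demo.example.org")`.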
@jqnatividad
Contributor

For AWS, Elastic Beanstalk supports this user story, and even supports additional features like using a managed database (RDS) and auto-scaling.

It also supports deploying from Docker Containers, which might be the way to abstract PaaS dependencies.

@waldoj waldoj changed the title 2.2 Automated deployment and initialization of new CKAN instances on a PaaS(e.g. AWS) 2.2 Automated deployment and initialization of new CKAN instances on a PaaS (e.g. AWS) Dec 17, 2014
@waldoj
Member

waldoj commented Dec 19, 2014

I find compelling Florian Mayer’s description of his Docker-based deployment, on ckan-dev, in which he writes:

we're deploying our CKAN using Docker linux containers. In our docker image
build process we copy out storage folders and the database from the (non
persistent) container into persistent directories within a BTRFS
snapshotting file system.
That simplifies a few things for us:

  • All read-only files (software, config, dependencies) are located within
    the Docker image, which also contains all installed extensions,
  • All read/write files, the installed / set up / populated database, plus
    uploaded attachments are located within the persistent folder, making
    migration a "build the image and copy the persist folder" job,
  • the snapshotting file system allows us to roll back the CKAN instance to
    a sane state, should bad things happen, instead of having to
    migrate/install.

In particular, I like the notion of keeping files that should be read-only as actual read-only files. That's better for security, that simplifies caching (read-only files are not going to change), and it simplifies backup.
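
A minimal sketch of that read-only / read-write split, assuming it is enforced with Docker's own `--read-only` and `--volume` options. This is stricter than what the quoted post describes (it only says where the files live), and the image name and paths are hypothetical.

```python
def docker_run_args(image, persist_dir, container_data_dir="/persist"):
    """Build a `docker run` argument list with a read-only root filesystem.

    `image`, `persist_dir` and `container_data_dir` are placeholders chosen
    for illustration; they do not come from the quoted setup.
    """
    return [
        "docker", "run", "--detach",
        # Software, config and extensions inside the image cannot be modified.
        "--read-only",
        # The database and uploads live in a writable folder on the host.
        "--volume", f"{persist_dir}:{container_data_dir}",
        image,
    ]
```

The list can be passed to `subprocess.run(..., check=True)`. A real CKAN container would probably need further writable locations (temporary files, for instance), so this shows the idea rather than a working deployment.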

@florianm

Hi @waldoj,

we actually moved away from Docker containers towards a dedicated AWS VM.

Our docker setup will probably be useful for a stable, only occasionally updated CKAN version. I guess we'll docker CKAN 2.4.

The AWS VM in contrast is perfect for tinkering with the latest master branches of various plugins, and we also snapshot our file system (btrfs ftw!) so we can recover from git mess-ups. Just as in the Docker setup, we separated out valuables (postgres datadir and storage dir) into a dedicated (also snapshotted) folder.
I would probably not run things this way with software in charge of finances or emergency calls, but CKAN? Absolutely fine. Never more than an ssh session or git checkout or, worst case, a filesystem restore away from sanity.

Our main reason for moving to AWS was that the latest CKAN master with a few customised extensions fixes some critical bugs (resources disappearing was a big one) and gives us some custom features we require. However, rolling that into a Docker image would take an order of magnitude more effort and would be outdated too quickly.
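
For readers unfamiliar with the snapshot step mentioned in this comment, it could be scripted roughly as follows. The sketch assumes the folder holding the valuables is a btrfs subvolume; the function name, paths, and timestamp naming scheme are illustrative, not the setup described here.

```python
from datetime import datetime, timezone


def snapshot_command(subvolume, snapshot_dir, now=None):
    """Build the command for a read-only btrfs snapshot named by UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    name = now.strftime("%Y%m%dT%H%M%SZ")
    # `-r` makes the snapshot read-only, so it cannot be altered by accident.
    return ["btrfs", "subvolume", "snapshot", "-r",
            subvolume, f"{snapshot_dir}/{name}"]
```

Running the result with `subprocess.run(cmd, check=True)` before pulling new code would give a point to roll back to.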

@waldoj
Member

waldoj commented Jan 15, 2015

> Our docker setup will probably be useful for a stable, only occasionally updated CKAN version. I guess we'll docker CKAN 2.4. The AWS VM in contrast is perfect for tinkering with the latest master branches of various plugins, and we also snapshot our file system (btrfs ftw!) so we can recover from git mess-ups.

This was a really helpful distinction, @florianm—thank you for breaking it down like this!

@wardi

wardi commented Mar 21, 2015

ckan-multisite will be set up to run on any bare-metal server or VPS that lets you run Docker. If you want to mount your databases and files with a snapshotting file system you're free to do that, because they're all stored in predictable locations on disk. Backups are easy: all the user data is in one place on the host filesystem (mounted as volumes by datacats) and the code is in another.
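
Because the user data sits in a single host directory, a backup can in principle be one archive of that directory. The sketch below uses only the Python standard library. The function name and the paths are hypothetical and are not the actual datacats or ckan-multisite file locations.

```python
import tarfile
from pathlib import Path


def backup_data_dir(data_dir, archive_path):
    """Write a gzip-compressed tar archive of `data_dir` and return its path."""
    data_dir = Path(data_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store entries relative to the directory name, not the absolute path.
        tar.add(str(data_dir), arcname=data_dir.name)
    return Path(archive_path)
```

If the directory contains a running database's files, the database should be stopped or snapshotted first, otherwise the copy may not be consistent.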

Creating and deploying instances will be from a web interface. Creating and removing a "farm environment" means installing or removing ckan-multisite, which will have simple instructions and few dependencies (docker, nginx, pip, virtualenv...).

I've created an issue for the install procedure datacats/ckan-multisite#5 and another for documenting the file locations on the server datacats/ckan-multisite#6
