In order to run studies against EMIS with the same workflow that we use with TPP, we need to make changes to several parts of the system. This is my initial braindump of what we need to do and I'm sure I've missed things. Please edit to add your notes, and then let's discuss in a call on Tuesday or Wednesday this week.
users need to be able to select which backend they run jobs against
initially, we want to hide the ability to run against EMIS by default, and have specific opt-ins for supported repos. Longer term we will need to bake-in the concept of which backends are required or able to run a given study definition (see below). Initial implementation could just be a database flag which is false by default and we manually enable for specific repos.
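As a sketch of the "database flag, false by default" idea: a small helper that resolves which backends a repo may target, with an allow-list we maintain by hand until project.yaml-level support lands. The repo names and function here are hypothetical, purely illustrative.

```python
# Hypothetical sketch: per-repo opt-in for the EMIS backend.
# EMIS is hidden by default; we manually add approved repos to this set
# (in practice this would be a boolean column on the workspace/repo record).
APPROVED_EMIS_REPOS = {"opensafely/example-approved-study"}  # hypothetical


def available_backends(repo: str) -> list[str]:
    """Return the backends a repo's jobs may be run against."""
    backends = ["tpp"]  # TPP is always available
    if repo in APPROVED_EMIS_REPOS:
        backends.append("emis")
    return backends
```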
/status/ will need to show information about multiple backends
Think about error reporting / sentry / etc
questions:
should a user be able to run a job against multiple backends? (@sebbacon's best placed to know what users will expect)
my (Seb's) view is that UI support would be via an extension of the usual "run" mechanism; "run all" would select everything as it currently does, but there are now N * M tickboxes where N is backends and M is actions. Users could uncheck a specific backend if they wanted (actions would be grouped visually by backend in the UI); if they want to completely exclude a backend from the UI they'd do this by editing their project.yaml (see below). "Supported backends" would show somewhere obvious on a workspace header.
longer term we will need to bake-in the concept of which backends are required or able to run a given study definition. This will probably be on a per-column basis; you might be able to extract patient ages in EMIS but not SGSS status, for example. Users should also be able to define which backends are included or excluded in their project.yaml, i.e. to skip the TPP backend completely (for example, because they just don't need it, so runs are faster)
level 2 access should be limited to a handful of engineers with NHSE contracts
level 3 access should be limited to researchers with appropriate NHSE contracts
level 4 access can be wider
what are the requirements here?
set up directory structure necessary for high/medium privacy outputs
ensure that we can pull repos from GH
ensure that we can push output to GH
ensure users can install opensafely cli tool and os-release script (see docs.opensafely.org)
harden and build out software installations
consider asking for an ubuntu container within which we have root?
job runner as a service - systemd?
scripted installation - at least the basics we can build on, supplemented by installation narrative if needed as stopgap
backups if necessary
log rotation, disk space monitoring, root cron emails setup, etc, if we have root access. If not, conversation with EMIS about what they've set up
work out how to support viewing, editing, publishing outputs
For viewing outputs, a web browser should be fine (pdfs, svgs, html and text)
For diffing outputs, command line git may be sufficient but visual would be ideal: Github Desktop or at a push, gitweb?
For redacting outputs, a text editor is needed. We could consider mandating VS Code for simplicity, but worth canvassing
For publishing outputs, we may need to install Github Desktop, although again we might be able to mandate command line git (if we provide adequate documentation)
I think it boils down to either (a) provisioning a Windows review server with access to L4 data or (b) providing a web browser and expecting command line tool usage. And I think (a) is probably unavoidable
The simplest thing that works is to provide the value of the BACKEND environment variable to actions.
As this gives people a gun which they can aim at their feet, this should come with best practice guidance:
How to write a study definition with conditionals
How to structure your pipeline so most of its actions are indifferent to which backend they are run on (this might be "write a normalising action as the second pipeline step", for example)
How to test against different backends
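To make the foot-gun concrete, here is a minimal sketch of what "a study definition with conditionals" on BACKEND might look like. The helper and column names are hypothetical (and the claim that SGSS data is unavailable in EMIS is used purely as the illustrative example from above), not an agreed API.

```python
import os


def get_backend() -> str:
    # Default to "tpp" for backwards compatibility with existing studies.
    return os.environ.get("BACKEND", "tpp")


def sgss_columns() -> dict:
    # Hypothetical per-backend conditional: only request SGSS-derived
    # columns on backends where that data exists (illustrative only).
    if get_backend() == "emis":
        return {}
    return {"sgss_positive_date": "date_of_first_positive_test"}
```

Best-practice guidance would then push users towards isolating conditionals like this in one place, so the rest of the pipeline stays backend-indifferent.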
ACTION: write up some example project.yaml files and consider the implications for the implementation, particularly how this complicates dependency resolution. You would want to take a single tree of multiple dependencies and project it into a single per-backend tree as soon as possible, and inspect that.
ACTION: We should also add a top-level run_on (or similar) key to project.yaml which can default to tpp (for backwards compatibility).
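A sketch of how the proposed run_on key might be read, defaulting to tpp when absent. The function name and the decision to accept either a string or a list are assumptions for illustration; the real behaviour would be settled during implementation.

```python
def backends_for(project: dict) -> list[str]:
    """Return the backends a parsed project.yaml targets.

    Defaults to ["tpp"] for backwards compatibility with existing
    projects that have no run_on key (hypothetical semantics).
    """
    run_on = project.get("run_on", ["tpp"])
    if isinstance(run_on, str):
        run_on = [run_on]  # allow shorthand `run_on: emis`
    return run_on
```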
ACTION: We need to consider that publishing outputs will often result in name collisions; the os-release script should be updated to namespace with the backend.
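One possible shape for that namespacing, assuming we insert the backend as a directory component rather than mangling filenames; this is just one option to discuss, not what os-release currently does.

```python
from pathlib import Path


def namespaced_output(path: str, backend: str) -> Path:
    # Hypothetical scheme: output/counts.csv -> output/emis/counts.csv,
    # so releases from different backends cannot collide.
    p = Path(path)
    return p.parent / backend / p.name
```

The alternative (embedding the backend in the filename itself, e.g. counts_emis.csv) might suit analysts better when combining outputs side by side; worth deciding alongside the jobrunner output-path question below.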
ACTION: Also consider including backend in jobrunner output paths; will "backends exist" help as a concept for people to understand when doing local runs? There is also an argument for including the backend in filenames to help analysts when they are manually combining and comparing outputs from different backends
ACTION: wireframe the resulting UI to help users visualise what they've requested when multiple backends are involved
ACTION: sysadmin side should be scripts in opensafely/sysadmin repo and sufficient accompanying playbook documentation to make it easy for us to set it up again
ACTION: consider RDP server & ssh authentication in EMIS environment; test Github Desktop in that environment