diff --git a/.github/ISSUE_TEMPLATE/general.yml b/.github/ISSUE_TEMPLATE/general.yml index 15928a7..cfc2d60 100644 --- a/.github/ISSUE_TEMPLATE/general.yml +++ b/.github/ISSUE_TEMPLATE/general.yml @@ -11,7 +11,6 @@ body: This issue will be added to the backlog of our project board. If you would like to take it on, please assign it to yourself! All issues need to have someone assigned to/owning them in order to progress. Whilst this is not the case we'll keep the issue safe in the backlog! - - type: textarea id: detail attributes: @@ -19,6 +18,13 @@ body: description: Please explain your issue in detail validations: required: true + - type: textarea + id: actions + attributes: + label: Actions + description: Please add any actions to be undertaken below. You can create a checklist by typing `- [ ] x` for each action point + validations: + required: true - type: textarea id: tags attributes: diff --git a/README.md b/README.md index df0be67..ecf420d 100644 --- a/README.md +++ b/README.md @@ -19,7 +19,7 @@ This repository is for the UK-TRE community website hosted on Read the Docs http Anyone can join our mailing list and attend our meetings, you do not need to provide any information other than your email address. -:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1 +:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1 :Slack channel: https://ukrse.slack.com/archives/C045ETUPPD0 # :family: Community and Support diff --git a/docs/Makefile b/docs/Makefile index d4bb2cb..b50c24d 100644 --- a/docs/Makefile +++ b/docs/Makefile @@ -3,7 +3,7 @@ # You can set these variables from the command line, and also # from the environment for the first two. -SPHINXOPTS ?= +SPHINXOPTS ?= -W SPHINXBUILD ?= sphinx-build SOURCEDIR = . 
BUILDDIR = _build diff --git a/docs/_templates/footer-links.html b/docs/_templates/footer-links.html index 10f0a48..576d8a3 100644 --- a/docs/_templates/footer-links.html +++ b/docs/_templates/footer-links.html @@ -1,4 +1,4 @@ - Mailing list
+ +
Open in new window
+ + +``` + +You can subscribe to the calendar, e.g. in Outlook or Google Calendar, by importing [this ICS link](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/0.ics) to your calendar. + +You can also subscribe to a subset of events: + +- [Official events](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13011531.ics) +- [Other TRE events](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13011530.ics) +- [WG - Community management](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13014371.ics) +- [Working groups](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13014372.ics) + +Here you'll find events the community is organising and engaged in. You can also find reports on past events hosted by the community, as well as a schedule of upcoming events and information on how to get involved. + ```{toctree} :maxdepth: 2 wg_workshops/index ``` - -Here you'll find events the community is organising and engaged in. You can also find reports on past events hosted by the community, as well as a schedule of upcoming events and information on how to get involved. diff --git a/docs/events/wg_workshops/2023-03-29-march-meeting/index.md b/docs/events/wg_workshops/2023-03-29-march-meeting/index.md index b8a6ea8..84ebc75 100644 --- a/docs/events/wg_workshops/2023-03-29-march-meeting/index.md +++ b/docs/events/wg_workshops/2023-03-29-march-meeting/index.md @@ -229,7 +229,6 @@ The group discussed first the feasibility of a common language around risk and t - [Alan Turing Institute](https://arxiv.org/pdf/1908.08737.pdf) - Sheffield used this as the basis of their system for assessing risk. 
-- [NIST RMF](https://csrc.nist.gov/projects/risk-management/about-rmf) - [NCSC](https://www.ncsc.gov.uk/collection/risk-management) - [Harvard DataTags](https://github.com/IQSS/DataTaggingLibrary) - [UK Data Service data types](https://ukdataservice.ac.uk/help/access-policy/types-of-data-access/) diff --git a/docs/events/wg_workshops/2023-03-29-march-meeting/project-systematic-data-risk-classification.md b/docs/events/wg_workshops/2023-03-29-march-meeting/project-systematic-data-risk-classification.md index f99fc60..e1cae34 100644 --- a/docs/events/wg_workshops/2023-03-29-march-meeting/project-systematic-data-risk-classification.md +++ b/docs/events/wg_workshops/2023-03-29-march-meeting/project-systematic-data-risk-classification.md @@ -19,7 +19,6 @@ _Chair: Will Crocombe (RISG Consulting)_ - 3 - weak pseudo - 4 - public - Dropping down tiers, things become easier. Turing paper on this - Sheffield used this as the basis of their system for assessing risk. -- https://zenodo.org/record/7754459 - [Alan Turing Institute paper](https://arxiv.org/pdf/1908.08737.pdf) - Importance of agreed risk classification with federation, and agreement on risk appetite - [NIST RMF](https://csrc.nist.gov/projects/risk-management/about-rmf) diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ai-ml-llm.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ai-ml-llm.md new file mode 100644 index 0000000..f70da52 --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ai-ml-llm.md @@ -0,0 +1,75 @@ +# Current state of the art re data linkage/federation/AI&ML&LLM across infrastructures: federation, governance, safe output methods + +## Overview + +### Summary + +Issues about federation of datasets were discussed, including identifying different datasets across multiple systems, how to collect identifiable information robustly, and how we can link up different approaches across the 4 nations effectively. 
+ +There was further discussion on how to effectively check ML models within TREs. + +In the case of governance, it was suggested that a project working across multiple TREs should have one singular governance process. + +### Next steps + +- Create a 'panel' focused on specific type of data/research (e.g. health, crime, financial) who can oversee specific research projects within these fields + +## Raw notes + +### Data Linkage + +#### How do you go about the NHS Number? + +- Uses NHS Standard NF5, after 3 they went to manual to track through the system. +- Issues with health and non-health data + +#### Names such as Dave / David can cause problems. + +- Linksmart is a solution for this. +- Collecting Crime Data + +#### Scotland's Approach + +- a national ID number + +### Federation between datasets + +- Identifying with confidence across TREs is important +- Problem: Linking health with something else is problematic to match up and link it with addresses and names +- Separation functions +- Person has all the identifying information, but they do not have the data +- TREs communications between each other need specific criteria, Scotland has 5 TREs +- Having more than two, and introducing a central one is a possibility +- Issues with identifying A-B data sets across multiple systems +- Seeding Death Data -- David and Debra Smith: D. Smith & D. Smith causes gender incompatibility issues +- National Drug Treatment Data -- At source they only collected initials 'D.S.', Gender and MM/YYYY of DOB. Deidentifying can cause linking problems. Education to non-education where they don't have their common 'number' -- how confident can we be that Participant A is the same participant in another TRE? If you're not sharing names & addresses +- Bringing in NHS data and also pseudo anonymise it -- how can you work with it without a key? +- Once you got a data linkage -- bringing the different data types into a data set (TRE). E.g. 
Linking mental health data and shopping data, if you anonymise that and have their own key -- they can do it anonymously for external sources +- Education data between England, Scotland and Wales might use different notations +- Residential Data can be used as a key +- 'E-child' trying to link the NHS with the Department of Education + +### AI & ML + +- People confuse the terms AI & ML with 'Statistical Modeling' +- Based on risk factors you can determine 70% precision pre-diabetic chance +- Accessing 'clinical like data' with similar terminology to mimic clinic systems +- AI -- Offline AI: you can have an offline machine learning model -- yes +- Would multiple AIs learn the same thing on same data sets? -- no +- You can make it work with a shared API though (Stroke Prediction) +- APRs -- 8-9 expensive centre +- Different type of interpretation of ML, ML data on health 'takes your job', ML data on other scenarios might be socially acceptable +- Pattern finding models are popular and precise, this is lacking in statistical modeling +- At the end of the day, with ML on medical data it is not understood why it gives that result +- Checking models is problematic and difficult; uncertain results and uncertain contents of the model raise the question of the model's authenticity + +### Governance + +- Process is repeated a lot, no committee talks to each other and are a separate entity +- Cannot start work unless approved +- Doing a project between TREs, each TRE will have an approval process, ideally a multi TRE Project requires a single approval process, this decision should be approved across the other one + +#### What would a solution to this problem look like? + +- Current state of the art is the overarching question -- needs a TRE panel to decide what is state of the art +- Single 'panel' on a specialty (e.g.
health, crime) who deal with specific projects, additionally members of the national TRE supervision diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-cloud-onprem.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-cloud-onprem.md new file mode 100644 index 0000000..f33a6cf --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-cloud-onprem.md @@ -0,0 +1,44 @@ +# Cloud vs on-prem TREs: costs, constraints, pros & cons + +## Overview + +### Summary + +The main decision drivers are security and cost. +Cloud is more flexible for projects with different funding sources and does not require an expensive data centre for research institutions but does not offer the highest levels of security. + +A potential solution is a hybrid model where you get a cloud-like infrastructure on on-prem compute. + +Cloud provision via Jisc (as opposed to direct with the cloud provider) can be cheaper and it also handles SSO: https://www.jisc.ac.uk/forms/uk-access-management-federation-sign-up# +Resources: Google RADLab: https://cloud.google.com/blog/topics/public-sector/googles-new-rad-lab-solution-helps-spin-cloud-projects-quickly-and-compliantly + +### Next steps + +- Develop a roadmap plan for a hybrid, cloud-agnostic model + +## Raw Notes + +- Compute capacity/ data centres for advanced ML projects is expensive for research institutions +- Credits make it easier to use cloud for projects with different funding sources +- Could a good solution be a hybrid model where you get a cloud-like infrastructure on on-prem compute + - So could be completely disconnected from internet for high security + - Google have set something like this up at Sanger +- Factors determining on-prem vs cloud + - security + - cost +- Cloud provision via Jisc (as opposed to direct with the cloud provider) can be cheaper and it also handles SSO: https://www.jisc.ac.uk/forms/uk-access-management-federation-sign-up# +- Resources: Google RADLab:
https://cloud.google.com/blog/topics/public-sector/googles-new-rad-lab-solution-helps-spin-cloud-projects-quickly-and-compliantly + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? +- What resources would be needed (people, time, funds, infrastructure etc.)? +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? + +#### Notes + +- hybrid model (see above) +- Solution that is cloud-agnostic and could also run on on-prem hardware diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-community-governance.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-community-governance.md new file mode 100644 index 0000000..42bb23a --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-community-governance.md @@ -0,0 +1,88 @@ +# Governance of the UK TRE Community + +## Overview + +### Summary + +The discussion centred about the purpose and governance of the community, trying to reach a balance between conveyors but still provide enough content and direction not to be an “empty” place. + +Universal selling point of UK-TRE: Diversity of audience, and pragmatism: people that are doing something. +Danger of just listening is you don’t share your existing knowledge of what will/won’t work. + +Should we put out position statements? Say things if you don’t like something? The community should reach a point where what we say is respected. +More powerful than individual submissions. + +What should UK-TRE do? +Be careful not to become just a bureaucratic institution that has some funding, people, writes reports. + +Maybe a network that feeds up to DARE/HDR/ADR? +USP would be it’s practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off. +Proper focus groups would be much more expensive. 
+ +Some funding for the community to organise meetings like this is needed. + +### Next steps + +- Secure funding for person time for the community +- Establish a steering group for the community + +## Raw Notes + +- UK-TRE: Aims, purposes, should it take on a political/advocacy role? +- NHS: already have their plans for Governance + - but looking promising so far +- Datapact: Part of Data saves lives policy + - Not policy, but saying how NHS will treat your data +- Don't want to force too much information on public: they'll think you're trying to hide something +- Public engagement: not just telling them what will happen, instead enable citizens to make policy decisions +- Interest in academia about what to do, waiting for NHS to give guidance +- UK-TRE should we lead, not just follow NHS + - Lead, provide input + - TREs are for much more than just healthcare data which NHS focusses on +- Universal selling point of UK-TRE: Diversity of audience, and pragmatism: people that are doing something +- Danger of just listening is you don't share your existing knowledge of what will/won't work +- Should put out position statements? Say things if you don't like something? The community should reach a point where what we say is respected. More powerful than individual submissions. +- Industry groups such as ABPI, BIO + - Provide inputs, write reports, represent a community and a voice +- Organisations need to sign up to show support + - Sign-up to UK-TRE? Or to position statements created by UK-TRE? + - E.g. IET (engineering professional institution) members can say what they're interested in on their profile. IET may respond to a Government consultation by asking members for input, and collating responses. +- Working groups/focus areas + - Needs resource/funding + - Does UKRI have something? + - Beyond UKRI, commercial? +- GA4GH: + - multiple levels of slices of funding + - 100s of organisations across 80 countries +- What should UK-TRE do? 
+ - Be careful not to become just a bureaucratic institution that has some funding, people, writes reports. + - Balance + - Maybe a network that feeds up to DARE/HDR/ADR? + - USP would be it's practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off + - Proper focus groups would be much more expensive + - Some funding for community to organise meetings like this + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? + - Ensure meetings remain attractive, not too officious + - Lightning talks good, reduces duplication + - Networking opportunities + - Long lunch + - People willing to invest time to travel + - "Stir people up and let them go" + - Beach! 🏖️ + - No different from what we've got now + - More recognisable branding + - A home? What does "home" mean? + - A formal recognisable figurehead +- What resources would be needed (people, time, funds, infrastructure etc.)? + - Funding for someone to be a formal chair of UK-TRE + - Neutral funding for someone to run community, not funded directly by a single institution + - Maybe multiple people? E.g. coordinator, chair, community manager (junior/senior?), technical? + - Elected chairs to propose direction/funding? Probably too much. + - Instead have a steering committee +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-data-harmonisation.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-data-harmonisation.md new file mode 100644 index 0000000..3397c9d --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-data-harmonisation.md @@ -0,0 +1,20 @@ +# Addressing data harmonisation between different datasets: do TREs have a role? 
+ +## Raw notes + +### Handwritten notes + +Transcribed by CMWG team + +Data+Analysis=Timely Processing + +- Harmonized/OMOPed +- TRE governanced barriers +- Reliability-validated? +- TRE role: cross project share +- DMOPin data sources & adding TRE Specific terms into main repositories +- Mapping tools +- TREs can delegate (CoConnect) + - Discovery + - Feasibility +- Clinical input diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-multi-tre-analysis.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-multi-tre-analysis.md new file mode 100644 index 0000000..eb0fe8f --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-multi-tre-analysis.md @@ -0,0 +1,75 @@ +# Multi-TRE analysis: challenges, governance requirements, federation + +## Overview + +### Summary + +For multi-TRE analysis to work, there needs to be trust between TREs. +This first relies on a shared understanding of what exactly a TRE is and needs to do, and will thrive when SOPs, accreditations and governance methodologies are shared. This would also benefit from shared understandings and language around architecture, sensitivity tiering and more. + +The more different TREs there are, the greater the risk of variability and bad practice, which can affect an entire system of federation. +Public trust is a large concern, and concerted effort will need to be made to ensure the public buys into any federated system of TREs. + +### Next steps + +Next steps focused on the short, medium and long term: + +- **Short term**: Define what a TRE is with a PEST framework, a TRE Maturity Model, common language for sensitivity Tiers +- **Medium term**: Review architectures for working between TREs, identify key roles and responsibilities in a federated landscape +- **Long term**: Focus on PPIE and public perception, how data is held and managed + +## Raw notes + +Take home message: It's not about the Technology.
In fact the more TREs technically enabled the more risk that the TREs are not fit for purpose for true operation and not trusted for federation. Process and Responsibility => Trust + +### Roadmap and Next Steps + +**Short term**: understanding what we have + +- Define what is a TRE, wrt to **multiple** TREs within a PEST framework that highlights issues that are not just technical, for example includes the diversity of TRE models, the business models of TREs, where risk, responsibility and accountability lay, and includes certifiable PROCESS as a core pillar (shared SOPs). Multi-TREs require new Processes. +- Define a TRE Maturity Model that builds on above to develop a more objective model of TRUST, RISK and RESPONSIBILITY for inter-TRE data exchange. Could be used to assess, compare, and facilitate trust between TREs. +- A common language scale for the ‘tiers’ of TREs suitable for different levels of inter-TRE sensitivity. +- Identify and clarify PEST bottlenecks with examples + +**Medium Term**: shifting to newer ways + +- Review different architectures and processes for working between TREs +- What would be just enough with what we already have (e.g. 5SROCrate as m-TRE middleware using current processes) +- What m-TRE processes would we need to introduce +- The role of trusted intermediaries (brokers, federated analytics services) to take on risk and responsibility and reposition the Data Sharing Agreements. e.g global identity services linking identities and records, who takes responsibility? + +**Long Term**: radical shift + +- PPIE education outside the PPIE self selecting bubble to counter mistrust of government and conspiracy theory +- Expectation that data is owned by the NHS? +- Rethink of data holdings and services from Data Warehouses to Data Fabrics. + +### Notes + +- What is a TRE ? + - Are they always repositories for single datasets, popup TRE? 
+ - Not always - many of the environments have multiple users and projects on top of the core dataset, with project-based access through VMs/virtual desktops. + - There is also a requirement for high performance computing for some datatypes (GPU for AI/imaging, workflows etc) +- Do we need federation? Can we avoid multiple TREs? + - Governance requirements vary between data classes - you may need TREs to meet each set of governance requirements. + - But each TRE is expensive to run, especially assurance, governance, data egress control. +- How do TREs know they can trust each other? + - When workflows have to be shared between environments, it is easier to share between those with similar accreditations - e.g. ISO27001. + - Federating TREs requires interoperability at the process level, shared SOPs etc. + - There will be many TREs built from the technological parts - but if there are poorly run ones, they will damage the whole 'brand' and impact on all TRE operators. More TREs, more risk. + - A 'maturity model' could be used to assess, compare, and facilitate trust between TREs. + - Legal obligations on individual TRE providers act as a strong constraint on data sharing; but a common list of questions might help. +- Can we develop a new brokered distributed/federated analytics model? + - We need a new model to allow this. + - TRE-FX type solutions need to be driven by TREs. + - Need a common language scale for the 'tiers' of TREs suitable for different levels of sensitivity. + - People need to query across datasets - there are few cases where you can answer the research question without linking identities and records. + - But a global identity connecting service would be a huge responsibility. +- How do we carry the public with us? + - Estonia have an opt-out system for health records, opt-in for genomics data; but when public confidence drops, opt-outs increase. + - Public perception of risk is a problem. + - In COVID, people were happy to share data.
+ - Even trust in NHS is not universal now... + - Education outside the TRE 'bubble' to counteract conspiracy theories etc. +- Do trusted data fabrics offer a different view? + - Networks of secure data services based on Enterprise data models. diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-nhs-sdes.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-nhs-sdes.md new file mode 100644 index 0000000..a4e1e4d --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-nhs-sdes.md @@ -0,0 +1,128 @@ +# The role of independent TRE providers in relation to the NHS national and regional SDEs + +## Overview + +### Summary + +This discussion made evident the multiplicity of current efforts and the difficulty to know what is happening in this space or the direction it is taking. + +The discussion identified many potential areas of work and collaboration, the use of NHS data held in SDEs via specific TREs can expand the utility of this data but requires a lot of coordination, not only between independent TREs but also across regions and institutions. +Challenges arise on how to make this coordination and alignment effective, reconcile different interests (commercial and public) and ensure public and clinicians' trust. + +The role of HDRUK and the UK TRE Community is seen as a positive influence. + +### Next steps + +- The best next step, before anything else, is to establish better connections with the NHS SDE network. + +## Raw notes + +### Prompts + +- What's your interest in this topic + - Interested to know how this will work + - Supporting some SNSDEs: how do we build on existing work? + - SNSDE project: want to work beyond NHS + - Commercial provider of TREs: where are we going? +- How are the national and regional SDEs planning to work with each other, and with other TREs? 
+ - Southern consortium: Wessex, Bristol + - Standardisation on governance and data access + - Competition: need to be sustainable +- What does sustainability look like? + - Charge for private/commercial access +- What are the limitations of the national and regional SDEs? + - Combining datasets from different SDEs + - Federation? + - Distrust of other infrastructure solutions + - Duplicated governance requirements + - Follow-up of cohorts with people who might have moved between areas +- How can data linkage avoid reidentification? + - Section 251 means this can be allowed + - Consent is a moving process + - Influence on national data opt-out +- Should an accredited non-NHS TRE be allowed to hold NHS data? + - Use-case for holding pan-UK data + - Onus on the TRE to adhere to NHS standards + - Questions of public trust + - What's the business case that no other TRE could provide? +- What assurances could/should a TRE be providing to the NHS? + - Reduce number of TREs: worst ones will bring down confidence in others + - Duplicative of effort and money + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? + - Uncertainty about what's going on + - Uncertainty about how to do future collaborative work + - NHS England working groups exist: a bit siloed + - HDR UK doing a good job + - Why are there SNSDEs at all? + - Access to primary care data + - Start from different starting points + - What happened to the NHS Data Pact? +- What resources would be needed (people, time, funds, infrastructure etc.)? + - Need to interact better with the NHS e.g. on researcher requirements +- How can this community support you in getting them? + - Awareness + - Collate information about who's doing what, try to avoid duplication. + - Contact person for researchers working on independent TREs +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? 
+ - HDR UK + - Each individual SDE + - Southern consortium of SDEs + - NHSE in flux: National SDE: Michael Chapman + +### Handwritten notes from day + +Potential areas + +1. Building upon successes; learnings of TREs for SDEs +2. Remit of NHS and other data sources for SDEs +3. What it means to continue to manage Independent TREs (and to build solutions for TREs) +4. Addressing data silos as well as cultural components to evolve TREs/SDEs +5. Data exchange opportunities and the potential for increased NHS data in existing TREs +6. Guiding principles for integration across TREs +7. National (English) GP data access opportunity +8. Cohesive strategies/approaches with SDE launches +9. Potential roles for other stakeholders and access to open source assets +10. Aligning to standards/protocols for data access, designing commercial & sustainability cases +11. Outstanding questions for data controllers vs data providers +12. SDEs: variety of build, buy supplier decisions underway +13. SDE potential limitations: reconciling differences across regions based on decentralized protocols (eg. neutral grant for data); governance challenges-risk of redundancies, inefficiencies, slow decision making; need for coordinated mechanism for cross-SDF initiatives +14. Big question about incentives: collaborative vs competitive mindset. Clarity on who is tackling what, and how to coordinate effectively. Difficulty introducing several new entities at once, at pace +15. Dual focus on local stakeholder engagement/approval alongside central policy development. Specific focus on touchpoints (eg. de-ID then re-ID) and how to manage at scale- a la section 251 +16. Setting expectations on the level of effort/amount of time needed for data linkage across several organisations +17. PPIE: aligning on short term project, mid/long term vision & value, and how to sustain + grow public trust in data usage for R&D (lessons learnt from Our Future Health?) +18.
Impact of opt-in/opt-out policies + level of participation: current state, future of TREs/SDEs and FDP ambitions (consider impact of coworking and misrepresentation of activity/outcomes) +19. Non-NHS TREs & ability to host data (health + otherwise). Would need to comply w/ national + SDF specific policies. Consent. Complicated issue esp for longitudinal datasets, common to point at 251 in England +20. Separating technical supplier managing TRE and ensuring data governance always remains in-house ie. with NHS organisations of independent providers of TREs. How to balance commercial activity+public trust +21. Leading with a business case for each TRE: what it offers that others don't, or can't. Need to convince public + stakeholder orgs. Implications for single-use TREs vs Single TRE that persists across many use cases +22. Risk of systematised data collection + analysis leading to clinician performance management and overhead ("professional mistrust") +23. Main challenges: level of uncertainty, lack of policies + structures to collaborate effectively, early development of the NHSE/SNSDE Research Working Group. must PPIE for SNSDE direct to local approach; clarity on goal of SNSDE (eg GP data;imaging;trust level;-omics data) +24. What's working well: role of HDRUK / UK TRE Community + +Roadmap + +1. Vision + strategy +2. Common principles, protocols, assets +3. Expanded communities of practice + knowledge share opportunities -> Consider national, regional, local links +4. PPIE approach + building trust/confidence with the clinical communities + +Workpackages/What would be helpful: + +1. Clear vision/value story on why TRE+SNSDE add/evolves +2. One pager on key protocols, ways of working + frameworks to strengthen consistency of messaging +3. Alignment of related data programmes (eg R+D vs FDP) +4. Community of practice + shared assets/lessons/insights so SDEs build on TRE success to date +5. User (eg.
researcher) assets: needs, goals, decisions, pain points, requirements + +Resources +Consider where expertise sits across: +A. Independent TREs +B. National influences +C. National SDE [RN] +D. SNSDEs +E. Local researchers +F. Common entities/stakeholders in health data space diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ppie.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ppie.md new file mode 100644 index 0000000..2aa8dc8 --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-ppie.md @@ -0,0 +1,44 @@ +# Community-based efforts and collaborations in public involvement and engagement + +## Overview + +### Summary + +This discussion was an informative discussion that mostly gave the opportunity for attendees to ask PPIE experts about their experience so far. + +PEDRI, in which DARE UK is involved, was mentioned as an ongoing reference effort and the group discussed whether or not the community is currently doing enough and the best way to do more. +It is necessary to include PPIE in funding, make efforts to simplify language and ensure an impact loop of public panels (make sure participants learn about what they contributed to). + +### Next steps + +- Establish what groups are already working on this, and find ways to collaborate effectively with them. + +## Raw Notes + +- Introduction to the new stage of PEDRI which DARE UK are involved with +- Question of how public can be involved in researchers questions and advocacy +- Examples of how people can be involved in the process +- How do you attract the right people to be part of the process? +- PEDRI brings right PIE people together to bring the views of public +- Funding needed to recruit people to panels etc +- Question raised around education and how that helps with public involvement +- What is the role of festivals and education and informal education? is the TRE community doing enough? 
+- How difficult is it to make this work understandable at a public level? Think it is possible but are we focusing on that and being creative? +- Impact loop of public panels, make sure we feed back to them the impact this has had +- Should there be PPIE development across all organisations including academics, range of workshops or engaging with public? +- Should we keep to a high level or the technical side of it. Benefits to both types of involvement. Who would decide what is taken to them? +- Who should lead the public panels, if they are led by TREs then do they not have an agenda or drive the conversations +- Challenges of four nation approach when nations sometimes do a specific approach +- How do we manage 'experienced public' should there be terms? Could they have another role? + +### Roadmap plan + +- What resources would be needed (people, time, funds, infrastructure etc.)? + - Funding to recruit members of the public who might not normally get involved. Examples of using Sortition and Coal Rabie and IPSOS + - Utilising that expertise of external recruitment agencies + - Training and support on how to communicate with members of the public for academics and 'technicians' +- How can this community support you in getting them? + - TRE specific PIE groups + - Embedding PIE skills in people's careers +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? 
+ - PEDRI diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-python-r-package-import.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-python-r-package-import.md new file mode 100644 index 0000000..afe9be8 --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-python-r-package-import.md @@ -0,0 +1,50 @@ +# Safety and security of Python and R package import into TREs + +## Overview + +### Summary + +Currently TREs allow access to PyPI and CRAN for less-sensitive data but only specific packages for more sensitive data. +Yet there are a variety of current approaches (some TREs have CRAN access while others do not). +Even with such controls, anything a malicious Python/R package does could still be written by hand inside the environment. +It is challenging to establish the line between R & Python files and AI/ML models. + +Regarding egress, disclosure control is labour intensive, although some automated tools exist to help. + +### Next Steps + +- Collaborate on a shared allowlist/blocklist for packages + +## Raw Notes + +- Current TREs allow access to PyPI and CRAN for less-sensitive data but only specific packages for more sensitive data. +- Different people have different experiences. Some have no access to CRAN others do + - Scottish safe haven - no CRAN access + - Dundee & GM allow full CRAN access +- CRAN have a fairly strict pipeline for adding packages so can be trusted? + - but perhaps just coding standards rather than pen testing, file system access etc. +- If can lockdown egress sufficiently does it matter? + - also need to ensure things like file access, network access etc are prohibited + - can this be done? +- Is there a difference between R & python files, and a large ai/ml model? 
Not sure there's a clear dividing line of things we allow, and things we don't +- R has a system command to allow executing arbitrary code +- If there was a malicious python/R package you could just write it inside the environment - so preventing access to libs makes it harder but not impossible to do bad things. + +### Egress + +- Disclosure control labour intensive +- Some talk of automated tools +- Can prevent accidental disclosure +- What about malicious attempts to extract data e.g. encrypted, embedded in image files, in binary models etc. +- File size potentially helps + - E.g. plausible to extract small amounts of patient data in an encrypted way that passes disclosure control. But unlikely you could do that with 1000s of records + +### Roadmap plan + +- Is it possible to lock down a TRE sufficiently so it is possible to allow unlimited ingress? If so best solution as no friction for researchers. Also allows future ingress items such as LLMs / neural nets etc.. +- If not, then can TREs collaborate to whitelist (and blacklist) packages to prevent each one needing to repeat work. + - Central register / co-ordination + - But what to do about versioning? +- Could have a dual model: + 1. Docker based containerised TREs that are completely locked down meaning that any ingress is allowed + 2. TREs with a list of packages that are allowed, and you need to just use those. Process to request new packages diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-satre.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-satre.md new file mode 100644 index 0000000..4715857 --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-satre.md @@ -0,0 +1,84 @@ +# Future governance of the SATRE Specification + +## Overview + +### Summary + +SATRE funding ending in October but planning to continue work on the specification, the aim is to be community owned but what the governance actually looks like is uncertain. 
+SATRE aims to sit between high-level accreditations (CE+, ISO27001) and the low-level detail of a particular implementation, and to include demonstrations of how TREs are meeting it. + +### Next steps + +- The next steps seem to be socialising the specification and building a peer network. + +## Raw notes + +- Current funding ends at the end of October +- Planning to continue working on the specification + - How best to fund this? + - How to keep the community involved, using, contributing? +- What does the governance look like in this community owned future + - A Foundation (e.g. Mozilla)? + - W3C? +- Will SATRE create a 'standard template' + - Aiming to be between high-level accreditations (CE+, ISO27001) and the low-level detail of a particular implementation + - Example evaluations for two existing TREs + - Demonstration of TREs meeting the standard +- Does SATRE recommend particular tools + - Not specifically, focuses on capabilities that a TRE must provide rather than risking taking divisive positions on particular packages _etc._ + - Future scope for taking modular design elements from TRE implementations and sharing these. Mapping of these elements to SATRE capabilities. +- Does SATRE cover who operates a TRE or what they need to do? + - Roles are defined and used to build requirements +- Expecting community to cross audit each other? Teams may lack resource to audit themselves + - Not a plan at the moment. Auditing could be part of SATRE in the future if there was a need. +- Socialising the output seems important + - Making people aware of SATRE, building familiarity + - Important to do this before the end of SATRE? + - Could be the next phase + - Identify who is engaging with the specification and what they need. _E.g._ + - Help evaluation + - Building a peer network of SATRE 'users' + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? 
+- What resources would be needed (people, time, funds, infrastructure etc.)? +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? + +#### Notes + +- Identify the community and what they need. + - This becomes the targets of the next phase of SATRE. + - Could be + - Peer network + - Auditing/evaluation support +- Organise networks around the pillars + - May help coordinate/focus effort +- Identify contribution mechanism, consensus mechanism +- What would SATRE require to have confidence? + - Part of the HDR UK innovation portal + - Endorsement from highly regarded, trusted bodies, for example, HDR UK, UK SeRP, ADR UK, ... + - Clear mapping, roadmap to ISO27001 + - Clear guidance on roles _including_ expected time and skills for that role holder. Avoid TRE staff being overloaded or given unreasonable tasks + - Too much of an imposition? Too specific? + - Guidance on the economics of TREs + - Build your own + - Buy an off-the-shelf solution + - Cloud vs. On-prem + - People costs +- Identify how to fund staff + - First 'round' was DARE UK + - More resources from funders, _e.g._ HDR UK +- What should a dedicated SATRE person do? 
+ - Promotion + - Stewardship of the standard + - Community manager + - User support/outreach + - Engagement with other communities, _e.g._ SDAP +- Stability of funding + - Research funding is not guaranteed + - Ask for money/people donations from SATRE users + - Fee for formal accreditation against SATRE diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sessions.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sessions.md index 808447a..6400465 100644 --- a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sessions.md +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sessions.md @@ -1,29 +1,25 @@ # Breakout sessions -We are planning several breakout sessions on topics suggested by attendees, with topics chosen by attendees on the day via poll. +```{toctree} +:maxdepth: 1 +:hidden: true -## Suggested topics - -Some topic suggestions have already been submitted: - -- Multi-TRE analysis: challenges, governance requirements, federation. -- Current state of the art re data linkage/federation/AI&ML&LLM across infrastructures: federation, governance, safe output methods -- TRE sustainability and operations -- Safety and security of Python and R package import into TREs -- Community-based efforts and collaborations in public involvement and engagement -- Cloud vs on-prem TREs: costs, constraints, pros & cons -- The role of independent TRE providers in relation to the NHS national and regional SDEs -- The role of TRE standards for the TRE community -- Sight unseen: how far can we go with keeping data hidden from users? -- Addressing data harmonisation between different datasets: do TREs have a role? -- Access and rights metadata to support process choreography and interoperability (e.g. 
generic researcher applications and authorization) - -Additional topics can be suggested via [this form](https://docs.google.com/forms/d/e/1FAIpQLSes2N38KjElxgfXXDUEVBeQ5g8qvneQS9Gr4alZki5s_4RRew/viewform?usp=sf_link), or during the day itself. +breakout-ai-ml-llm +breakout-cloud-onprem +breakout-community-governance +breakout-data-harmonisation +breakout-multi-tre-analysis +breakout-nhs-sdes +breakout-ppie +breakout-python-r-package-import +breakout-satre +breakout-sight-unseen +breakout-standardisation-tensions +breakout-tre-sustainability-and-operations +``` ## Breakout session format -Each breakout session will be one hour long, with multiple parallel topics decided by a vote prior to the session: - ```{list-table} :widths: 100 300 :header-rows: 0 @@ -42,8 +38,35 @@ Each breakout session will be one hour long, with multiple parallel topics decid - Wrap up: Breakout groups present their next steps ``` -We will attempt to cover as many topics as possible, and attendees can choose which breakout session to go to. -Popular topics will be covered in both sessions, giving participants the options of going much more in-depth. +## Breakout room notes + +Each breakout session had 12 breakout discussions. +In the notes below you will find a link to both a summary of the session and the main suggested next steps, and the raw notes taken on the day. + +**Important note**: The summaries and raw notes are simply what was discussed on the day, and in no way represent any explicit views of priorities of the UK TRE Community. 
+ +:1: [](breakout-multi-tre-analysis.md) +:2+3: [](./breakout-ai-ml-llm.md) +:4: [](./breakout-tre-sustainability-and-operations.md) +:5: [](./breakout-python-r-package-import.md) +:6+7: [](./breakout-ppie.md) +:8: [](./breakout-cloud-onprem.md) +:9: [](./breakout-nhs-sdes.md) +:10: [](./breakout-satre.md) +:11: [](./breakout-standardisation-tensions.md) +:12: General discussion - No notes + +## Breakout Rooms Session 2 -The aim of these breakouts is to discuss the topic area, and explore how the community could take work forwards from the discussion. -At the end of this session we would like each group to summarise what next steps would look like: what they want to achieve, what resources would be needed, and how they could go about it. +:1: General discussion - no notes +:2: [](./breakout-data-harmonisation.md) +:3: [](./breakout-sight-unseen.md) +:4: [](./breakout-tre-sustainability-and-operations.md) +:5: [](./breakout-python-r-package-import.md) +:6: [](./breakout-cloud-onprem.md) +:7: [](./breakout-community-governance.md) +:8: General discussion - no notes +:9: [](./breakout-nhs-sdes.md) +:10: General discussion - no notes +:11: [](./breakout-standardisation-tensions.md) +:12: General discussion - no notes diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sight-unseen.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sight-unseen.md new file mode 100644 index 0000000..b82b090 --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-sight-unseen.md @@ -0,0 +1,46 @@ +# Sight unseen: how far can we go with keeping data hidden from users? + +## Overview + +### Summary + +This is the model of [OpenSAFELY](https://www.opensafely.org/). +Questions explored were how to ensure that the provided metadata is sufficient, how to extend the approach to more complex data (highly relational/linked databases) and the implied need of code review before running on actual data. 
+ +In summary this can be done but there are limitations. + +## Raw notes + +- What are the advantages and disadvantages of hiding data from users? +- How do we minimise barriers and frustration when working with unseen data? +- Pros and cons of hiding data. Is it even worth it? +- Challenge with interpreting the question - is this about restricting just identifiable information? +- In what scenario would it be beneficial to keep data hidden? + +- Federated analytics - [OpenSAFELY](https://www.opensafely.org) model. Allows you to see data that is structured the same as the original but filled with random (synthesised?) data. + - Can we provide sufficient metadata to allow for unclean or missing data? + - Additional challenge with more complex data (highly relational/linked databases) + - There is a need for code review before running on the original data +- Whose responsibility is it to create the metadata and do the cleaning? The data provider? The TRE (probably not)? +- On the question of how far we can take this: + - It can be possible, but there are limitations. Including reducing the chance of the results. +- Pros of hiding data: + - increase trust in research + - potential for higher quality research (no p-hacking, more hypothesis testing, less data mining, etc) +- There are some doubts about the value/need for this. Aren't TREs with anonymised data enough? + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? +- What resources would be needed (people, time, funds, infrastructure etc.)? +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? 
+ +#### Notes + +- Something along the lines of the OpenSAFELY model could work +- Requires trust in the data providers and researchers +- Limitations of types of data and types of analyses +- Resources required: people to do the code review step diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-standardisation-tensions.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-standardisation-tensions.md new file mode 100644 index 0000000..1cdc95a --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-standardisation-tensions.md @@ -0,0 +1,84 @@ +# What are the detractors to standardisation and working together? How can the reasons for the tension work alongside standardisation? + +## Overview + +### Summary + +Everyone loves a standard as long as it’s theirs: because implementing standards = real work on operational systems, everyone has an interest in keeping the standard as close as possible to what they already have. +Rationalising these different approaches needs space and time to coordinate joint R&D activity that is separate from, but connected to, actual operations. + +We also need to recognise that coordinating across multiple TRE providers will (a) be slower than operational timescales, and (b) need dedicated people who are not trying to run ops at the same time. + +Currently, R&D is typically funded by competitive grants to “innovate”, and operational expenses are often top-sliced from these grants. +Separating operational funding from innovative R&D grants is one thing that would help. + +So: separating ops teams from R&D teams in both people and funding terms is the biggest single help. 
+ +### Next Steps + +- Funding model evolution +- Speed date - understand who/can collaborate with for mutual benefit +- Value of this activity seen by organisation / funders +- + +## Raw notes + +- Can't see the wood for the TREs :-) +- Changing in flight - how to manage updates to existing services / platforms in use +- No ability to translate research into applied use of TRE things. Devops models and supporting +- Data controllers' risk appetite and policies needing alignment against something common +- Consensual management for the person : How does the citizen become involved in the TRE across TREs. +- Inertia and reluctance to change their worlds towards a harmonised picture - Budgets and funding to do plus stick. +- Purpose is lost as to what a TRE SDE etc is. +- No clear definitions of these and separations of their function and facilities. +- Operational funding - top slicing / core funding that's not funded to innovation +- Funding !!! +- Academic model. +- Knowledge about who to collaborate with / capabilities / specialisms +- Job Role recognition, value of contribution to whole machine +- Governance is the biggest effort +- UK Wide program to define governance framework. +- Innovation must still be supported + +### Handwritten notes on day + +Transcribed by CMWG + +- Incompatible standards? + - SDE/TRE/Safe Haven + - Organic growth + - Legacy debt (inertia) +- Clarity of language +- Follows the funding +- Capacity for change management + - Service environment often R&D (run it and improve it) +- Bridge: research prototype -> 'Product' (i.e. TRL 1 -> 9) +- Risk definition & tiering -> No standard? + + - DEA? 
+ - Legislate it + +- Lots of standards arise for good reasons + - had to exist, so grew in isolation +- Have lost sight (perhaps) of the "why" are we doing this +- Existing inertia (changing engines when plane is in flight) +- Ops is funded one way, R&D funded by very diff methods, there is no clear bridge from one to the others +- Differing risk appetites from DCs, often for "poor" reasons + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? +- What resources would be needed (people, time, funds, infrastructure etc.)? +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? + +#### Summary + +- Everyone loves a standard as long as it's theirs: because implementing standards = real work on operational systems, everyone has an interest in keeping the standard as close as possible to what they already have. +- Rationalising these different approaches needs space and time to coordinate joint R&D activity that is separate from, but connected to, actual operations. + - We also need to recognise that coordinating across multiple TRE providers will (a) be slower than operational timescales, and (b) need dedicated people who are not trying to run ops at the same time. +- Currently, R&D is typically funded by competitive grants to "innovate", and operational expenses are often top-sliced from these grants. + - Separating operational funding from innovative R&D grants is one thing that would help. +- So: separating ops teams from R&D teams in both people and funding terms is the biggest single help. 
diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-tre-sustainability-and-operations.md b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-tre-sustainability-and-operations.md new file mode 100644 index 0000000..52c875f --- /dev/null +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/breakout-tre-sustainability-and-operations.md @@ -0,0 +1,253 @@ +# TRE sustainability and operations + +## Overview + +### Summary + +Sustainability needs to be long term, but how do you plan for it when the scenario may change in 5 years? +There is also an issue with research, this is a service yet funding requires teams to appear to be doing something new each time, and funders often prefer not to pay for infrastructure (also challenges with cost estimates and under/over expenditures). + +There are several variables and questions about whether they should be free at point of use (distributing against overheads), or whether to employ a membership user model, a project fee model, standard features being free but charging for high demanding ones or something else. +In all cases at least some core funding is required to ensure continuity, specialisation and quality. + +What we want to ensure is that a public service exists. + +### Next Steps + +- Create a roadmap that focuses on: + - Technical skillsets + - Information governance requirements + - 10 year funding plan + +## Raw notes + +Sustainability from funding perspective beyond the initial 5 years + +- But what are things going to look like in 5 years time + +CL centrally funded model + +- Service in place, refreshed but need to appear to do something different each time to secure funding. + +**Why different?** + +- How costing then? Free at point of use, cost distributed against overheads. +- Constrain in the cloud? 
+ +Barts recover work space costs from research projects, distributed central cost on a membership/license/user model + +- Difference between model for internal and external users. + +Standard provision free, high storage/compute needs to be recovered + +- More paperwork to create and chase invoices. + +no funders like paying for infrastructure + +What counts as core if it was funded? + +- Duties imposed as data controllers law, or interpretation runs counter to wants of researchers + +Folk specialising, if it doesn't get funded for the future that capability is lost. + +Regional SDE model might lead the way of costing-funding-recovery + +Some central funding + +Specialist areas - operational team + +- Different environments work differently from researcher perspective + +Sustain people + +Business and operations to use OS TRE safely and securely + +what is the perfect TRE/SDE environment future consolidation + +Software development can be amortised across the community + +SERP tenant + +Training component + +Who provides desk-side support + +Tracking usage, egress process, layers of tools and processes that need to be in place + +In/out nature of TRE, tiered sensitivity? Commercial sensitivity. Has auditability in the TRE, does it need to be? + +- Why different for UCL TRE? + +Difference in TRE makes funding case easier, adding something new made it more interesting. + +Using research funding to backfill + +Estimate in advance what project is likely to use, operational costs, usually completely wrong and go over project + +- Not sustainable to go consistently over budget +- Bill after usage is best, but challenging for proposal/funding + +Cliff edge, have funding but only sufficient for 1 year not 3 years of project. 
+ +Following Access to HPC model + +What can you take off the board if problem is solved strategically + +- Good training for Data scientists: SC like training relevant to disciplines + +Seems like we're trying to boil the ocean + +- VDI, Excel may be R, Stata +- Developing things to deal with core use case + +Core capabilities, exceptional stuff is great, but majority, early stage users, standardise and simplify. + +Whatever it is, what's missing the ability to understand data. GIGO + +Standardisation of data makes it seem simpler than it is, reproducibility? + +AI/ML store data for XX years, is it readable in that time? + +Who picks up the storage costs for the data. + +Guidance + +How can we make it more transparent + +Constrained with the current model. + +Guidance provided by RCs, institutional risk as the org have underwritten the project. + +This breakout room continued during the second round + +Concerned about being able to provide a service, don't control budgets + +- Sustainability of providing a public service, rather than generating a business case + +SNSDE comes under DH budgets, makes things easier + +HDRUK MRC led 20 year vision 5 year cycle + +- UKBB core underpinning funding +- Fund TREs for 3-5 years for specific projects +- Specific use cases not currently supported +- Individual researchers and work with them and the RO. + +- Free at the point of use funding? +- Provide underpinning capacity? + +What is ONS Model? + +- Free at point of access +- Don't know how the budget is secured +- Funding comes through different sources ADR UK +- Research proposal, existing staff funding or contracted. 
+- For commercial and public researchers usage has to be for public good, commit to publishing and not for profit +- Virtual machines provided some policy for standardising storage/compute available +- Trying to enable research + +Driven by what researchers ask for + +- Intrinsic limit on budget call +- Budget for a specific network/platform +- Leverage external investment +- Some Pharma match funding +- Universities also fund + +Move to long term funding + +- Strategic level of funding, buffered from long-term budget +- Hub large funding but cliff-edged + +Free at the point of use + +- Incentivised-disincentivised, equity of access +- Power users can over-consume, less accountability not having to justify use + +consuming data token publication and harvesting data for private use + +- Free at point of access so data is freely accessible +- Reminder: Don't offer data for commercial use + +Challenges: + +- Ingress-egress labour intensive to pour human eyes +- Automation tools for validating statistical disclosure test +- Skilled job +- Tools and more people-more efficient tools; more people would always be good. +- All TREs have these issues, share the solutions + +More automation - IDS (Integrated Data Service), SRS (Secure Research Service) + +- Free at point of use?? Cuts out some of the applications automated validation of inputs + +Understand the whole pathway + +- Fix one part and it just shows the next bottleneck +- Fraunhofer 1/3-1/3-1/3 lights_on-academic-commercial_activity +- Sustainability, prime an initiative without committing to long term investment + +More people - more monkeys on typewriters + +Over focus on the medical use case currently, needs to rebalance. + +Better understanding and economy of scale from small numbers. + +- Focus critical mass on small number +- DARE UK would create a TRE to handle data as an offering + +What is a TRE? + +- At what point does a federated TRE network become a single TRE? 
+- TT: At the point at which you have seamless transition between TREs? + +Trust that the analysis/code is running as intended? + +### Roadmap plan + +#### Questions + +- What would a solution to this problem look like? +- What resources would be needed (people, time, funds, infrastructure etc.)? +- How can this community support you in getting them? +- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively? + +#### Roadmap: + +A roadmap should address + +- Technical knowledge, skills, TRE staff skillsets + + - Why doing this has to be part of retaining people + - Localising staff makes this easier, central models push more to thinking about pay + - To address retention + - Pipeline of talent + - Can TRE model work in R + +- Not just technical, IG, where can I get more information + + - Consultancy + - Embedded technical/operational/IG knowledge relevant to the problem. + - Research - teaching balance. + +- Funding + + - Lots of politics, in HPC communities, good for those who get it. Not good for those who have to resort to begging + - Not necessarily good for SDE + - Analysis will follow data + - People with data will need to bolt compute + - HPC allocation modelled SDE account for compute/storage costs + - Why should SDE and HPC be considered differently + + 10 year plan - scope for accreditation + + - Chartered research infrastructure? + - CSP platform neutral certifications for Data/Cloud + +Infrastructure sustainability + +People: + +- Infrastructure/Developers +- Operations +- Data Scientists diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/index.md b/docs/events/wg_workshops/2023-09-04-september-meeting/index.md index 751f1ce..b4baaa1 100644 --- a/docs/events/wg_workshops/2023-09-04-september-meeting/index.md +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/index.md @@ -1,7 +1,5 @@ # UK TRE Community September 2023 Meeting -## Theme: What is stopping TREs from working together? 
- ```{toctree} :maxdepth: 0 :hidden: true @@ -14,33 +12,36 @@ _The biggest meetup of the UK TRE community to date with presentations, breakout :Date: Monday 4th September 2023 10:00 - 17:30 :Location: [Swansea University Bay Campus](https://www.swansea.ac.uk/the-university/location/#bay-campus=is-expanded), Fabian Way, Skewen, SA1 8EN. -Morning session will also be online. -:Registration: https://www.eventbrite.com/e/uk-tre-community-september-meeting-tickets-676066472017 :Cost: Free to attend +:Theme: What is stopping TREs from working together? +:Recordings: https://www.youtube.com/channel/UCd7AZjyH33aCmIojGHMfqEg -### Background - -The UK TRE community is thrilled to announce what is hoped to be the UK's largest in-person meeting on Trusted Research Environments (TREs) to be held in Swansea on Monday 4th September 2023. -The day will bring together stakeholders from all sectors and disciplines to discuss the current state of TREs, look at existing barriers, and define the future provision of TREs in the UK. +## Background -The UK TRE Community is a community that has grown organically over the last year for anyone interested in TREs, including researchers, operators, information governors and managers and more, from all sectors and disciplines. +The UK TRE Community is a community that has grown organically since September 2022 for anyone interested in TREs, including researchers, operators, information governors and managers and more, from all sectors and disciplines. The core aims of fostering collaboration and sharing of innovative ideas to support the delivery of groundbreaking research with sensitive data have resonated across the UK and beyond. 
-### Meeting information +## Meeting information + +The theme of the meeting was how TREs can work together through the development of open standards, codebases and federation to enable sharing of technology and workflows, and encompassed a range of topics including the development of technical and governance standards for TREs, the major challenges of operating TREs today, and how the community can work together to address these and other common challenges. -The meeting will focus on how TREs can work together through the development of open standards, codebases and federation to enable sharing of technology and workflows, and will encompass a range of topics including the development of technical and governance standards for TREs, the major challenges of operating TREs today, and how the community can work together to address these and other common challenges. +The meeting was open to everyone, and was attended by a range of stakeholders including those involved in the day-to-day development and operations of TREs, those responsible for commissioning and funding TREs, and users of TREs from the health, administrative and industry sectors. -The meeting is open to everyone, and will be attended by a range of stakeholders including those involved in the day-to-day development and operations of TREs, those responsible for commissioning and funding TREs, and users of TREs from the health, administrative and industry sectors. +## Agenda -### Agenda +The day featured lightning talks from cross-sector research industry teams who are at the forefront of testing innovative ideas for TREs, and a keynote presentation on HDR UK’s plans for the tech eco-system. -The day will feature lightning talks from cross-sector research industry teams who are at the forefront of testing innovative ideas for TREs, and a keynote presentation on HDR UK’s plans for the tech eco-system. 
+A key part of the day was breakout sessions and discussions, which took advantage of the mix of sectors and stakeholders to learn from each other, evaluate the work already done, and define goals for the community and the future provision of TREs in the UK and beyond over the next few years. -A key part of the day will be breakout sessions and discussions where we will take advantage of the mix of sectors and stakeholders to learn from each other, evaluate the work already done, and define goals for the community and the future provision of TREs in the UK and beyond over the next few years. +This meeting was free to attend, thanks to the sponsorship by [The Alan Turing Institute](https://www.turing.ac.uk/), [HDR UK](https://www.hdruk.ac.uk/) and [DARE UK](https://dareuk.org.uk/). +The morning talks were broadcast online for those that could not attend in person, but all afternoon sessions including breakout discussions were in-person only. -This meeting is open to everyone and free to attend, thanks to the sponsorship by The Alan Turing Institute, HDR UK and DARE UK. The morning talks will also be broadcast online for those that cannot attend in person, but all afternoon sessions including breakout discussions will be in-person only. +This report summarises the sessions of the day, as laid out in the schedule below. -#### Schedule +Recordings of the morning session, including the Welcome, Keynote, and Lightning Talks, are [available on YouTube](https://www.youtube.com/channel/UCd7AZjyH33aCmIojGHMfqEg). +More detail of the breakout sessions can be found in their respective notes from the day, which are linked in each summary below. + +### Schedule ```{list-table} :widths: 100 300 @@ -65,68 +66,118 @@ This meeting is open to everyone and free to attend, thanks to the sponsorship b ``` -## Notes from the day +## Notes + +### Welcome and Introduction + +Hari Sood (The Alan Turing Institute) introduced the day. 
+ +The intro focused on the four broad focus areas of the day: + +- What's stopping TREs from working together? +- What's actually happening in the TRE space? +- What can TRE teams do together? +- What support can the UK TRE Community provide to help TREs work together? + +It was highlighted that two great community outputs from the day would be: + +1. A **landscape** of the current organisations, communities and teams in the TRE space in the UK. +2. A **roadmap** of projects, work and tasks community members could undertake in the TRE space, including timeframes, required resources and workstream plans. + +Balint Stewart (DARE UK) also introduced the concept of [DARE UK Community Groups](https://dareuk.org.uk/our-work/dare-uk-community-groups/), and relevant funding opportunities. -- Key questions - - What's stopping TREs from working together? - - What's actuallu happenign in the TRE space? - - What can TRE teams do together? - - What support can the UK TRE Community provide to help TREs work together? - - (What's missing?) +The recording of the Welcome and Introduction can be found at the beginning of the [Keynote presentation recording](https://www.youtube.com/watch?v=ipfU4FdjYGM). ### Keynote -Can convening a technology ecosystem help TREs to work together? by Emily Jefferson (CTO, HDR UK) - -- Goal to make it easier for accredited researchers to access data across TREs. -- Four pillars: - - Technology services ecosystem - - Trust and transparency - - Usable data - - Capacity building (RSEs a key aspect of this) -- Driver programmes focussed on answering research questions but with a brief to do so in a way that contributes to the collective pool of technology and approaches -- Common challenges / pipeline elements across TREs / projects, but little re-use / shared use. - -- Convening a technology ecosystem - - Community, People, Beyond Health, Standards for Interoperability, Science of Infrastructure - - How can HDR UK support this? 
- - HDR's Technician Commitment - what's a better name? - - Recognising non-traditional careers including RSEs - - Supporting a wide range of impact - not just academic publications - - Technical solutions but for health need to work for non-health, too. People who work with sensitive data - health or not -> DARE UK - - Help us develop standards - OMOP, but what else? - - Innovative infrastructure which is making an important contribution - recognising impact. - - Technological Solutions - - Solutions from across the community - - HDR UK will also contribute some technological solutions supporting the ecosystem - all being open sourced. Not about mandating, but about having something available for all to use / build on. - - "The Gateway" to signpost and connect things across the community and include some common core elements. - - e.g. Metadata dictionary, Cohort discovery, Data discovery, Data use register, Data use requests - - Mk2 Gateway: Brandable by others, interoperable, modular, community developed, re-use existing solutions / components, driven by use case exemplars, open APIs, automated, scalable - -## Breakout Rooms Session 1 - -:1: Multi-TRE analysis: challenges, governance requirements, federation -:2+3: Current state of the art re data linkage/federation/AI&ML&LLM across infrastructures: federation, governance, safe output methods -:4: TRE sustainability and operations -:5: Safety and security of Python and R package import into TREs -:6+7: Community-based efforts and collaborations in public involvement and engagement -:8: Cloud vs on-prem TREs: costs, constraints, pros & cons -:9: The role of independent TRE providers in relation to the NHS national and regional SDEs -:10: Future governance of the SATRE Specification -:11: What are the detractors to standardisation and Supporting evolution of platforms / services -:12: General discussion - -## Breakout Rooms Session 2 - -:1: Access and rights metadata to support process choreography and 
interoperability (e.g. generic researcher applications and authorization) -:2: Addressing data harmonisation between different datasets: do TREs have a role? -:3: Sight unseen: how far can we go with keeping data hidden from users -:4: TRE sustainability and operations -:5: Safety and security of Python and R package import into TREs -:6: Cloud vs on-prem TREs: costs, constraints, pros & cons -:7: Governance of the UK TRE Community -:8: Accrediting people as well as environments… will developing "trusted researchers" allow us to do better/more/quicker data science? -:9: The role of independent TRE providers in relation to the NHS national and regional SDEs -:10: General discussion -:11: What are the detractors to standardisation and working together? How can the reasons for the tension work alongside standardisation? -:12: General discussion +:Title: Can convening a technology ecosystem help TREs to work together? +:Speaker: [Emily Jefferson (CTO, HDR UK)](https://www.hdruk.ac.uk/people/emily-jefferson/) +:Recording: https://www.youtube.com/watch?v=ipfU4FdjYGM&t=545s + +#### Introduction + +Emily's talk started by introducing HDR UK, and their primary goal of accelerating the researcher journey (finding data, accessing data, linking data, curating data, analysing data, creating insights and improving health) towards trustworthy use of data for public benefit. + +This is driven by the problem of trying to scale research projects to make it easy for accredited researchers to access and work with data from a range of sources, to enable studies with millions of people. + +Their approach has 3 main strands: + +1. Accelerating trustworthy data use +2. Empowering researchers +3. Promoting partnerships + +#### Technology + +The stages of a researcher journey were explored in more detail, focusing on: + +- Data Discovery +- Data Access +- Data Environment +- Data Analysis + +And how many different people have built many solutions across this journey. 
+In a lot of instances these solutions are quite different, meaning researchers have to use new processes, tools and methods when they go to different TREs. + +HDR is aiming to convene the technology ecosystem that we have in the UK. This focused on the aspects of: + +- Community +- People +- Solutions beyond health +- Standards for interoperability +- Science of infrastructure +- Technological solutions (driven by HDR UK and co-created with the community) + +HDR UK has a 5 year window to focus on many technological components, with a team of over 60 distributed across HDR UK and partner academic institutions. + +Emily then focused on the HDR Gateway - a way to point to different technical components across the TRE landscape. +Gateway (Mk1) has: + +- A Cohort Discovery Tool +- A Metadata Data Discovery Tool +- Data Access Request Form +- Data Use Register + +and more. + +Gateway (Mk2) will be much more modular, so the community can build it as an open-source solution. + +Emily highlighted some of the work already happening in the HDR UK landscape, including: + +- Federated analytics +- Phenotype library +- Prognostic Atlas +- BHF Data Science Centre +- Driver projects +- Hubs creating specialist resources + +The talk finished with a call to action to: + +- Talk to HDR UK about what they should include in this landscape, and work together with them on co-creating solutions. + +The session ended with a short Q&A, which can be found on the [recording of the session](https://www.youtube.com/watch?v=ipfU4FdjYGM&t=2686s). + +### Lightning Talks + +There were 26 lightning talks given on the day from across the community. + +To ensure everyone had a chance to present we randomly split the talks and the audience into two rooms to ensure as much mixing of ideas as possible.
+All talks were recorded: + +- Room 1: https://www.youtube.com/watch?v=bn2ebeH4O6I +- Room 2: https://www.youtube.com/watch?v=tzW36NXvmsA + +All talks were a maximum of **5 minutes long**, with a focus on sharing knowledge and discovering opportunities for collaboration. + +For detail on the lightning talks, please visit the [lightning talks page](./lightning-talks.md). + +### Breakout sessions + +Breakout sessions were held on topics suggested by attendees, with specific topics to discuss chosen by attendees on the day via poll. + +Attendees could choose which breakout session to go to. +Popular topics were covered in both sessions, giving participants the option of going much more in-depth. + +The aim of these breakouts was to discuss the topic area, and explore how the community could take work forwards from the discussion. +At the end of this session each group summarised what next steps would look like: what they wanted to achieve, what resources would be needed, and how they could go about it. + +For detail and text summaries of the breakout sessions, please visit the [breakout sessions page](./breakout-sessions.md). diff --git a/docs/events/wg_workshops/2023-09-04-september-meeting/lightning-talks.md b/docs/events/wg_workshops/2023-09-04-september-meeting/lightning-talks.md index 27f88d8..4453469 100644 --- a/docs/events/wg_workshops/2023-09-04-september-meeting/lightning-talks.md +++ b/docs/events/wg_workshops/2023-09-04-september-meeting/lightning-talks.md @@ -1,39 +1,44 @@ # Lightning talks -26 people have submitted a lightning talk! -Unfortunately there isn't time for everyone to present in a single session, so to ensure everyone gets chance to present we will randomly split the talks and the audience into two rooms to ensure as much mixing of ideas as possible. -All talks will be recorded and made available online after the event, along with contact details for the speakers. 
- -All talks will be a maximum of **5 minutes long**, with a focus on sharing knowledge and discovering opportunities for collaboration - -## List of talks (ordered alphabetically by presenter name) - -| Presenter | Institution | Topic | -| -------------------------------- | ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Alexander Roberts | Swansea University | Implementing a TRE at Swansea - challenges and thoughts | -| Arlene Casey | DataLoch, University of Edinburgh | SARA (DARE lightning talk) | -| Carole Goble | The University of Manchester; ELIXIR-UK | The TRE-FX DARE sprint project on using ROCrates and workflows for federated analytics across TREs | -| Chris Appleton | AIMES | The use of Trusted Research Environments and different use cases, challenges and solutions to data ingest, storage and analysis | -| Chris Cole | Health Informatics Centre, University of Dundee | The DARE Driver Project on Standardised Architecture for Trusted Research Environments (SATRE) | -| Danny Silk | The Public Service Consultants (The PSC) | Three Principles to Drive Value from Sub-National Secure Data Environments (SNSDEs) | -| David Meredith | UKRI | Gathering TRE requirements with DARE-UK and our experiences with SMDH | -| Emma Squires | Dementias Platform UK | Supporting neuroimaging research within a Trusted Research Environment | -| Fatemeh Torabi | Swansea University | Multi-TRE analysis: challenges, governance requirements, federation | -| Harry Hamilton-Jennings | VWV | Commercialising technology developed through TRE data access | -| Helen Cadwallader | UK Data Service and UK Data Archive | The Safe Data Access Professionals (SDAP) network | -| Ibrahim Farah | Our Future Health | How Our Future Health built its first TRE | -| Ifeanyi Chukwu | University of 
Leeds | How we manage LASER - Leeds Analytics Secure Environment for Research, the TRE at Leeds Institute for Data Analytics, University of Leeds | -| James Robinson | The Alan Turing Institute | The Turing Institute Data Safe Haven | -| Jim Smith | University of the West of England | The SACRO project and how it can help TREs work together | -| Jonathan Tedds & Peter Maccallum | ELIXIR | A European Network for Trusted Research Environments | -| Katherine O'Sullivan | University of Aberdeen | The value of federated TREs: The Scottish Safe Haven Network | -| Oriol Canela-Xandri | University of Edinburgh; Omecu Ltd. | A current working prototype for truly federated data where a researcher can run, in real time, analysis on huge datasets (e.g. UKBiobank or larger) but without exposing the data at all to the researcher. | -| Pete Arnold | Swansea University | Training & Capacity Building for TREs | -| Peter Barnsley | Francis Crick Institute | Link Project TREs to Data SDEs - federate the data not the TRE project | -| Raymond Hounon & Rujuta Sanap | Google | TREs Best practices and Google Cloud capabilities | -| Rob Baxter | DARE UK | The current state of the DARE UK TRE federation architecture | -| Rob Heath | Microsoft | Microsoft's Trusted Research Environment capabilities | -| Seb Bacon | Bennett Institute | Using magic to find bugs in your TRE code (a.k.a. property-based testing for dummies) | -| Simon Thompson | Swansea University | DARE Teleport – Federated data to support team science | -| Susan Krueger | University of Dundee | The research question of: how to apply disclosure controls to trained machine learning models | -| Vivek Iyer | Wellcome Sanger Institute | The Genes and Health TRE in GCP | +:Format: 5 minute talks by members of the community on their work, and a call to action. +:Structure: Talks were split across 2 rooms, with 13 lightning talks in each room. There was no time allocated for Q&A. 
+ +## List of talks + +### Room 1 + +:Recording: [Room 1 Lightning Talks recording](https://www.youtube.com/watch?v=bn2ebeH4O6I) + +| Name | Organisation | Talk | +| ----------------------------- | ----------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Alexander Roberts | Swansea University | Implementing a TRE at Swansea - challenges and thoughts | +| Ifeanyi Chukwu | University of Leeds | How we manage LASER - Leeds Analytics Secure Environment for Research, the TRE at Leeds Institute for Data Analytics, University of Leeds | +| Arlene Casey | DataLoch, University of Edinburgh | SARA (DARE lightning talk) | +| Chris Cole | Health Informatics Centre, University of Dundee | The DARE Driver Project on Standardised Architecture for Trusted Research Environments (SATRE) | +| Simon Thompson | Swansea University | DARE Teleport – Federated data to support team science | +| Katherine O’Sullivan | University of Aberdeen | The value of federated TREs: The Scottish Safe Haven Network | +| Peter Barnsley | Francis Crick Institute | Link Project TREs to Data SDEs - federate the data not the TRE project | +| Susan Krueger | University of Dundee | The research question of: how to apply disclosure controls to trained machine learning models | +| Emma Squires | Dementias Platform UK | Supporting neuroimaging research within a Trusted Research Environment | +| Seb Bacon | Bennett Institute | Using magic to find bugs in your TRE code (a.k.a. property-based testing for dummies) | +| Harry Hamilton-Jennings | VWV | Commercialising technology developed through TRE data access | +| Oriol Canela-Xandri | University of Edinburgh; Omecu Ltd. | A current working prototype for truly federated data where a researcher can run, in real time, analysis on huge datasets (e.g. 
UKBiobank or larger) but without exposing the data at all to the researcher | +| Raymond Hounon & Rujuta Sanap | Google | TREs Best practices and Google Cloud capabilities | + +### Room 2 + +:Recording: [Room 2 Lightning Talks recording](https://www.youtube.com/watch?v=tzW36NXvmsA) + +| Name | Organisation | Talk | +| --------------- | ---------------------------------------- | -------------------------------------------------------------------------------------------------- | +| David Meredith | UKRI | Gathering TRE requirements with DARE-UK and our experiences with SMDH | +| Peter Maccallum | ELIXIR | A European Network for Trusted Research Environments | +| Rob Baxter | DARE UK | The current state of the DARE UK TRE federation architecture | +| Pete Arnold | Swansea University | Training & Capacity Building for TREs | +| Fatemeh Torabi | Swansea University | Multi-TRE analysis: challenges, governance requirements, federation | +| Jim Smith | University of the West of England | The SACRO project and how it can help TREs work together | +| Carole Goble | The University of Manchester; ELIXIR-UK | The TRE-FX DARE sprint project on using ROCrates and workflows for federated analytics across TREs | +| Rob Heath | Microsoft | Microsoft’s Trusted Research Environment capabilities | +| James Robinson | The Alan Turing Institute | The Turing Institute Data Safe Haven | +| Ibrahim Farah | Our Future Health | How Our Future Health built its first TRE | +| Danny Silk | The Public Service Consultants (The PSC) | Three Principles to Drive Value from Sub-National Secure Data Environments (SNSDEs) | diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-local-cloud-hosting.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-local-cloud-hosting.md new file mode 100644 index 0000000..c87e066 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-local-cloud-hosting.md @@ -0,0 +1,32 @@ +# Hosting TRE/SDEs locally 
vs cloud + +**Chair**: Adam Marsh (Optimum Patient Care) + +## Prompts + +- What is the split (if any) between TRE/SDEs being hosted locally vs cloud? +- What are the merits of either option? +- [assuming there are participants who have hosted locally] Are there any challenges or tips for building and hosting locally? + +## Summary + +Conversation centred around cost implications of local vs cloud TREs. + +Possible approaches were discussed, including pan-cloud providers like Snowflake and hybrid approaches combining on-prem and cloud components. + +Next steps included using the UK TRE community to standardise investments in the space to reduce costs for TRE provision. + +## Raw notes + +- We all have the same cost management problems. All having to use limited budgets for balancing storage and compute and operations costs. But why does this infrastructure not exist nationally with national investment to maintain a single architecture for use? We should lobby up as a community to influence policy and policy makers. +- Separating SDE and TREs allows storage costs to be managed separately from the dynamics of compute costs. Having the flexibility to allow TREs for projects to pay for the compute is advantageous. +- Suggestions to look at using Snowflake or Data fabric to facilitate this +- Bringing compute to data (by allowing the project TREs to see the data in the SDE) allows the balance of forecasting to sit with the funded party (the research project via its funding) +- Forecasting costs as well as Operating costs are key. All plant will need refreshing every few years and this is a HUGE investment case. Scaling exponentially. Better handled by a national infrastructure provider. +- Decide on costs before choice of provider. Lock in to a single cloud is inevitable. Pan-Cloud providers (like Snowflake) offer a solution.
+- Hybrid working with onprem and cloud may be a good balance, and is influenced by the exact use cases of the organisation (and its clients) +- An additional factor worth considering is flexibility to keep up with developments across the industry (i.e. what is the new shiny thing and being able to facilitate/provide access) + +### Next steps + +- UK-TRE (or equivalent) should be lobbying 'up' to get standardised investment with the aim of reducing costs across the sector (not splitting funding across numerous initiatives) diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-multi-tre.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-multi-tre.md new file mode 100644 index 0000000..7a60aa5 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-multi-tre.md @@ -0,0 +1,65 @@ +# Multi-TRE analysis: can a common governance model break data silos? + +**Chair**: Fatemeh Torabi (Swansea University) + +## Prompts + +Work from Swansea University has shown that major UK TREs are operating on different governance models ([Torabi et al. 2023](https://ijpds.org/article/view/2164)). +This breakout will discuss whether and how a common governance model can be achieved. + +- What percentage of the problem of data silos will be solved by a common governance model? +- What can be incorporated in a common governance model, and what can't? +- How do governance models need to be updated to work for a multi-TRE network, rather than just individual TREs? + +## Summary + +The [Swansea paper](https://ijpds.org/article/view/2164) was referenced, highlighting how it found that the problem of multi-TRE analysis can be addressed. + +The question was then asked - how do you get there, in a trustworthy and safe way? + +It was discussed whether an entirely new framework was needed, or whether existing arrangements can be adapted, and whether solutions should exist within TREs, or new TREs should be built.
+ +Many next steps were discussed, with a special focus on lobbying as a community to government, data providers and policy makers to move towards a more common, aligned approach to data provision and research. + +## Raw notes + +- Variation in data governance arrangements results in silos, a barrier to federated analysis across data held across multiple TREs +- Review of TREs involved in federated collaborations (?) and their governance arrangement models: https://ijpds.org/article/view/2164 +- Conclusions of the paper: problem _can_ be addressed considerably through adoption of a common governance model +- How do you get there? +- Changing existing models and cultures is hard +- Participating with another legal entity the responsible party will still stay the responsible party +- In a joint partnership the new model has to distribute the risk across the partnership group +- MRC example by Paul Colville-Nash: accessing GPs in 4 nations, relationship between parties +- Infrastructure requirements can be quite different, standardisation would need to encompass these +- Old TREs have very similar model but different naming +- How do we get there? and how it all works? +- How do we think the world works? do we bring data to researchers or researchers to data? +- What feels more trustworthy from data owners view +- How safe are the existing TREs? +- How a structure can enable multi-TRE collaboration +- Are there examples where this has been done successfully? +- Use-case with rare disease where large sample size needed (but needed complex sharing agreements) +- COVID: federation was not solved +- Funder looking at global population level +- To what extent do we need a completely new framework, if changing existing arrangements is too hard? Ripping up existing governance arrangements carries risk of undermining trust. +- should we be moving faster in this space?
getting any type of data takes time and we may need to let time solve things +- Funders would challenge establishment of a new TRE +- the wave is changing to establishment of TREs inside TREs with shared governance models +- the existing TRE is a free market, should we and do we want to introduce new 'TRE products' _within_ existing trusted TREs, that are designed around ability to federate/carry out multi-TRE analyses. That way data providers benefit from the experience of the established host TRE. This can enable a gradual migration from one governance model to another without needing to rip up entire TRE governance model (analogous to a software migration). Can start with small number of such TRE2.0 products and expand more organically (land and expand). +- traumatic brain injury hub is a secure dataset which is living in a separate hub within DPUK +- is this a tailored access model per dataset? We need to separate out the data set from the access and analysis of them - SDE vs TRE. And then allow the existing IG to see how the new approach is a small acceptable growth. +- TRE operators have to convince data providers not just to provide the data, but also to curate it and document it, repeated n times, all required before you can provide the data to a project. Onus on data owner not consistent, this is often done by the TRE. Very costly to undertake. +- 'Ground truth' of datasets/versioning if multiple versions are held across different TREs? 
+ +### Next steps + +- Bringing the governance models close enough to each other to be harmonised enough to give FAIR data as the way forward +- to build a communication line between the legal entities called 'Data Owners' giving their data to 'Data Providers' to facilitate a balanced risk and outcome delivery **Primary objective in this model is to reach a balance and lower cost** +- Talk lines between TRE-community and government for implementation of regulations +- For specific use cases we have achieved it, where to go next is how to expand the solutions to a more general population-wide use case in multi-TRE work +- Streamline a repeated process for provisioning data via different paths +- In some sense we have the opportunity it feels. ie Cloud based models offer a way of rebuilding new +- Get rid of different instances of similar entities in a cloud environment +- Adaptation of the existing models (or creation of a new one???) +- We as a community need to be lobbying for a change of focus for data owners / controllers to make sure that they recognise that they have a purpose and responsibility to make the harder research problems possible, the time to research faster and the costs to do the research lower (time and £). If not this community then who? We need to act as a lobbying function into government policy makers. diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-nhs-sde.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-nhs-sde.md new file mode 100644 index 0000000..35d7dde --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-nhs-sde.md @@ -0,0 +1,40 @@ +# NHS England’s SDE Network + +**Chairs**: Adam Keeley (University of Leeds), Madalyn Hardaker (KCL) + +## Prompts + +- What is the future of independent TREs for health data? +- How will primary data collection and linkage be managed in the SDE network? +- What, if any, will be the accreditation standards SDEs hold them to?
+- What are the current options for unmet technical capability? +- Do you foresee the new data access policy changing your workflows at all? If so, how? + +## Summary + +The NHS network, and direction, was discussed. + +The main principle is that health data will exist within SDEs, and that sub-optimal solutions will likely exist in the short term before full maturity of the SDE network is reached. + +The NHS SDE team hopes to be more directly involved with the community going forwards. + +## Raw notes + +- High level context of definitions +- 11 Regional SDEs across England + National NHSE SDE +- Different types of SDEs + - some curate only + - some curate + have technical environment for analysis +- NHS Research SDE Network - HDRI Gateway +- Challenges in federation, working across multiple TRE +- Direction of travel is that health data will exist in SDEs - policy will default to SDEs +- Do you go for a fully federated set up? +- Requires common set of standards +- There's a long road ahead in terms of maturity, will likely have to temporarily settle for sub-optimal solutions as the situation evolves +- Link to NHS England National SDE information - https://digital.nhs.uk/services/secure-data-environment-service +- Is there a way to get other TREs accredited for the network? 
+ - Probs not for now, they won't be planning to accredit others outside the network + +### Next steps + +- Have an NHSE SDE policy team person attend a future session as well (March 2024) diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-ppie.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-ppie.md new file mode 100644 index 0000000..cce3e08 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-ppie.md @@ -0,0 +1,43 @@ +# Community-based efforts and collaborations in public involvement and engagement + +**Chairs**: Tudor Besleaga (Sector Health Ltd), Claire Macdonald (Manchester University NHS Foundation Trust) + +## Useful reading: + +- [Consensus Statement on Public Involvement and Engagement with Data Intensive Health Research](https://ijpds.org/article/view/586/2829) +- [Goldacre Review](https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1067053/goldacre-review-using-health-data-for-research-and-analysis.pdf) - Chapter 5 Information Governance, Ethics and Participation + +## Prompts + +The need to conduct PPIE is well-recognised by the health research community at large, but how can we run it effectively? +There will be a brief presentation on public expectations from using personal health data for research. + +- Which TREs are carrying out public engagement currently? (dedicated person or additional responsibilities) +- How many community members require PPIE support? Enough for a working group? +- How to overcome health / tech literacy limitations during public engagement activities? +- Public involvement - web landing page for public? Curated newsletter? + +## Summary + +This session was a broad exploration of how the public can better engage with both TRE teams and the UK TRE Community. + +Next steps focused on encouraging those working within PPIE spaces to join the UK TRE community.
+ +## Raw notes + +- Who is doing PPIE work in TREs - is it dedicated roles or additional work to other jobs? +- What support would be useful for people within TREs and researchers with regard to PPIE +- Is a survey needed to find this information or does this info already exist in previous landscape surveys e.g. PEDRI or DARE UK? +- Routes for the public to engage - a landing page or signpost to on TRE Community? +- Are there any general Public-requirements for TREs? + build upon section 4.8 in https://satre-specification.readthedocs.io/en/v1.0.0/pillars/supporting.html +- How TREs track and communicate data used - beyond just project info? Can patients find out what data is used? +- Discussion around opt out and impact on data of who it is +- How many PPIE professionals or people with PPIE as a priority joining the TRE Community sessions and mailing list? +- + +### Next steps + +- Discuss TRE Community at next PEDRI Delivery Group meeting and make sure updates and opportunities are communicated to TRE Comm network. PEDRI currently planning their focus for next year +- Explore PPIE support opportunities in bringing agency to the citizen +- Encourage PPIE roles from TREs to join TRE Community diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-provenance-metadata-standards.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-provenance-metadata-standards.md new file mode 100644 index 0000000..49e026a --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-provenance-metadata-standards.md @@ -0,0 +1,35 @@ +# The role of standards for provenance and metadata in federated TRE architectures + +**Chair**: TBA + +## Prompts + +- What metadata do you provide to researchers about data released to them? +- Do you conform to any existing standards of metadata when providing to researchers? Which standards? +- Do you utilise any metadata standards when publishing data externally? Which standards? 
+- Do you share/publish code with researchers? Why or why not? +- What are barriers to tracking and sharing provenance and metadata between organisations? What are ways to improve this? +- Outputs produced by technical solutions for provenance tracking might be inaccessible to non-technical decision makers. What can we do to address this? +- To what extent are rights/licence/access-conditions currently described _formally_ in metadata? + +## Summary + +The importance of data provenance was discussed, and how to ensure data provenance and additional metadata can be managed properly. + +There is still little common understanding of requesting metadata, and of data provenance requirements. + +Next steps are to map out these requirements for the community. + +## Raw notes + +- Key provenance questions: Where did the data come from? What data cleaning steps were applied? +- Big Data Approach -> collect as much as you can as you don’t know what you might need in the future. Provenance is important to preserve the knowledge that would otherwise disappear when people leave, etc. (e.g., longitudinal studies spanning a number of decades) +- Lots of provenance information kept in a human-readable form (screenshots, pdfs, text files) - it is also a preferred way of consuming such data (e.g., researcher wants to see a pdf of the questionnaire that was used). Processing provenance at scale is difficult as it is resource intensive. +- Closer Discovery platform (https://discovery.closer.ac.uk/) was mentioned as an example of a metadata management platform for longitudinal population studies. +- There is still no common way of requesting metadata – everybody is asking for data using different templates, which is very challenging for data processors.
+- It has been highlighted that provenance could be problematic if it exposes too much detail (e.g., potential privacy implications) +- Provenance could be useful in determining if the data was indeed used in line with the permissions that were given. Currently this is found out only via a manual process (e.g., publication for certain study type mentions the data but the permission was only given for a study of another condition) + +### Next steps + +- Map out provenance requirements for the TRE community? diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-tre-federation.md b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-tre-federation.md new file mode 100644 index 0000000..9e88fee --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/discussion-tre-federation.md @@ -0,0 +1,96 @@ +# TRE Federation: appetite for a standards working group(s)? + +**Chair**: Rob Baxter (DARE UK) + +## Prompts + +Federating TREs to enable linked-data research is a big topic, with lots of current activity. +DARE UK have tried, in the spirit of George Box[^1], to draw the current threads together into a common architecture. +But what next? Forming a community working group to develop the architecture and technical standards seems like a good thing to do. + +- How much enthusiasm is there to form a WG? Assuming there is some… +- What should its scope be? Infrastructure standards? Query standards? Data? Governance? +- What starting points make sense? DARE UK architecture (v 2.0 coming soon)? Current tech ideas? +- What outputs would we aim for, on what timescales? +- How best would we organise and collaborate? Docs on GH webpages, with issues, boards etc? Something else? + +[^1]: _All models are wrong, but some are useful_ + +## Summary + +Conversation started by asking what work is already being done in this space.
+This included exploring pre-existing standards, whether there is already a shared definition of federation, and where the use cases for federation already exist. + +The need to map the ecosystem was highlighted, to show where the influence lies and how key decisions are made - especially in spaces where the data is not allowed to move out of internal systems. + +The need to factor in data and information governance, in tandem with any technical solution, was highlighted. + +Next steps include setting up a high-level federation working group, and using this to explore critical topics within the idea of federation. + +## Raw notes + +- Important to note that there are organisations that are already working in this space - their meetings will be set. +- Question: what other working groups exist and why do we need a new one? What aren't they covering? +- Question: Are there any existing standards? + - Small scale projects have succeeded - some funded by DARE UK - but not broad demonstration... yet? + - Teleport: + - https://dareuk.org.uk/driver-project-teleport + - https://dareuk.org.uk/preserving-public-trust-in-the-evolution-of-trusted-research-environments-teleport-federated-data-access + - TRE-FX: + - https://dareuk.org.uk/driver-project-tre-fx/ + - Final report from the TRE-FX DARE UK project: https://zenodo.org/records/10055354 +- Question: Do we have a shared definition of federation? + - How is that different to data pooling? + - Should we be aiming for a meta-learning? distributed learning? + - Potential definition: federation is a group of TREs that trust each other + - Doesn't have to involve moving data... but can... +- Examples of federation in the genomics space + - Question: what does it mean that the federation is specific to genomics? + - No limit on the federation architecture but front end standardisation would need to assess the data for egress. +- Question: Do we have a list of use-cases that describe the need for federation to exist.
+ - Yes for small ones but not coherently collected in one place. + - One idea: run a workshop to build up "grand challenge" style definitions +- From the Zoom chat: + - The 'trustworthy' and 'governance' bits depend entirely on what information flows over this inter-SDE network. + - https://dash.ohdsi.org/research + - https://www.datashield.org/about/about-datashield-collated + - the above are examples of existing use-cases +- Note that the individual TREs publish many more research outputs than OHDSI - interested to know why that is - what are the needs that OHDSI aren't meeting? +- Improved analyses / insights / inference with very large data sets + - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6451771/ + - Note though that without a common data model it is very hard to combine data for such large scale analyses. + - Common data standards are very expensive to implement. + - Not total agreement in the room. + - But note that the mapping to a data standard - eg OMOP - is very variable. +- Note that many government departments / organisations are focusing on "data not moving" - so where is the space for federation? + - "lift and shift" stops being able to scale very well.... +- Need to map the ecosystem of working groups + - Need to understand who has the power and influence... + - "A federated protocol for the working groups!" +- Note that the design of a TRE is conceptualised to bring researchers in and to NOT let data out... + - So do we need a concept beyond a TRE? Do we need a new phrase / name? + - TREs that are designed from the outset to federate? +- DataSHIELD has been around for 10 years - not complete but huge open source community of researchers.
+ - And there is this -> https://www.hpe.com/us/en/solutions/ai-artificial-intelligence.html +- Data always move - even Genomics which are years ahead, the raw data does not move but the findings from the genomics do - that's data +- NHS SDE - adopting OMOP I believe - seeking to share maps across SDEs rather than each doing own +- I think the big challenge is around people understanding this space and removing the IG blockers. +- The NW SDE is 'federated' by design. +- How does anyone validate federated analyses if data doesn't move! You can't see the outputs... +- Addressing the techie part of federation without factoring in the data & governance pieces is only a fraction of answer. + - These things do run at different speeds + +### Next steps + +- Propose a high-level federation WG (IG?) as an umbrella/place to start, using the UK TRE/DARE UK/RDA model + - Prob host on GitHub somewhere + - Use DARE UK Federated Architecture Blueprint v2.0 to seed (coming soon!) +- Within that, tease out the best approaches to key topics: + - Federation terminology (cf. Pete/Madalyn's idea, and the [DARE UK Driver common vocab](https://docs.google.com/document/d/1SJ6CJG8yHzsvtU7MyzdNOF_S0fZVJb_i/edit) ) + - What is a TRE? Is it an individual "cluster of boxes", or can it be a formal federation of a number of "clusters of boxes"? (a super-cluster?) + - What does TRE accreditation for a piece of public cloud mean, for instance? + - Is the NHS SDE Network a TRE in itself? + - What is the sound of one hand clapping? + - Governance around this broader idea of a TRE as a "federation of smaller sub-TREs" + - Is "thinking federal" from the get-go useful? Possible? + - Data & queries: how can we ever harmonise data enough for a federated query to return comparable answers? 
diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/index.md b/docs/events/wg_workshops/2023-12-05-december-meeting/index.md new file mode 100644 index 0000000..db8dce1 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/index.md @@ -0,0 +1,197 @@ +# UK TRE Community meeting - December 2023 + +:Date: Tuesday 5th December 2023 13:30-17:00 +:Slides: https://zenodo.org/records/10370931 +:Recording: https://www.youtube.com/watch?v=LbxLZudqjOA + +```{toctree} +:maxdepth: 1 +:hidden: true + +discussion-local-cloud-hosting +discussion-multi-tre +discussion-nhs-sde +discussion-ppie +discussion-provenance-metadata-standards +discussion-tre-federation +workshop-automated-output-checking-coi +workshop-manual-output-checking +workshop-package-allow-lists +workshop-researcher-passports +workshop-satre +workshop-tre-c4-architecture +``` + +## Background + +The UK TRE Community is a community of over 200 people that has grown organically over the last year for anyone interested in TREs, including researchers, operators, information governors, managers and more, from all sectors and disciplines. + +The core aims of fostering collaboration and sharing of innovative ideas to support the delivery of groundbreaking research with sensitive data have resonated across the UK and beyond. + +The community has a website, an active mailing list and Slack channel, and runs quarterly events like this for the community to come together, discuss ideas and problems within the TRE space and work collaboratively together on possible solutions and ways forward!
+ +## Agenda + +| Time | Agenda Item | +| ------------- | -------------------------------------------- | +| 13:30 - 13:45 | Welcome and intro | +| 13:45 - 14:30 | [Keynote + discussion](#keynote) | +| 14:30 - 14:45 | [Community updates](#community-updates) | +| 14:45 - 14:55 | Break | +| 14:55 - 15:00 | Intro to breakout session 1 | +| 15:00 - 15:45 | Breakout session 1 [(see below)](#session-1) | +| 15:45 - 15:55 | Break | +| 15:55 - 16:00 | Intro to breakout session 2 | +| 16:00 - 16:45 | Breakout session 2 [(see below)](#session-2) | +| 16:45 - 17:00 | Wrap up | + +### Keynote + +A talk by the Community Management Working Group on the latest developments, funding and plans for the community. + +#### Summary + +The talk discussed why the UK TRE Community exists, and the plans for the community for the next 4 months. + +The community considers itself to be a connecting force in the already very active and energetic UK TRE space, helping projects and teams in bringing their work to the wider community, achieving consensus and moving towards a shared approach to TRE provision across the UK's 4 nations. + +The community shared an initial vision and mission for the community: + +```{admonition} Vision +:class: hint + +To have a four nations, coordinated, collaborative approach to research infrastructure for working with sensitive data +``` + +```{admonition} Mission +:class: hint + +To provide the space to signpost, share knowledge and facilitate conversations across the UK TRE landscape +``` + +The talk then went into detail on the proposed work for the next 4 months. +This was based on the proposal submitted for the DARE UK Community groups funding. 
+More detail can be found in the [full proposal](https://zenodo.org/records/10593493) + +### Community updates + +_A chance for anyone in the community to share quick updates with everyone on the call._ + +#### SDEs, TREs etc - terminology and definitions working group introduction + +_Pete Barnsley & Madalyn Hardaker_ + +**Contact**: [Madalyn Hardaker](mailto:madalyn.hardaker@kcl.ac.uk) + +Pete advertised a new working group to be formed called `SDEs, TREs etc - terminology and definitions: A clear lexicon for the community's architecture`. + +There will be two open sessions in January: + +- Tuesday 9th January 2pm-3pm +- Wednesday 17th January 3pm-4pm + +To cover topics like: + +- Expectations +- Ideas and scope +- Plan for completing output +- Liaising with other related working groups + +Plan is to present at the March meeting, and have a first outcome by Spring. + +#### The citizen and their increased / retained agency working group introduction + +_Pete Barnsley_ + +Pete advertised a new working group to be formed called `The citizen and their increased / retained agency`. + +There will be two open sessions in January: + +- Tuesday 9th January 11am-12pm +- Wednesday 17th January 10am-11am + +To cover topics like: + +- Expectations +- Ideas and scope +- Plan for completing output +- Liaising with other related working groups + +Plan is to present at the March meeting, and have a first outcome of a whitepaper. + +#### An architecture for extending the domain of control - custody of data working group introduction + +_Pete Barnsley_ + +Pete advertised a new working group to be formed called `An architecture for extending the domain of control - custody of data`.
+ +There will be two open sessions in January: + +- Wednesday 10th January 10am-11am +- Tuesday 16th January 2pm-3pm + +To cover topics like: + +- Expectations +- Ideas and scope +- Plan for completing output +- Liaising with other related working groups + +Plan is to present at the March meeting, and have a first outcome of a whitepaper. + +#### Researcher passports + +_Emily Jefferson_ + +**Contact**: [Loki Sinclair](mailto:loki.sinclair@hdruk.ac.uk) and [Fergus McDonald](mailto:Fergus.McDonald@dareuk.org.uk) + +There may be funding from UKRI for researcher passports. + +The team would like to work with teams to see what would be helpful in this area. + +#### AWS Research Engineering Studio + +_Simon Li_ + +**Contact**: [Simon Li](mailto:spli@dundee.ac.uk) + +AWS are revamping their TRE offerings. +They have a new open-source product "Research Engineering Studio" which replaces https://github.com/awslabs/service-workbench-on-aws + +More info can be found here: https://docs.aws.amazon.com/res/latest/ug/overview.html and here: https://github.com/aws/res + +HIC are test-driving it, and are keen to share notes and ideas with anyone else looking or planning to try it. + +Get in touch with Simon if you are interested and he'll set up a meeting for January! + +#### Community of Interest on semi-automated evaluation + +_Jim Smith_ + +Jim advertised the breakout session happening on the DARE UK Community of Interest on semi-automated evaluation. + +### Breakout sessions + +Breakout rooms were a mixture of workshops (more structured sessions designed around a particular problem area, focused on moving towards an output) and discussions (less structured sessions to explore general areas of interest within the UK TRE space), as well as open rooms to have general discussions. + +There were two sessions on the day of 45 minutes each. + +Summaries and notes are available on the links below.
+ +#### Session 1 + +- [](./workshop-researcher-passports.md) - workshop +- [](./workshop-automated-output-checking-coi.md) - workshop +- [](./workshop-tre-c4-architecture.md) - workshop +- [](./discussion-local-cloud-hosting.md) - discussion +- [](./discussion-tre-federation.md) - discussion +- [](./discussion-ppie.md) - discussion + +#### Session 2 + +- [](./workshop-package-allow-lists.md) - workshop +- [](./workshop-manual-output-checking.md) - workshop +- [](./workshop-satre.md) - workshop +- [](./discussion-nhs-sde.md) - discussion +- [](./discussion-provenance-metadata-standards.md) - discussion +- [](./discussion-multi-tre.md) - discussion diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-automated-output-checking-coi.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-automated-output-checking-coi.md new file mode 100644 index 0000000..0e1c143 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-automated-output-checking-coi.md @@ -0,0 +1,61 @@ +# Evaluation of Automated Output Checking and AI Model Risk Assessment Community of Interest + +**Leads**: Jim Smith (University of the West of England), Jackie Caldwell (PHS) + +## Proposal + +### Summary + +The session is intended to give an overview of the Community of Interest on automated risk assessment. +This naturally includes the risk assessment of AI models, since this is not something that can be done manually. + +The aim is to: +(1) introduce the Community of Interest group, and its aim of reducing barriers to adoption of (semi) automated checking, +(2) make some proposals about how we plan to move forward on this alongside UK-TRE, +(3) get feedback from people present about how they would like to see the community develop and work + +### Preparation + +No required preparation beyond an open mind!
+ +If people would like some perspective on where the project has evolved from, it might be useful to skim-read the first 5-6 pages of [the SACRO project's final report](https://zenodo.org/records/10055365) + +But **please note** this Community of Interest has a broader remit than just SACRO - for example, by design we include projects such as DataShield, as well as other approaches for assessing ML privacy leakage. + +### Target audience + +No specific target audience in mind - anybody interested! + +## Session + +### Summary + +The workshop explored projects already exploring these issues, what the priorities of the community should be, and how to align everything already happening in this space. + +A name was also chosen for the community! +ReBOT, Reducing Barriers to Outputs from TREs. + +Next steps include setting up a Jisc mailing list for the community, and a simple accessible guide, either written or video. + +### Raw notes + +- Start of a community of people looking at these tools +- Some projects have started tackling issues: DataShield, ACRO, GRAIMatter, SACRO +- Defining best practice + - Aligning findings of these projects with SDAP manual +- Remove barriers to adoption by researchers + - Weekly drop-in sessions for ACRO, AISDC, SACRO, etc + - Email support service +- Name of the community? ReBOT, Reducing Barriers to Outputs from TREs +- What does everyone think? + - Manual egress is just not scalable + - What are we actually trying to protect? From who/what?
+ - Statistical disclosure policy, flowchart to follow, if not straightforward then document decision making, take consensus of senior members of team + - [Public Health Scotland: Statistical disclosure protocol](https://publichealthscotland.scot/publications/public-health-scotland-statistical-disclosure-protocol/public-health-scotland-statistical-disclosure-protocol-version-21/) + - [Handbook on Statistical Disclosure Control for Outputs](https://ukdataservice.ac.uk/app/uploads/thf_datareport_aw_web.pdf) + - [ICO AI Toolkit](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/how-do-we-ensure-lawfulness-in-ai/) + +#### Next steps + +- Create new JISC mailing list +- Create simple accessible guide, either written or video diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-manual-output-checking.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-manual-output-checking.md new file mode 100644 index 0000000..bf902a3 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-manual-output-checking.md @@ -0,0 +1,56 @@ +# Streamlining manual output checking + +**Lead:** Rachael Williams (MHRA) + +## Proposal + +### Summary + +This workshop will bring the community together to brainstorm strategies and tools for streamlining manual output checking in TREs. + +Whilst automated output checking is going some way to minimising bottlenecks, manual verification is seen by many as a crucial step in maintaining confidentiality and in making sure that outputs align with the intentions of individual projects. +This manual step can be time-consuming and prone to errors. + +Participants will brainstorm effective techniques for optimising this process, including best practices to enhance efficiency without compromising governance. 
+ +By the end of the workshop, it is hoped that participants will be equipped with practical insights to enhance the speed and accuracy of manual output checking, ultimately improving the overall research workflow. + +### Preparation + +Please bring your experience, ideas, and questions, around what works and what doesn’t in the world of manual output checking, such as checklist development, collaborative workflows, and quality assurance measures. + +### Target audience + +All involved in manual output checking – both from a policy and procedures perspective, and with hands on experience. + +## Session + +### Summary + +The room discussed how manual methods of output checking can be connected to more automatic methods, for instance SACRO, and how organisations can transition from purely manual methods to more automated ones. + +Tips were also shared on how to make manual checking simpler. + +### Raw notes + +- SACRO provides a set of drop in tools that researchers use alongside R or Stata - and at the stage they want to create an output they type "acro". It will then run checks and produce an output - highlighting if there are potential issues. Sometimes the automated checks won't apply - and an exception request can be submitted. +- For machine learning models there are a number of things that can be checked. The gap (where work is ongoing) is how to apply the methods for traditional methods to machine learning models. A pool of expertise is being built in this area. +- Organisations which have always used a TRE, where researchers are used to following certain procedures, may have fewer issues than organisations who have used other models of data access in the past (in terms of volumes of exception requests). + +- TIPS + + - Try and encourage researchers to do more than the minimum - encourage them to undertake the Safe Principals course so they understand how to police themselves. 
+ - Request a certain format/template to make it easier for both automated and manual checks. + - Encourage project lead to approve prior to submitting for automated checks. + - Options such as Azure Cognitive Search have been used successfully in the financial sector previously to look for individual PII. + - SACRO also provides options for adding comments that can be shared between researcher and reviewer through the process. + +- KEY MESSAGES + + - Early engagement and education with your researchers is key. + - Explore tools that can do automated checking - but allow for the ongoing conversation. + +- LINKS + - SACRO final report: https://zenodo.org/records/10055365 + - GitHub repos and more docs: https://github.com/AI-SDC + - ICO toolkit: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/guidance-on-ai-and-data-protection/ai-and-data-protection-risk-toolkit/ diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-package-allow-lists.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-package-allow-lists.md new file mode 100644 index 0000000..99462d2 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-package-allow-lists.md @@ -0,0 +1,34 @@ +# Package allow lists + +**Lead:** Jim Madge (The Alan Turing Institute) + +## Proposal + +### Summary + +The [packages repository](https://github.com/uk-tre/packages) was created recently as a place for TRE operators to share their decisions on packages allowed in their TREs. + +The lists are structured as JSON documents conforming to a schema, which sets out what information is required. +In this way, the lists can easily be validated and parsed to be used by other tools. + +In the workshop we'll discuss adding your allowlists to the repository, making improvements and building tooling to use the lists. 
+ +We hope that sharing our decisions like this will encourage us to be honest and confident about our security and to benefit from using the data. + +### Preparation + +Attendees are encouraged to look at [the repository](https://github.com/uk-tre/packages) beforehand and bring ideas to the workshop. + +For example, you might want to add your own organisation, you might want to propose a change to the schema, you might want to add tooling to create, modify or analyse the lists. + +A good output would be to open a pull request or issue suggesting improvements. + +### Target audience + +TRE builders and operators with an interest in managing packages in TREs. + +The session will be quite technical - the repository currently uses JSONSchema and Python. + +## Session + +This session did not take place. diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-researcher-passports.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-researcher-passports.md new file mode 100644 index 0000000..1acb529 --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-researcher-passports.md @@ -0,0 +1,140 @@ +# Researcher Passports + +**Leads:** Loki Sinclair (HDR UK) & Fergus McDonald (DARE UK) + +## Proposal + +### Summary + +The purpose of the session is to get a sense from the TRE Community – primarily those running and operating SDEs/TREs – that if a nationwide service were available to verify the “Safe People” criteria would that be beneficial to SDE/TRE operators? If so, what are the requirements for such a service? We would also like to understand the appetite for supporting single sign on across TREs/SDEs. + +### Preparation + +We’d encourage community members to reflect on the following questions: + +- Are you aware of any other groups developing a “researcher passport” that you think we should know about or be collaborating with? 
+- As someone who maintains a secure data and/or a trusted research environment, what factors do you consider most when approving researchers “safe people” (from 5 safes model). E.g. + - What training courses they have been on? + - Depth of experience of working with sensitive data? + - Previous breaches of your terms of access or other policies? + - Location i.e. Are they physically present in the UK? + - Nature of relationship between the researcher and their organisation? + - i.e. Are they an employee, secondee, under an honorary contract etc? + - Organisation a researcher works for? + - If there is an agreement in place with the organisation + - Identity validation such as a copy of driving license? + - ORCID? + - Ethical approvals? + - Data governance access approvals? + - Projects they have done research on? + - Any conflicts of interest/commercial interests e.g. works for a university but also is the director of an SME working in the same field? +- Similarly, what factors do you consider when approving institutes and/or organisations? +- What evidence must a researcher provide to prove compliance with your requirements/the factors above in the eyes of your organisation, prior to being granted access to your environment? +- What evidence must the host organisation of a researcher provide to be considered a “safe setting” organisation to work with? +- If a system could collate all the information you need to consider a bona fide researcher into a single place for review, would this be helpful for you? Are you likely to use such a system? +- In an ideal scenario, what method of singular authentication and researcher approval would you envision streamlining your entire process? + - Feel free to brainstorm and outline an optimal software solution that not only caters for user authentication (I.e. SSO), but also verification and data access being granted. +- When dealing with inappropriate data use, what processes do you follow? 
+- If a system could offer a complete history of a researcher’s data access approvals from other institutes and potential red flagging, would this be considered helpful or harmful? +- If there was a UK wide solution supporting single sign on for researchers accessing sensitive data, how likely are you to support such an access model within your existing methods for researcher account authorisation? +- Federated analysis projects are likely to require a single sign on for researchers across TREs. To support such projects do you have any preferences on what single sign on/authorisation standard is adopted across the TRE community? Any thoughts on what would help support these types of projects? +- What is the most important aspect when considering feature development and implementation of external interoperable systems within your institute? For example: + - Do you consider ease of implementation over feature-set or does feature-set on offer ultimately determine solution adoption? + And finally... +- Would you be interested in working with us to provide requirements and test a researcher passport solution? + +#### Reading: + +- Brophy, R., Bellavia, E., Bluemink, M. G., Evans, K., Hashimi, M., Macaulay, Y., McNamara, E., Noble, A., Quattroni, P., Rudczenko, A., Morris, A. D., Smith, C. and Boyd, A. (2023) “Towards a standardised cross-sectoral data access agreement template for research: a core set of principles for data access within trusted research environments”, International Journal of Population Data Science, 8(4). doi: 10.23889/ijpds.v8i4.2169.
+ - [Publication](https://ijpds.org/article/view/2169) + - [Template agreement](https://zenodo.org/records/8256235) + +##### Extracts from the [DARE UK Phase 1 Recommendations](https://zenodo.org/records/7022440): + +###### Federated identity and user authentication standards + +There is a need to identify – in collaboration with stakeholders from across the landscape – and drive forward the adoption of a common user authentication protocol by infrastructure providers. +Conceivably, this would need to be coordinated and overseen by UKRI itself, as it has the appropriate remit to act as such an authority. +A transparent, cross-domain, national approach could remove the responsibility from individual groups and therefore improve consistency and increase efficiency across the sensitive data research ecosystem. + +Stakeholders engaged with during DARE UK Phase 1 have highlighted that this is a prerequisite for all forms of federation to occur and will aide in the creation of a ‘research passport’ that is cross-linked to multiple regulatory bodies for verification and validation by data custodians. +Stakeholders also highlighted existing federations, for example the UK Access Management Federation for Education and Research, which will need to either be expanded or linked to other federations being created, such as NHS Care Identity Service 2 (CIS2) or GovRoam. +The existence of modern industry and community standards of user authentication (for example, SAML, OIDC, OAUTH2 and Global Alliance for Genomics and Health (GA4GH) Passports) were also highlighted. +These existing standards should be leveraged as the basis for user authentication to allow for maximum interoperability at a national and international level. +As user authentication is a crucial component of a national TRE standard, stakeholders also highlighted the need to support different forms of identity verification and have logging and auditing embedded across the system. 
+ +###### Researcher accreditation + +A key requirement highlighted by stakeholders has been the need for a streamlined approach to researcher accreditation. +While there are a number of existing training modules for sensitive data handling (for example, those provided by ONS), many of these trainings are duplicative without allowing for equivalence or mutual recognition between modules. +Those engaged with during Phase 1 highlighted the need to develop a shared standard with service users and providers towards a federated approach to training content. +Modularisation was also highlighted as important to allow for flexibility to cater for specific data modalities or sensitivities, for example through ‘core’ modules as a standard foundation for all accreditation courses with the possibility of ‘extended’ modules in specific cases or contexts as needed. +Stakeholders affirmed that work to standardise and streamline the researcher accreditation process was sorely needed, along with reciprocal or mutual recognition of accreditation by different TRE providers. +Providers should aim to offer a consistent researcher experience across data access points, and ideally make the process feel as though the researcher were accessing data on their own machine when this is not the case. +Training could be made portable across TREs through standard accreditation for researchers acting as a TRE ‘passport’. +The Digital Economy Act, 2017 (DEA) already works as a passport in some respects, with shared accreditation existing across certain TREs. + +###### Private sector and international researcher accreditation + +Currently, private sector researchers can apply to become accredited researchers under the DEA, and therefore apply for access to data held within DEA-accredited research environments once accredited, via the same process as academic researchers. 
+In the context of UKRI-funded research, private sector researchers can also participate in sensitive data research in the public good as part of consortia led by a UKRI-approved research organisation. +However, there was widespread feedback from stakeholders engaged with during DARE UK Phase 1 that improving the ability for private sector researchers to collaborate on sensitive data research is important. +Participants of the DARE UK public dialogue wanted sensitive data to be made securely accessible to private sector organisations and did not see a need for data access requirements to differ for these organisations, as long as the research is motivated by public benefit over financial profit and there is transparency throughout the research lifecycle (see Chapter 3: Demonstrating trustworthiness). + +### Target audience + +TRE operators and/or information governance professionals (as related to the requirements for approving researcher access). + +## Session + +### Summary + +It was highlighted how resource intensive the current manual researcher approval process at UK Biobank is, and how an automated system would help with this. + +Achieving the perfect system may take a long time, but an MVP-style system that verifies researchers' identity and project history could be a good starting point. + +There was a question of whether an independent body needs to exist to accredit these passports, or whether providing appropriate researcher details would be sufficient on a TRE-by-TRE basis. + +Further discussions centred around processes to interrogate validity of researcher records, how to record 'untrustworthy' researchers, and how to accommodate different levels of permission for different researchers, depending on how sensitive the project they are working on is. 
+ +### Raw notes + +- UK Biobank working with additional biobanks in pipeline (international perspective additionally challenging) +- 6 FTE managing the overall workload of researcher approval +- Do not make a quality assessment of a researcher +- Main process currently followed: + - Applicant must supply CV + - Applicant supplies any relevant papers/publications + - Email / Telephone number provided + - Institute contacted to check alignment +- Since inception, ~30,000 applicants. Very manual, painful process. Process takes between ~10-60mins per application. +- Would other TRE/SDEs be willing to use the collected researcher verifications already accepted from UK Biobank DB? +- An automated system that provided a full history of researchers would free up time from our point of view. We don't vet their skills or what coding languages they're familiar with. Just that they are who they say they are. +- The perfect system could take "15 years" to cover all eventualities. Start with the basics to build an MVP style verification. A global "fits all" will likely end up in nothing being delivered. +- Bona fide researcher + - Approved researcher training (e.g. ONS/UKSA approved researcher training) + - Is a system like this needed? Yes, but adoption becomes key. Our numbers are smaller than those of very established TREs. + - The Five Safes framework already sets the path for people verification + - Process: + - Due diligence against bona fide "employee and employer" - treated as a single entity by LLC. + - Conflicts. Look at CV, training and track record. How people who have multiple employers may have conflicts in application. + - Question: + - Who is going to do the accrediting of these passports? + - TREs are a step removed from this and it should be external to us and better by an independent adjudicator + - Does anyone _need_ to accredit a passport, with full history of researcher activity being sent - would that be enough to make a decision at TRE level? 
+ - It's about how trustworthy the details are? Due diligence can't be based on "trust" alone. +- There is a marginal benefit to a system that could provide this information in a passport "stamping" fashion. +- Only UK accessible (at least for now). Requires ONS researcher training. Contact institution for validation. Ethics project level review process, where it confirms UKSA template, and if a project meets criteria then a panel review for "public good". + - One of the key issues is about timeliness of accessing data. Current records that ONS and/or UKSA have don't reflect changes. + - How do you "block" someone off if they have damaged another TRE? More from the operational aspect + - How can we trust the information - implication being that we can interrogate and get answers, but can we trust the data being entered? + - Who can accredit? + - What secure methods can be designed to prevent tampering? + - Part of the process would mainly be removing the tedious processes behind the vetting. To some extent it doesn't need to _say_ anything about the history, just provide key indicators. + - Discussions surrounding structure about "what do we mean", "who do we trust". Implementing a system as an ORCID++ to later build on trust info to interrogate would be a good starting point. Separate issue about trustworthy facts being available and current / "live" can come later. +- How do we capture the 'experience' of the user? + - H-Index like system tied to an ORCID? +- After a series of projects, open subscription account as reward. Self-triaged system for FO, in a more trusted env than machine. Demonstrating a trusted user. +- OpenSafely have differing tiers. Defining different levels of "permissions" depending on status of trust for user. +- History of TRE access would lead to federation of mapping users with access to multiple environments. +- Building 'credit' feature to define level of trust and access tier. 
diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-satre.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-satre.md new file mode 100644 index 0000000..ec9cf9b --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-satre.md @@ -0,0 +1,99 @@ +# SATRE next steps + +**Lead:** Chris Cole (University of Dundee) + +## Proposal + +### Summary + +After extensive community engagement, often in conjunction with UK TRE, the first version of the SATRE Specification was launched in October. +It is by no means finished, but it is now stable and ready for use. + +The current phase of the SATRE journey is to "kick the tyres": use the evaluation process either to assess your TRE, if you have one, or to test your assumptions/knowledge of TREs. +SATRE was created quickly and there will be parts that can be improved. + +Today is a chance to find out more about the continuing plans for SATRE within the UK TRE Community, contribute and ask questions. + +### Preparation + +If you’re not already familiar with the SATRE Specification please have a look at it here: +https://satre-specification.readthedocs.io + +We also have two videos for everyone to view and feel free to share with your friends! + +- https://www.youtube.com/watch?v=auExNHEGwcc +- https://www.youtube.com/watch?v=kzUU5ljII0Q + +### Target audience + +Users, researchers, implementers of TREs and of course any public members who wish to help support the transparency and openness of the project. + +## Session + +### Summary + +SATRE was introduced to the room, before the idea of a SATRE working group was discussed, and how it could align with ongoing work in the UK TRE community. + +Alignment between UKSA and SDE accreditation was also discussed, as well as organisations carrying out SATRE evaluation. 
+ +Next steps centred on a January meeting to plan the working group's next steps, with a focus on helping other organisations evaluate themselves. + +### Raw notes + +#### What is SATRE? + +- DARE UK funded project +- Focus to be a community project +- Led by HIC, Dundee; The Alan Turing Institute; UCL +- https://satre-specification.readthedocs.io/en/latest/ +- Learn and explore +- Developed specification +- Now in maintain phase +- Considered public impact and had patient engagement +- Guide to build and run TRE +- UK Statistics Authority sees it as a stepping stone towards e.g. Digital Economy Act certification +- [Spreadsheet](https://satre-specification.readthedocs.io/en/latest/satre.xlsx) to evaluate your TRE against the 160 statements + +What capacity is required to create a viable SATRE Working Group? What tasks do we need to complete? + +- The NWSDE's Technical Design Authority (TDA) has a team looking at compliance of our tech and processes with SATRE (while we wait to find out more about the UKSA accreditation framework) +- What happens next? + - Repo could stagnate + - Funding could support ongoing maintenance/development + - What is needed to smooth out support + - What voices were missed out in the first phase? +- Similar questions have been asked about UK TRE Community +- Working groups fundamental, strongly hope to have support mechanisms available in the next phase + - SATRE Fundamental to day job of defining a federated architecture of TRE + - Pulled together driver projects to form architecture, SATRE forms part of that ongoing picture + - Third avenue is feeding into emerging working group at Research Data Alliance. See [Recent Plenary BoF session](https://www.rd-alliance.org/trusted-research-environments-sensitive-data-fairness-closed-data-and-processes) and [draft WG charter](https://docs.google.com/document/d/1877OtQyZ46QCHVZ8_1QqRgZJPXyImZdW1qYVyKItMSQ/edit). 
+- Great that SATRE involved public, reflecting this in the specification would be good for funding, and to highlight directly in the specifications where the project has reflected public/patient requests. +- There are specific statements that came out of public involvement. +- re above points, aligned with TRE Community application but understanding needs for funding and gaps is key. +- Build on public engagement, on that angle, nature of spec needs to be broad and high level, important in the public engagement section. Overlaps with other work in the wider community. +- People present happy to Chair a working group +- Happy to co-Chair while aware of appearance of conflict. +- Another call for the SATRE working group before Christmas to be ready for January and identify key priorities +- Happy to share what they are doing with the SATRE team to get feedback on their approach + +How do SDE Accreditation frameworks from UKSA and SATRE fit together? + +- And how do the UKSA/NHS conversations fit? + - Anyone know how those are going? + +Thoughts/comments on SATRE evaluation? + +- NWSDE are doing SATREfication of their TRE using GitHub rather than spreadsheet + - Have an issue for each statement + - Aim to get to green on mandatory/amber with exception in place in first phase + - Have created [issue to define spec in a machine readable format](https://github.com/sa-tre/satre-specification/issues/254) +- The SATRE scoring system could have subsection SATRE-PPI scoring how many of the public requirements are being met +- Consensus there is a lot of value in having evaluation shared and exploring how others have scored but also process that goes into evaluation. + +### Next steps + +- Book a time in January for next meeting +- Focus on evaluations with other groups +- PIE - overlaps with PEDRI and continue engagements +- Get alignment with Research Data Alliance? 
+- CC nominated as chair to lead SATRE working group and explore means of succession funding diff --git a/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-tre-c4-architecture.md b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-tre-c4-architecture.md new file mode 100644 index 0000000..00e351c --- /dev/null +++ b/docs/events/wg_workshops/2023-12-05-december-meeting/workshop-tre-c4-architecture.md @@ -0,0 +1,60 @@ +# Data architecture of a TRE with C4 modelling language + +**Lead:** Joe Leach (Tower Hamlets) + +## Proposal + +### Summary + +This workshop will run some research questions through a draft TRE design for a HDRC (Health Determinants Research Collaborative). +This design exposes the interfaces between architecture and trust for regulatory control of research data management. +We will demonstrate how to: + +1. Run a catalogue of fresh metadata describing a network of data controllers (council and health services) + - Enrich metadata (e.g. by describing data quality dimensions) +2. Support reproducible analytical pipelines that run inside a TRE to: + - Receive de-identified data from different sources, each of which has applied the same encryption key to identifiers. + - The encryption key is maintained by a third-party trust (e.g. another HDRC) + - Link de-identified records (e.g. education/housing/health) + - Run analyses in-place + - Perform statistical disclosure control +3. Output trusted analyses + +You’d be forgiven for thinking this TRE sounds like a sandbox, but the special ingredient here is the implementation of governance protocols, checking and balancing data uses at key turning points. + +To help communicate this approach to TRE design, we have experimented with analogies such as central reference libraries and air traffic control, and will workshop some examples to round off this thought experiment. 
+ +### Preparation + +Ideally participants would be familiar with [The Goldacre Review](https://www.goldacrereview.org/), though this is not critical! + +### Target audience + +Colleagues from data science, engineering, and governance. + +## Session + +### Summary + +This workshop introduced the idea of TREs within local council work, making specific reference to the London NHS SNSDE, and how Structurizr can be used for rendering C4 diagrams of data architecture. + +Next steps include publishing the design that was workshopped and discussed. + +### Raw notes + +- Brief introduction of London Borough of Tower Hamlets and Health Determinants Research Collaboration (HDRC) +- Brief Introduction - subnational NHS SDE +- The importance of metadata architecture as a foundation to TRE +- Introduction to C4 model: https://c4model.com/ +- TRE design in HDRC/ local authority: + - To comply with the data governance, the linked dataset (administrative and NHS data) needs to be stored in data havens that are accredited. + - Such challenges in data storage and handling make it difficult to facilitate a collaborative research environment under the current data system. + - As such, TRE provides a possible solution to offer a playground/ sandbox to allow researchers from council, community, academia etc. to access and analyse the dataset. + - Still, the team faces other issues: IG of NHS data, the pipeline to get the linked dataset out of the data haven to the TRE, etc. +- Regional London SDE applies to governance on health data, especially for linking +- Subnational SDEs not quite ready for this type of research! +- Usefulness of Structurizr for rendering C4 diagrams of data architecture - as a text based modelling language, you can do your version control with git! - you can incorporate data models (Entity Relations) as images at the lowest "Code" level perspective of diagramming + +#### Next steps + +- publish the design discussed? 
diff --git a/docs/events/wg_workshops/2024-03-14-march-meeting/index.md b/docs/events/wg_workshops/2024-03-14-march-meeting/index.md new file mode 100644 index 0000000..8b731c7 --- /dev/null +++ b/docs/events/wg_workshops/2024-03-14-march-meeting/index.md @@ -0,0 +1,48 @@ +# UK TRE Community meeting - March 2024 + +:Date: [Thursday 14th March 2024](https://arewemeetingyet.com/London/2024-03-14/09:30/UK%20TRE%20Community%20meeting) +:Time: 09:30 - 13:00 +:Registration: https://lu.ma/n7yybonh +:Location: Online + +## Background + +The UK TRE Community is a community of over 200 people that has grown organically over the last year for anyone interested in TREs, including researchers, operators, information governors, managers and more, from all sectors and disciplines. + +The core aims of fostering collaboration and sharing of innovative ideas to support the delivery of groundbreaking research with sensitive data have resonated across the UK and beyond. + +The community has a website, an active mailing list and Slack channel, and working groups tackling shared problems. +We also run quarterly events like this for the community to come together, discuss ideas and problems within the TRE space and work collaboratively together on possible solutions and ways forward! 
+ +## Agenda + +| Time | Agenda Item | +| ------------- | -------------------------------------------- | +| 09:30 - 09:45 | Welcome and intro | +| 09:45 - 10:30 | [Keynote + discussion](#keynote) | +| 10:30 - 10:45 | [Community updates](#community-updates) | +| 10:45 - 10:55 | Break | +| 10:55 - 11:00 | Intro to breakout session 1 | +| 11:00 - 11:45 | Breakout session 1 [(see below)](#session-1) | +| 11:45 - 11:55 | Break | +| 11:55 - 12:00 | Intro to breakout session 2 | +| 12:00 - 12:45 | Breakout session 2 [(see below)](#session-2) | +| 12:45 - 13:00 | Wrap up | + +### Keynote + +TBA + +### Community updates + +TBA + +### Breakout sessions + +#### Session 1 + +TBA + +#### Session 2 + +TBA diff --git a/docs/events/wg_workshops/2024-06-05-june-meeting/index.md b/docs/events/wg_workshops/2024-06-05-june-meeting/index.md new file mode 100644 index 0000000..6559048 --- /dev/null +++ b/docs/events/wg_workshops/2024-06-05-june-meeting/index.md @@ -0,0 +1,16 @@ +# Provisional: UK TRE Community meeting - June 2024 + +:Provisional Date: [Wednesday 5th June 2024](https://arewemeetingyet.com/London/2024-06-05/13:30/UK%20TRE%20Community%20meeting) +:Time: 13:30 - 17:00 (to be confirmed) +:Registration: Will open after the March 2024 meeting +:Location: Online + +## Background + +## Agenda + +### Keynote + +### Community updates + +### Breakout sessions diff --git a/docs/events/wg_workshops/2024-09-02-september-meeting/index.md b/docs/events/wg_workshops/2024-09-02-september-meeting/index.md new file mode 100644 index 0000000..d338927 --- /dev/null +++ b/docs/events/wg_workshops/2024-09-02-september-meeting/index.md @@ -0,0 +1,16 @@ +# Provisional: UK TRE Community meeting - September 2024 + +:Provisional Date: [Monday 2nd September 2024](https://arewemeetingyet.com/London/2024-09-02/09:30/UK%20TRE%20Community%20meeting) +:Time: 09:30 - 17:00 (to be confirmed) +:Registration: Will open after the June 2024 meeting +:Location: Hopefully in person at 
[RSECon24](https://rsecon24.society-rse.org/) and online + +## Background + +## Agenda + +### Keynote + +### Community updates + +### Breakout sessions diff --git a/docs/events/wg_workshops/2024-12-10-december-meeting/index.md b/docs/events/wg_workshops/2024-12-10-december-meeting/index.md new file mode 100644 index 0000000..57828b0 --- /dev/null +++ b/docs/events/wg_workshops/2024-12-10-december-meeting/index.md @@ -0,0 +1,16 @@ +# Provisional: UK TRE Community meeting - December 2024 + +:Provisional Date: [Tuesday 3rd December 2024](https://arewemeetingyet.com/London/2024-12-03/13:30/UK%20TRE%20Community%20meeting) +:Time: 13:30 - 17:00 (to be confirmed) +:Registration: Will open after the September 2024 meeting +:Location: Online + +## Background + +## Agenda + +### Keynote + +### Community updates + +### Breakout sessions diff --git a/docs/events/wg_workshops/index.md b/docs/events/wg_workshops/index.md index 39cddbc..e6f7f93 100644 --- a/docs/events/wg_workshops/index.md +++ b/docs/events/wg_workshops/index.md @@ -1,8 +1,23 @@ # Community Workshops +## Planned meetings + +```{toctree} +:maxdepth: 1 + +2024-03-14-march-meeting/index +2024-06-05-june-meeting/index +2024-09-02-september-meeting/index +2024-12-10-december-meeting/index + +``` + +## Previous meetings + ```{toctree} -:maxdepth: 2 +:maxdepth: 1 +2023-12-05-december-meeting/index 2023-09-04-september-meeting/index 2023-06-28-june-meeting/index 2023-03-29-march-meeting/index @@ -41,6 +56,6 @@ The meetings vary slightly from session to session, however they broadly follow ### How to get involved -Sign-up for the mailing list to be kept up to date with the latest events, and more: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1 +Sign-up for the mailing list to be kept up to date with the latest events, and more: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1 If you have any questions please email Hari Sood [(hsood@turing.ac.uk)](mailto:hsood@turing.ac.uk) or 
Simon Li [(spli@dundee.ac.uk)](mailto:s.p.li@dundee.ac.uk). diff --git a/docs/index.md b/docs/index.md index e590648..de8f023 100644 --- a/docs/index.md +++ b/docs/index.md @@ -10,11 +10,10 @@ background/index about/index ``` -```{admonition} UK-TRE September meeting +```{admonition} UK-TRE March 2024 meeting :class: attention -Registration for the [UK-TRE September meeting is now live!](events/wg_workshops/2023-09-04-september-meeting/index) -The biggest meetup of the UK TRE community to date with presentations, breakout discussions and a keynote from HDR UK on the future of TREs. +[Register for the UK-TRE March 2024 meeting](https://lu.ma/n7yybonh) (09:30 Thursday 14th March) now! ``` Welcome to the site! This site containts resources, reports, meeting notes, discussions and more associated with the UK Trusted Research Environment Community. @@ -37,8 +36,8 @@ We hope all members of the community will sign-up to this commitment. ## [](background/index) Find out more about the UK-TRE community, and how to join us -:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1 -:Slack channel: https://uktrecommunity.slack.com +:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1 +:Slack channel: https://uktrecommunity.slack.com/ ## [](about/index) diff --git a/docs/requirements.txt b/docs/requirements.txt index 8dd70d6..73c55ac 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,4 +1,4 @@ linkify-it-py==2.0.0 myst-parser==2.0.0 pydata-sphinx-theme==0.13.3 -sphinx==6.2.1 +sphinx==7.2.6