TRE sustainability
harisood committed Dec 19, 2023
1 parent 7c21d9f commit 7c9dae6
Showing 5 changed files with 110 additions and 89 deletions.
Original file line number Diff line number Diff line change
@@ -21,6 +21,7 @@ There was further discussion on how to effectively check ML models within TREs.
In the case of governance, it was suggested that a project working across multiple TREs should have a single governance process.

### Next steps

- Create a 'panel' focused on a specific type of data/research (e.g. health, crime, financial) that can oversee specific research projects within these fields

## Raw notes
@@ -1,7 +1,7 @@
# Governance of the UK TRE Community


## Attendees

- Chair: Hari
- Note-taker: Simon
- Tim Hubbard
@@ -11,65 +11,69 @@
- ??

## Overview

### Summary

### Next steps

## Raw Notes

- UK-TRE: Aims, purposes, should it take on a political/advocacy role?
- NHS: already have their plans for Governance
- but looking promising so far
- Datapact: Part of Data saves lives policy
- Not policy, but saying how NHS will treat your data
- Don't want to force too much information on public: they'll think you're trying to hide something
- Public engagement: not just telling them what will happen, instead enable citizens to make policy decisions
- Interest in academia about what to do, waiting for NHS to give guidance
- UK-TRE: should we lead, not just follow the NHS?
- Lead, provide input
- TREs are for much more than just healthcare data which NHS focusses on
- Universal selling point of UK-TRE: Diversity of audience, and pragmatism: people that are doing something
- Danger of just listening is you don't share your existing knowledge of what will/won't work
- Should put out position statements? Say things if you don't like something? The community should reach a point where what we say is respected. More powerful than individual submissions.
- Industry groups such as ABPI, BIO
- Provide inputs, write reports, represent a community and a voice
- Organisations need to sign up to show support
- Sign-up to UK-TRE? Or to position statements created by UK-TRE?
- E.g. IET (engineering professional institution) members can say what they're interested in on their profile. IET may respond to a Government consultation by asking members for input, and collating responses.
- Working groups/focus areas
- Needs resource/funding
- Does UKRI have something?
- Beyond UKRI, commercial?
- GA4GH:
- multiple levels of slices of funding
- 100s of organisations across 80 countries
- What should UK-TRE do?
- Be careful not to become just a bureaucratic institution that has some funding, people, writes reports.
- Balance
- Maybe a network that feeds up to DARE/HDR/ADR?
- USP would be it's practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off
- Proper focus groups would be much more expensive
- Some funding for community to organise meetings like this
-


## Roadmap plan

### Questions

- What would a solution to this problem look like?
- Ensure meetings remain attractive, not too officious
- Lightning talks good, reduces duplication
- Networking opportunities
- Long lunch
- People willing to invest time to travel
- "Stir people up and let them go"
- Beach! 🏖️
- No different from what we've got now
- More recognisable branding
- A home? What does "home" mean?
- A formal recognisable figurehead
- What resources would be needed (people, time, funds, infrastructure etc.)?
- Funding for someone to be a formal chair of UK-TRE
- Neutral funding for someone to run community, not funded directly by a single institution
- Maybe multiple people? E.g. coordinator, chair, community manager (junior/senior?), technical?
- Elected chairs to propose direction/funding? Probably too much.
- Instead have a steering committee
- How can this community support you in getting them?
- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively?
@@ -1,14 +1,16 @@
# Sight unseen: how far can we go with keeping data hidden from users?

## Overview

### Summary

### Next Steps

## Raw notes

- What are the advantages and disadvantages of hiding data from users?
- How do we minimise barriers and frustration when working with unseen data?
-
- X
- X
- X
@@ -17,29 +19,32 @@
- In what scenario would it be beneficial to keep data hidden?

- Federated analytics - [OpenSAFELY](https://www.opensafely.org) model. Allows you to see data that is structured the same as the original but filled with random (synthesised?) data.
- Can we provide sufficient metadata to allow for unclean or missing data?
- Additional challenge with more complex data (highly relational/linked databases)
- There is a need for code review before running on the original data
- Whose responsibility is it to create the metadata and do the cleaning? The data provider? The TRE (probably not)?
- On the question of how far we can take this:
- It can be possible, but there are limitations. Including reducing the chance of the results.
- Pros of hiding data:
- increase trust in research
- potential for higher quality research (no p-hacking, more hypothesis testing, less data mining, etc)
- There are some doubts about the value/need for this. Aren't TREs with anonymised data enough?
-
- X
- X
- X
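
The idea noted above, of letting users see data that is structured the same as the original but filled with random values, can be sketched roughly as follows. This is an illustrative sketch only, not OpenSAFELY's actual implementation or API: the `SCHEMA` columns, the type tags, and the helper names `dummy_value` and `make_dummy_rows` are all assumptions made for the example.

```python
import random
import string
from datetime import date, timedelta

# Hypothetical schema (column name -> type tag); not taken from any real TRE.
SCHEMA = {
    "patient_id": "int",
    "date_of_birth": "date",
    "postcode_area": "str",
    "systolic_bp": "float",
}


def dummy_value(kind, rng):
    """Return a random value of the given type, carrying no real information."""
    if kind == "int":
        return rng.randint(1, 10**6)
    if kind == "float":
        return round(rng.uniform(0.0, 200.0), 1)
    if kind == "date":
        return date(1940, 1, 1) + timedelta(days=rng.randint(0, 25000))
    if kind == "str":
        return "".join(rng.choice(string.ascii_uppercase) for _ in range(2))
    raise ValueError(f"unknown column type: {kind}")


def make_dummy_rows(schema, n_rows, seed=0):
    """Build rows with the same columns and types as the schema, using random values."""
    rng = random.Random(seed)  # seeded so the dummy data is reproducible
    return [
        {col: dummy_value(kind, rng) for col, kind in schema.items()}
        for _ in range(n_rows)
    ]


rows = make_dummy_rows(SCHEMA, n_rows=5)
```

Researchers could develop and test analysis code against `rows`, and only reviewed code would then be run on the original data. Note that purely random values do not reproduce missing or unclean data, which is the metadata concern raised in the notes.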

### Roadmap plan

#### Questions

- What would a solution to this problem look like?
- What resources would be needed (people, time, funds, infrastructure etc.)?
- How can this community support you in getting them?
- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively?

#### Notes

- Something along the lines of the OpenSAFELY model could work
- Requires trust in the data providers and researchers
- Limitations of types of data and types of analyses
@@ -19,108 +19,119 @@

### Summary

Sustainability needs to be long term, but how do you plan for it when the scenario may change in 5 years?
There is also an issue with research funding: a TRE is a service, yet funders require teams to appear to be doing something new each time, and they often prefer not to pay for infrastructure (there are also challenges with cost estimates and under- or over-expenditure).

There are several open questions about the charging model: whether TREs should be free at the point of use (with costs distributed against overheads), or whether to use a membership/user model, a project fee model, a tiered model in which standard features are free but high-demand ones are charged, or something else.
In all cases at least some core funding is required to ensure continuity, specialisation and quality.

What we want to ensure is that a public service exists.

### Next Steps

- Create a roadmap that focuses on:
- Technical skillsets
- Information governance requirements
- 10 year funding plan

## Raw notes

SN: Sustainability from funding perspective beyond the initial 5 years
Sustainability from funding perspective beyond the initial 5 years
But what are things going to look like in 5 years' time?

UBT: CL centrally funded model
CL centrally funded model
Service in place, refreshed but need to appear to do something different each time to secure funding.

**Why different?**
How costing then? Free at point of use, cost distributed against overheads.
Constrain in the cloud?

SN: Barts recover work space costs from research projects, distributed central cost on a membership/license/user model
Barts recover work space costs from research projects, distributed central cost on a membership/license/user model
Difference between model for internal and external users.

DPUK: Standard provision free, high storage/compute needs to be recovered
Standard provision free, high storage/compute needs to be recovered
More paperwork to create and chase invoices.

SN: no funders like paying for infrastructure
no funders like paying for infrastructure

What counts as core if it was funded?
Duties imposed as data controllers law, or interpretation runs counter to wants of researchers

ES: Folk specialising, if it doesn't get funded for the future that capability is lost.
Folk specialising, if it doesn't get funded for the future that capability is lost.

SN: Regional SDE model might lead the way of costing-funding-recovery
Regional SDE model might lead the way of costing-funding-recovery

RH: Some central funding
Some central funding

Specialist areas - operational team
Different environments work differently from researcher perspective

Sustain people

JG: Business and operations to use OS TRE safely and securely
Business and operations to use OS TRE safely and securely

SN: what is the perfect TRE/SDE environment future consolidation
what is the perfect TRE/SDE environment future consolidation

SN: Software development can be amortised across the community
Software development can be amortised across the community

SERP tenant

RH: Training component

SN: Who provides desk-side support
Training component

ES: Tracking usage, egress process, layers of tools and processes that need to be in place
Who provides desk-side support

BT: In/out nature of TRE, tiered sensitivity? Commercial sensitivity. Has auditability in the TRE, does it need to be?
Tracking usage, egress process, layers of tools and processes that need to be in place

JG: Why different for UCL TRE?
In/out nature of TRE, tiered sensitivity? Commercial sensitivity. Has auditability in the TRE, does it need to be?
Why different for UCL TRE?

BT: Difference in TRE makes funding case easier, adding something new made it more interesting.
Difference in TRE makes funding case easier, adding something new made it more interesting.

SN: Using research funding to backfill
Using research funding to backfill

A: Estimate in advance what project is likely to use, operational costs, usually completely wrong and go over project
Estimate in advance what project is likely to use, operational costs, usually completely wrong and go over project
Not sustainable to go consistently over budget
Bill after usage is best, but challenging for proposal/funding

BT: Cliff edge, have funding but only sufficient for 1 year not 3 years of project.
Cliff edge, have funding but only sufficient for 1 year not 3 years of project.

JG: Following Access to HPC model what would you put on a
Following Access to HPC model what would you put on a

SN: What can you take off the board if problem is solved strategically
What can you take off the board if problem is solved strategically
Good training for Data scientists: SC like training relevant to disciplines

BT: Seems like we're trying to boil the ocean
Seems like we're trying to boil the ocean
VDI, Excel may be R, Stata
Developing things to deal with core use case

ES: Core capabilities, exceptional stuff is great, but majority, early stage users, standardise and simplify.
Core capabilities, exceptional stuff is great, but majority, early stage users, standardise and simplify.

SN: Whatever it is, what's missing is the ability to understand data. GIGO
Whatever it is, what's missing is the ability to understand data. GIGO

ES: Standardisation of data makes it seem simpler than it is, reproducibility?
Standardisation of data makes it seem simpler than it is, reproducibility?

BT: AI/ML store data for XX years, is it readable in that time?
AI/ML store data for XX years, is it readable in that time?

ES: Who picks up the storage costs for the data?
Who picks up the storage costs for the data?

RH: Guidance
Guidance

JG: How can we make it more transparent
How can we make it more transparent

ES: Constrained with the current model.
Constrained with the current model.

SN: Guidance provided by RCs, institutional risk as the org have underwritten the project.
Guidance provided by RCs, institutional risk as the org have underwritten the project.

---

Round 2

TT:
Concerned about being able to provide a service, don't control budgets
Sustainability of providing a public service, rather than generating a business case

PCN: SNSDE comes under DH budgets, makes things easier
SNSDE comes under DH budgets, makes things easier

TT: HDRUK MRC led 20 year vision 5 year cycle
HDRUK MRC led 20 year vision 5 year cycle
UKBB core underpinning funding
Fund TREs for 3-5 years for specific projects
Specific use cases not currently supported
@@ -130,30 +141,30 @@ Individual researchers and work with them and the RO.
?Provide underpinning capacity?

What is ONS Model?
TT: Free at point of access
Free at point of access
Don't know how the budget is secured
Funding comes through different sources ADR UK
Research proposal, existing staff funding or contracted.
For commercial and public researchers usage has to be for public good, commit to publishing and not for profit
Virtual machines provided some policy for standardising storage/compute available
Trying to enable research

PCN: Driven by what researchers ask for
Driven by what researchers ask for
Intrinsic limit on budget call
Budget for a specific network/platform
Leverage external investment
Some Pharma match funding
Universities also fund

PCN: Move to long term funding
Move to long term funding
Strategic level of funding, buffered from long-term budget
Hub large funding but cliff-edged

PCN: Free at the point of use
Free at the point of use
Incentivised-disincentivised, equity of access
Power users can over-consume, less accountability not having to justify use

TT: consuming data token publication and harvesting data for private use
consuming data token publication and harvesting data for private use
Free at point of access so data is freely accessible
reminder: Don't offer data for commercial use

@@ -167,7 +178,7 @@ All TREs have these issues, share the solutions
More automation - IDS (Integrated Data Service), SRS (Secure Research Service)
Free at point of use?? Cuts out some of the applications automated validation of inputs

PCN: Understand the whole pathway
Understand the whole pathway
Fix one part and it just shows the next bottleneck
Fraunhofer 1/3-1/3-1/3 lights_on-academic-commercial_activity
Sustainability, prime an initiative without committing to long term investment
@@ -176,13 +187,12 @@ More people - more monkeys on typewriters

Over focus on the medical use case currently, needs to rebalance.

PCN:
Better understanding and economy of scale from small numbers.
Focus critical mass on small number
DARE UK would create a TRE to handle data as an offering

What is a TRE?
PCN: At what point does a federated TRE network become a single TRE?
At what point does a federated TRE network become a single TRE?
TT: At the point at which you have seamless transition between TREs?

Trust that the analysis/code is running as intended?