Merge branch 'main' into 20-deliverables
harisood authored Mar 12, 2024
2 parents 2ad0850 + 28e7754 commit 5bd4480
Showing 44 changed files with 2,335 additions and 159 deletions.
8 changes: 7 additions & 1 deletion .github/ISSUE_TEMPLATE/general.yml
@@ -11,14 +11,20 @@ body:
This issue will be added to the backlog of our project board. If you would like to take it on, please assign it to yourself!
All issues need to have someone assigned to/owning them in order to progress. Whilst this is not the case we'll keep the issue safe in the backlog!
- type: textarea
id: detail
attributes:
label: Detail
description: Please explain your issue in detail
validations:
required: true
- type: textarea
id: actions
attributes:
label: Actions
description: Please add any actions to be undertaken below. You can create a checklist by typing `- [ ] x` for each action point
validations:
required: true
- type: textarea
id: tags
attributes:
2 changes: 1 addition & 1 deletion README.md
@@ -19,7 +19,7 @@ This repository is for the UK-TRE community website hosted on Read the Docs http

Anyone can join our mailing list and attend our meetings, you do not need to provide any information other than your email address.

:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1
:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1
:Slack channel: https://ukrse.slack.com/archives/C045ETUPPD0

# :family: Community and Support
2 changes: 1 addition & 1 deletion docs/Makefile
@@ -3,7 +3,7 @@

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXOPTS ?= -W
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build
2 changes: 1 addition & 1 deletion docs/_templates/footer-links.html
@@ -1,4 +1,4 @@
<a href="https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1"
<a href="https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1"
><i class="fa-solid fa-envelope"></i> Mailing list</a
><br />
<a href="https://ukrse.slack.com/archives/C045ETUPPD0"
2 changes: 1 addition & 1 deletion docs/background/index.md
@@ -8,5 +8,5 @@ UK-TRE aims to encourage open collaborations and sharing of innovative ideas to

Anyone can join our mailing list and attend our meetings, you do not need to provide any information other than your email address.

:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=RSE-TRE-COMM&A=1
:Mailing list: https://www.jiscmail.ac.uk/cgi-bin/wa-jisc.exe?SUBED1=UK-TRE-COMM&A=1
:Slack channel: https://ukrse.slack.com/archives/C045ETUPPD0
18 changes: 15 additions & 3 deletions docs/conf.py
@@ -25,6 +25,8 @@
myst_enable_extensions = ["fieldlist", "linkify"]
myst_linkify_fuzzy_links = False

# Automatically create anchors for in-page headings up to level 4
myst_heading_anchors = 4

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
@@ -42,6 +44,16 @@
}

# -- Link checker configuration
# https://www.swansea.ac.uk/the-university/location/#bay-campus=is-expanded
# is a JavaScript only anchor
linkcheck_anchors_ignore = ["bay-campus=is-expanded"]

linkcheck_ignore = [
# GitHub CI linkchecker seems to be blocked
r"https://www.turing.ac.uk/.*",
r"https://www.hpe.com/.*",
r"https://csrc.nist.gov/.*",
]

# These pages use in-page JavaScript anchors which aren't seen by the link checker
linkcheck_anchors_ignore_for_url = [
r"https://www\.swansea\.ac\.uk/the-university/location/",
r"https://arewemeetingyet\.com/.+",
]
26 changes: 24 additions & 2 deletions docs/events/index.md
@@ -1,9 +1,31 @@
# Events

```{raw} html
<iframe
src="https://teamup.com/ksbqmbhymxdbu454aw?view=mw8&showLogo=0&showProfileAndInfo=0&showSidepanel=1&disableSidepanel=1&showMenu=1&showAgendaHeader=1&showAgendaDetails=0&showYearViewHeader=1"
style="width: 100%; height: 800px; border: 1px solid #cccccc"
loading="lazy"
frameborder="0">
</iframe>
<div style="text-align:right"><a href="https://teamup.com/ksbqmbhymxdbu454aw" target="_blank">Open in new window</a></div>
```

You can subscribe to the calendar, e.g. in Outlook or Google Calendar, by importing [this ICS link](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/0.ics) to your calendar.

You can also subscribe to a subset of events:

- [Official events](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13011531.ics)
- [Other TRE events](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13011530.ics)
- [WG - Community management](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13014371.ics)
- [Working groups](https://ics.teamup.com/feed/ksbqmbhymxdbu454aw/13014372.ics)

Here you'll find events the community is organising and engaged in. You can also find reports on past events hosted by the community, as well as a schedule of upcoming events and information on how to get involved.

```{toctree}
:maxdepth: 2
wg_workshops/index
```

Here you'll find events the community is organising and engaged in. You can also find reports on past events hosted by the community, as well as a schedule of upcoming events and information on how to get involved.
1 change: 0 additions & 1 deletion docs/events/wg_workshops/2023-03-29-march-meeting/index.md
@@ -229,7 +229,6 @@ The group discussed first the feasibility of a common language around risk and t

- [Alan Turing Institute](https://arxiv.org/pdf/1908.08737.pdf)
- Sheffield used this as the basis of their system for assessing risk.
- [NIST RMF](https://csrc.nist.gov/projects/risk-management/about-rmf)
- [NCSC](https://www.ncsc.gov.uk/collection/risk-management)
- [Harvard DataTags](https://github.com/IQSS/DataTaggingLibrary)
- [UK Data Service data types](https://ukdataservice.ac.uk/help/access-policy/types-of-data-access/)
@@ -19,7 +19,6 @@ _Chair: Will Crocombe (RISG Consulting)_
- 3 - weak pseudo
- 4 - public
- Dropping down tiers, things become easier. Turing paper on this - Sheffield used this as the basis of their system for assessing risk.
- https://zenodo.org/record/7754459
- [Alan Turing Institute paper](https://arxiv.org/pdf/1908.08737.pdf)
- Importance of agreed risk classification with federation, and agreement on risk appetite
- [NIST RMF](https://csrc.nist.gov/projects/risk-management/about-rmf)
@@ -0,0 +1,75 @@
# Current state of the art re data linkage/federation/AI&ML&LLM across infrastructures: federation, governance, safe output methods

## Overview

### Summary

Issues about federation of datasets were discussed, including identifying different datasets across multiple systems, how to collect identifiable information robustly, and how we can link up different approaches across the 4 nations effectively.

There was further discussion on how to effectively check ML models within TREs.

In the case of governance, it was suggested that a project working across multiple TREs should have one singular governance process.

### Next steps

- Create a 'panel' focused on a specific type of data/research (e.g. health, crime, financial) who can oversee specific research projects within these fields

## Raw notes

### Data Linkage

#### How do you go about the NHS Number?

- Uses NHS Standard NF5, after 3 they went to manual to track through the system.
- Issues with health and non-health data

#### Names such as Dave / David can cause problems.

- Linksmart is a solution for this.
- Collecting Crime Data
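
A minimal sketch of the nickname problem noted above, assuming a small hypothetical lookup table. The table, the function names, and the matching rule are illustrative assumptions only; the notes do not describe how Linksmart works, so this does not represent it.

```python
# Hypothetical nickname table: a real linkage service would need a much
# larger, curated lookup. These entries are illustrative only.
NICKNAMES = {"dave": "david", "debs": "debra", "liz": "elizabeth"}


def canonical_forename(forename: str) -> str:
    """Trim, lower-case, and map a known nickname to its canonical form."""
    cleaned = forename.strip().lower()
    return NICKNAMES.get(cleaned, cleaned)


def same_person_candidate(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """Compare two (forename, surname) pairs after normalisation."""
    key_a = (canonical_forename(a[0]), a[1].strip().lower())
    key_b = (canonical_forename(b[0]), b[1].strip().lower())
    return key_a == key_b


# "Dave Smith" and "David Smith" are treated as a candidate match.
dave_matches_david = same_person_candidate(("Dave", "Smith"), ("David", "Smith"))
```

A match here is only a candidate: two different people can still share a normalised name, so other fields would be needed to confirm the link.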

#### Scotland's Approach

- a national ID number

### Federation between datasets

- Identifying with confidence across TREs is important
- Problem: Linking health with something else is problematic to match up and link it with addresses and names
- Separation functions
- Person has all the identifying information, but they do not have the data
- TREs communications between each other need specific criteria, Scotland has 5 TREs
- Having more than two, and introducing a central one is a possibility
- Issues with identifying A-B data sets across multiple systems
- Seeding Death Data -- David and Debra Smith: D. Smith & D. Smith causes gender incompatibility issues
- National Drug Treatment Data -- At source they only collected initials 'D.S.', Gender and MM/YYYY of DOB. Deidentifying can cause linking problems. Education to non-education where they don't have their common 'number' -- how confident can we be that Participant A is the same participant in another TRE? If you're not sharing names & addresses
- Bringing in NHS data and also pseudonymising it -- how can you work with it without a key?
- Once you have a data linkage -- bringing the different data types into a data set (TRE). E.g. Linking mental health data and shopping data, if you anonymise that and have their own key -- they can do it anonymously for external sources
- Education data between England, Scotland and Wales might use different notations
- Residential Data can be used as a key
- 'E-child' trying to link the NHS with the Department of Education
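
Two of the points above can be illustrated with a short sketch: initials plus month/year of birth are too weak to tell David and Debra Smith apart, and a keyed hash lets two TREs derive the same pseudonym for a person without exchanging names. The function names, the identifier format, and the secret are hypothetical, and HMAC-SHA256 is one possible technique rather than something the discussion specified.

```python
import hashlib
import hmac


def weak_key(forename: str, surname: str, dob_mm_yyyy: str) -> str:
    """Match key built only from initials and month/year of birth."""
    return f"{forename[0]}{surname[0]}|{dob_mm_yyyy}".upper()


def pseudonym(identifier: str, secret: bytes) -> str:
    """Keyed hash (HMAC-SHA256): the same secret always gives the same token."""
    return hmac.new(secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Two different people collapse onto the same weak key.
david = weak_key("David", "Smith", "04/1980")
debra = weak_key("Debra", "Smith", "04/1980")
collision = david == debra

# Hypothetical secret held only by a linkage service, not by researchers.
shared_secret = b"example-secret-held-by-linkage-service"

# Two TREs that receive tokens made with the same secret can join on the
# token alone, without either one seeing the underlying identifier.
token_tre_a = pseudonym("ID-0001", shared_secret)
token_tre_b = pseudonym("ID-0001", shared_secret)
```

This mirrors the separation of functions noted above: whoever holds the secret and the identifiers does not need the research data, and the TREs holding the data never see the identifiers.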

### AI & ML

- People confuse the terms AI & ML with 'Statistical Modeling'
- Based on risk factors you can determine 70% precision pre-diabetic chance
- Accessing 'clinical like data' with similar terminology to mimic clinic systems
- AI -- Offline AI: you can have an offline machine learning model -- yes
- Would multiple AIs learn the same thing on same data sets? -- no
- You can make it work with a shared API though (Stroke Prediction)
- APRs -- 8-9 expensive centre
- Different type of interpretation of ML, ML data on health 'takes your job', ML data on other scenarios might be socially acceptable
- Pattern finding models are popular and precise, this is lacking in statistical modeling
- At the end of the day, it is not understood why ML on medical data gives the results it does
- Checking models is problematic and difficult; uncertain results and unknown model contents raise the question of the model's authenticity

### Governance

- Process is repeated a lot; the committees are separate entities and do not talk to each other
- Cannot start work unless approved
- For a project between TREs, each TRE has its own approval process; ideally a multi-TRE project would require a single approval process, with that decision accepted by the other TREs

#### What would a solution to this problem look like?

- Current state of the art is the overarching question -- needs a TRE panel to decide what is state of the art
- Single 'panel' on a specialty (e.g. health, crime) who deal with specific projects, additionally members of the national TRE supervision
@@ -0,0 +1,44 @@
# Cloud vs on-prem TREs: costs, constraints, pros & cons

## Overview

### Summary

The main decision drivers are security and cost.
Cloud is more flexible for projects with different funding sources and does not require an expensive data centre for research institutions but does not offer the highest levels of security.

A potential solution is a hybrid model where you get a cloud-like infrastructure on an on-prem compute.

Cloud provision via Jisc (as opposed to direct with the cloud provider) can be cheaper and it also handles SSO: https://www.jisc.ac.uk/forms/uk-access-management-federation-sign-up#
Resources: Google RADLab: https://cloud.google.com/blog/topics/public-sector/googles-new-rad-lab-solution-helps-spin-cloud-projects-quickly-and-compliantly

### Next steps

- Develop a roadmap plan for a hybrid, cloud-agnostic model

## Raw Notes

- Compute capacity/ data centres for advanced ML projects is expensive for research institutions
- Credits make it easier to use cloud for projects with different funding sources
- Could a good solution be a hybrid model where you get a cloud-like infrastructure on an on-prem compute
- So could be completely disconnected from internet for high security
- Google have set something like this up at Sanger
- Factors determining on-prem vs cloud
- security
- cost
- Cloud provision via Jisc (as opposed to direct with the cloud provider) can be cheaper and it also handles SSO: https://www.jisc.ac.uk/forms/uk-access-management-federation-sign-up#
- Resources: Google RADLab: https://cloud.google.com/blog/topics/public-sector/googles-new-rad-lab-solution-helps-spin-cloud-projects-quickly-and-compliantly

### Roadmap plan

#### Questions

- What would a solution to this problem look like?
- What resources would be needed (people, time, funds, infrastructure etc.)?
- How can this community support you in getting them?
- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively?

#### Notes

- hybrid model (see above)
- Solution that is cloud-agnostic and could also run on on-prem hardware
@@ -0,0 +1,88 @@
# Governance of the UK TRE Community

## Overview

### Summary

The discussion centred on the purpose and governance of the community, trying to strike a balance between acting as a convenor and still providing enough content and direction not to be an “empty” place.

Unique selling point (USP) of UK-TRE: Diversity of audience, and pragmatism: people that are doing something.
Danger of just listening is you don’t share your existing knowledge of what will/won’t work.

Should we put out position statements? Say things if you don’t like something? The community should reach a point where what we say is respected.
More powerful than individual submissions.

What should UK-TRE do?
Be careful not to become just a bureaucratic institution that has some funding, people, writes reports.

Maybe a network that feeds up to DARE/HDR/ADR?
USP would be it’s practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off.
Proper focus groups would be much more expensive.

Some funding for the community to organise meetings like this is needed.

### Next steps

- Secure funding for person time for the community
- Establish a steering group for the community

## Raw Notes

- UK-TRE: Aims, purposes, should it take on a political/advocacy role?
- NHS: already have their plans for Governance
- but looking promising so far
- Datapact: Part of Data saves lives policy
- Not policy, but saying how NHS will treat your data
- Don't want to force too much information on public: they'll think you're trying to hide something
- Public engagement: not just telling them what will happen, instead enable citizens to make policy decisions
- Interest in academia about what to do, waiting for NHS to give guidance
- Should UK-TRE lead, not just follow the NHS?
- Lead, provide input
- TREs are for much more than just healthcare data which NHS focusses on
- Unique selling point (USP) of UK-TRE: Diversity of audience, and pragmatism: people that are doing something
- Danger of just listening is you don't share your existing knowledge of what will/won't work
- Should we put out position statements? Say things if you don't like something? The community should reach a point where what we say is respected. More powerful than individual submissions.
- Industry groups such as ABPI, BIO
- Provide inputs, write reports, represent a community and a voice
- Organisations need to sign up to show support
- Sign-up to UK-TRE? Or to position statements created by UK-TRE?
- E.g. IET (engineering professional institution) members can say what they're interested in on their profile. IET may respond to a Government consultation by asking members for input, and collating responses.
- Working groups/focus areas
- Needs resource/funding
- Does UKRI have something?
- Beyond UKRI, commercial?
- GA4GH:
- multiple levels of slices of funding
- 100s of organisations across 80 countries
- What should UK-TRE do?
- Be careful not to become just a bureaucratic institution that has some funding, people, writes reports.
- Balance
- Maybe a network that feeds up to DARE/HDR/ADR?
- USP would be it's practical, diverse, not duplicative, ideal audience for people at top to bounce ideas off
- Proper focus groups would be much more expensive
- Some funding for community to organise meetings like this

### Roadmap plan

#### Questions

- What would a solution to this problem look like?
- Ensure meetings remain attractive, not too officious
- Lightning talks good, reduces duplication
- Networking opportunities
- Long lunch
- People willing to invest time to travel
- "Stir people up and let them go"
- Beach! 🏖️
- No different from what we've got now
- More recognisable branding
- A home? What does "home" mean?
- A formal recognisable figurehead
- What resources would be needed (people, time, funds, infrastructure etc.)?
- Funding for someone to be a formal chair of UK-TRE
- Neutral funding for someone to run community, not funded directly by a single institution
- Maybe multiple people? E.g. coordinator, chair, community manager (junior/senior?), technical?
- Elected chairs to propose direction/funding? Probably too much.
- Instead have a steering committee
- How can this community support you in getting them?
- What working groups/orgs are already working on this, if any? How can we collaborate with them effectively?
@@ -0,0 +1,20 @@
# Addressing data harmonisation between different datasets: do TREs have a role?

## Raw notes

### Handwritten notes

Transcribed by CMWG team

Data+Analysis=Timely Processing

- Harmonized/OMOPed
- TRE governance barriers
- Reliability: validated?
- TRE role: cross-project sharing
- OMOPing data sources & adding TRE-specific terms into main repositories
- Mapping tools
- TREs can delegate (CoConnect)
- Discovery
- Feasibility
- Clinical input