Commit ed40a61

updates to landing page
1 parent 37b0d46 commit ed40a61

6 files changed

+34
-35
lines changed

docs/README.md

+20-15
@@ -1,27 +1,32 @@
-# Refget specifications
 
-## What is refget?
+![GA4GH logo](img/ga4gh-logo.png){ width="300" align=right }
 
-Refget is a protocol for identifying and distributing reference biological sequences.
-It currently consists of 2 standards:
 
-1. [Refget sequences](sequences.md): a GA4GH-approved standard for individual sequences
-2. [Refget sequence collections](seqcol.md): a standard for collections of sequences, under review
+# Refget specifications
 
-<img src="img/seqcol_abstract_simple.svg" alt="Refget abstract" class="img-responsive">
+## What is refget?
 
+Refget is a set of GA4GH standards for identifying and distributing reference biological sequences.
+It consists of these standards:
 
-## What is the refget sequences standard?
 
-The original refget standard, now called *Refget sequences*, handles sequences only.
-Refget sequences enables access to reference sequences using an identifier derived from the sequence itself.
+| Standard | Description | Status |
+| ----------- | ------------------------------------ | |
+| [Refget sequences](sequences.md) | For individual sequences | :white_check_mark: v1.0 Approved in 2021 <br>:white_check_mark:&nbsp;v2.0&nbsp;Approved in 2024 |
+| [Refget sequence collections](seqcol.md) | For collections of sequences | :white_check_mark: v1.0 Approved in 2025 |
+| Refget pangenomes | For collections of sequence collections | :fontawesome-solid-gears: Currently in process |
 
+## What is the main purpose of the refget project?
 
-## What is the refget sequence collections standard?
+Refget standards help to **identify**, **retrieve**, and **compare** reference sequences, like a reference genome. Key principles include:
 
-*Refget sequence collections*, or `seqcol` for short, standardizes unique identifiers for collections of sequences. Seqcol identifiers can be used to identify genomes, transcriptomes, or proteomes -- anything that can be represented as a collection of sequences. The seqcol protocol provides:
+- Reference data, including sequences and collections of sequences, are identified using cryptographic digest-based identifiers that are **derived from the data itself**. This allows reference data to be identified without requiring a centralized accessioning authority.
+- Refget standards can be used for any type of sequences: DNA, RNA, protein, etc -- anything that can be represented as a string of characters.
+- Refget standards also specify **retrieval APIs**, providing a mechanism for retrieving a sequence or collection if you have its identifier.
+- Refget sequence collections also provides a programmatic approach to assessing compatibility among sequence collections.
 
-- implementations of an algorithm for computing sequence identifiers;
-- a lookup service to retrieve sequences given a seqcol identifier
-- programmatic approach to assessing compatibility among sequence collections.
+This image shows how the Refget Sequences standard is used by the Sequence Collections standard. First, sequences are digested to yield a deterministic identifier. These sequence identifiers are then used, together with their names, to create an identifier for a collection.
 
+<figure>
+<img src="img/seqcol_abstract_simple.svg" alt="Refget abstract" class="img-responsive">
+</figure>
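
Reviewer note: the two-step digest flow described in the new README text (digest each sequence, then digest the sequence identifiers together with their names) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the normative algorithm. The `sha512t24u` digest and the `SQ.` prefix follow Refget Sequences v2; the helper names `sequence_digest` and `collection_digest` are hypothetical, the uppercase normalisation is assumed, and the collection step is deliberately simplified to names and sequences only, with `json.dumps` standing in for proper JSON canonicalisation. See `seqcol.md` for the actual procedure.

```python
import base64
import hashlib
import json


def sha512t24u(data: bytes) -> str:
    """SHA-512, truncated to the first 24 bytes, base64url-encoded."""
    truncated = hashlib.sha512(data).digest()[:24]
    return base64.urlsafe_b64encode(truncated).decode("ascii")


def sequence_digest(sequence: str) -> str:
    """Hypothetical helper: an identifier derived from the sequence itself.

    Assumes the sequence is normalised to uppercase before digesting.
    """
    return "SQ." + sha512t24u(sequence.upper().encode("ascii"))


def collection_digest(named_sequences: dict[str, str]) -> str:
    """Simplified, non-normative sketch of a collection identifier.

    Digests the list of names and the list of sequence identifiers
    separately, then digests the object holding those two digests.
    """

    def digest_json(value) -> str:
        # Compact, key-sorted JSON as a stand-in for canonical JSON.
        canonical = json.dumps(value, separators=(",", ":"), sort_keys=True)
        return sha512t24u(canonical.encode("utf-8"))

    names = list(named_sequences)
    sequences = [sequence_digest(s) for s in named_sequences.values()]
    return digest_json(
        {"names": digest_json(names), "sequences": digest_json(sequences)}
    )


if __name__ == "__main__":
    toy = {"chr1": "ACGT", "chr2": "GGCC"}  # toy data, not a real genome
    print(sequence_digest("ACGT"))
    print(collection_digest(toy))
```

Because both identifiers depend only on the content, two parties who hold the same data compute the same identifier independently, which is why no central accessioning authority is needed.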
File renamed without changes.

docs/img/ga4gh-logo.png

32.1 KB

docs/seqcol.md

+7-18
@@ -1,29 +1,14 @@
 ---
-title: Seqcol specification version 0.1.0
+title: Refget Sequence Collections v1.0.0
 ---
 
-<!-- Table of contents:
-* The generated Toc will be an unordered list
-{:toc} -->
-
-# Seqcol specification version 0.1.0
-
-<!-- Table of contents:
-
-[TOC] -->
-
-## Specification version
-
-This specification is in **DRAFT** form. This is **NOT YET AN APPROVED GA4GH specification**. This document is **formal technical explanation for implementers**. See also:
-
-- [Architectural decision record](decision_record.md), a chronological record of spec decisions.
-- [Sequence collection rationale](seqcol_rationale.md), motivation for our major design decisions.
+# Refget Sequence Collections v1.0.0
 
 ## Introduction
 
 Reference sequences are fundamental to genomic analysis.
 To make their analysis reproducible and efficient, we require tools that can identify, store, retrieve, and compare reference sequences.
-The primary goal of the *Sequence Collections* (seqcol) project is **to standardize identifiers for collections of sequences**.
+The primary goal of the *Refget Sequence Collections* (seqcol) project is **to standardize identifiers for collections of sequences**.
 Seqcol can be used to identify genomes, transcriptomes, or proteomes -- anything that can be represented as a collection of sequences.
 A common example and primary use case of sequence collections is for a reference genome, so this documentation sometimes refers to reference genomes for convenience; really, it can be applied to any collection of sequences.
 
@@ -66,6 +51,10 @@ Building on refget, the sequence collections specification introduces foundation
 - **Genome browser integration**: *As a genome browser, I use one sequence collection for the displayed coordinate system and want to check if a digest representing a given BED file's coordinate system is compatible with it.*
 - **Annotating unknown references**: *As a data processor, I encounter input data without reference genome information and want to generate a sequence collection digest to attach, enabling further processing with seqcol features.*
 
+## Architectural decision record
+
+For a chronological record of decisions related to this specification, see the [Architectural decision record](decision_record.md).
+
 ## Definitions of key terms
 
 ### General terms

docs/sequences.md

+1-1
@@ -4,7 +4,7 @@ title: refget protocol
 suppress_footer: true
 ---
 
-# Refget API Specification v2.0.0
+# Refget Sequences v2.0.0
 
 ## Introduction

mkdocs.yml

+6-1
@@ -34,7 +34,7 @@ navbar:
 href: contributing
 
 theme:
-logo: img/seqcol_logo.svg
+logo: img/ga4gh-logo-dark-bg.png
 favicon: img/seqcol_logo.svg
 name: material
 
@@ -54,9 +54,14 @@ extra_css:
 - stylesheets/extra.css
 
 markdown_extensions:
+- attr_list
+- md_in_html
 - admonition
 - pymdownx.highlight:
 use_pygments: true
+- pymdownx.emoji:
+emoji_index: !!python/name:material.extensions.emoji.twemoji
+emoji_generator: !!python/name:material.extensions.emoji.to_svg
 - pymdownx.superfences:
 custom_fences:
 - name: mermaid
