Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cram: Record counter field description is possibly off-by-one #807

Open
zaeleus opened this issue Jan 31, 2025 · 1 comment · May be fixed by #822
Open

cram: Record counter field description is possibly off-by-one #807

zaeleus opened this issue Jan 31, 2025 · 1 comment · May be fixed by #822
Labels

Comments

@zaeleus
Copy link

zaeleus commented Jan 31, 2025

This is in regard to CRAM format specification (version 3.1) (2024-09-04).

The record counter field in both the container (§ 7 "Container header structure") and slice headers (§ 8.5 "Slice header block") is described to be a "1-based sequential index of records in the file/stream."

However, current implementations write 0-based values, e.g., htslib:

$ samtools --version | head -2
samtools 1.21
Using htslib 1.21

$ (
samtools view --output-fmt cram --output-fmt-option seqs_per_slice=2 <<EOF
@HD	VN:1.6
r1	4	*	0	255	*	*	0	0	NNNN	!!!!
r2	4	*	0	255	*	*	0	0	NNNN	!!!!
r3	4	*	0	255	*	*	0	0	NNNN	!!!!
EOF
) | cram_dump - | grep "Rec counter"
    Rec counter:       0
        Rec counter      0
    Rec counter:       2
        Rec counter      2
    Rec counter:       0

I would expect the record counters to be 1 and 3 if they were to be 1-based. Is the description in the spec incorrect, or is the field missing a more thorough explanation on how its supposed to be tracked?

@zaeleus zaeleus changed the title cram: Questionable record counter field description cram: Record counter field description is possibly off-by-one Feb 4, 2025
@jkbonfield
Copy link
Contributor

jkbonfield commented Mar 25, 2025

It looks like my htsjdk produced files start counting from 0 too. It's probably where I copied the htslib implementation (via scramble).

Thanks. Will fix (the spec).

jkbonfield added a commit to jkbonfield/hts-specs that referenced this issue Mar 25, 2025
This matches io_lib, htslib and htsjdk's implementations, none of
which adhered to the documented 1-based coordinates.

Fixes samtools#807
@jkbonfield jkbonfield linked a pull request Mar 25, 2025 that will close this issue
@jkbonfield jkbonfield moved this to New items in GA4GH File Formats Mar 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: New items
Development

Successfully merging a pull request may close this issue.

2 participants