@@ -4,6 +4,184 @@ Below are the release notes for the full RTG suite, upon which
  RTG.VERSION is based. Not all features described below may be included
  in this product.
 
+
+ RTG Core 3.10 (2018-10-29)
+ --------------------------
+
+ This release primarily contains smaller improvements and bugfixes.
+ Several of these result in changes to command line arguments or
+ program outputs, so check existing scripts for compatibility before
+ upgrading. Larger features of note:
+
+ * Several improvements to simulation tools. In particular, a new command
+   pedsamplesim has been included that makes it very easy to simulate
+   multiple samples at once, given a pedigree file. pedsamplesim
+   automatically simulates founder individuals, inheritance by children,
+   and de novo mutations.
+
+ * Java 11 compatibility testing. RTG is compatible with Java 11,
+   although currently we recommend Java 8 for performance reasons. Also
+   note that due to differences in Java Math library implementation after
+   Java 8, in rare situations minor output differences may be observed
+   when comparing results obtained using Java 8 with later Java versions.
+   Builds that include a bundled JRE have been updated to the latest JRE
+   8u181.
+
+ There have been many other minor improvements and feature
+ additions. Detailed changes are listed below by area. For more
+ information on new features, see the RTG Operations Manual.
+
+ ### Basic Formatting and Mapping
+
+ * petrim: Now outputs read length distribution statistics.
+
+ * petrim: Fixed an incorrect filename extension being used for fragment
+   and overlap length distribution output files.
+
+ * map: Now allows the use of both --repeat-freq and
+   --blacklist-threshold at the same time.
+
+ * map: Unmapped but placed reads have had minor adjustments made to
+   their expected mapping position. As well as causing changes to BAM
+   annotations, this can cause subsequent changes to variant calling
+   annotations (such as AVR scores).
+
+ * map: Fix a rare crash that could occur when mapping a male sample. This
+   fix can similarly cause some changes to subsequent variant calling.
+
+ * sammerge: New flag --min-read-length to permit filtering out
+   alignments where the read length is below the specified threshold.
+
+ * sammerge: New flag --select-read-group to include only alignments from
+   the specified read groups.
+
+ * sammerge: New flag --remove-duplicates to detect and remove duplicate
+   reads based on mapping position. This is like the duplicate detection
+   that the analysis tools such as variant callers normally perform on
+   the fly.
+
+ * sammerge: Supports --Xforce to allow overwriting existing output
+   files.
+
+ * sdfsubset/sdfsplit: These commands now pass SAM read group information
+   from the input SDF to the output SDF.
+
+
+ ### Variant Calling
+
+ * variant callers: The GT fields for unphased calls are now in a
+   normalized (numerically increasing) format. Previously the choice of
+   allele ordering for alleles within a GT field was somewhat arbitrary,
+   giving the impression of some significance where there was none.
+
+ * variant callers: Population variants loaded via --population-priors
+   are only used to refine complex call regions when the non-reference
+   allele fractions for the variant are higher than 1%. Previously the
+   use of a population priors source such as gnomAD that includes many
+   rare variants could lead to reduced sensitivity.
+
+ * variant callers: Improved the ability to identify candidate local
+   haplotypes when jointly calling a large number of samples or where
+   there is wide variation in coverage between samples. The effect of
+   this is greater sensitivity to rare variants such as singletons and de
+   novo variants.
+
+ * variant callers: Ignore SAM records where the reads have zero length.
+
+ * many: Fixed an issue where region-based SAM/BAM record retrieval could
+   sometimes skip records in the case of a small inter-region gap.
+
+ * segment: The --min-panel-coverage option has been renamed to
+   --min-norm-control-coverage (with extended functionality).
+
+ * avrbuild: New flag --annotated that allows supplying positive/negative
+   labels via annotations on each VCF record, as an alternative to
+   supplying separate positive and negative VCFs. The supported
+   annotation format is the same as that produced by vcfeval
+   --output-mode=annotate.
+
+ * avrbuild: New flag --bed-regions to only read those training instances
+   that overlap the specified regions. This is a convenience method that
+   can be used to train on a specific subset of the data.
+
+
+ ### Variant Processing and Analysis
+
+ * svdecompose: Fixed a crash caused by records where SVTYPE=INS but
+   where the record did not also contain an SVLEN annotation. These
+   records are now ignored.
+
+ * vcfdecompose: Fixed a crash on records that did not contain a GT
+   format field. This also affected vcfeval when using --decompose. In
+   addition, the error reporting for records with invalid GT fields has
+   been improved.
+
+ * many: Clearer error handling for VCF records that are invalid due to
+   extra TABs.
+
+ * rocplot: Move the legend for precision/sensitivity graphs to the
+   left-hand side, where it is less likely to obstruct the curves
+   themselves.
+
+ * vcfannotate: Change in matching semantics when annotating with
+   IDs. Now uses the span of the record rather than just the start
+   position.
+
+ * many: New derived annotation VAF1 that contains the VAF of the most
+   frequent alt allele. Being a single value annotation, it can be easily
+   used during AVR model building.
+
+ * vcfmerge: Fix a crash that could occur when trying to merge a record
+   containing duplicated alleles.
+
+
+ ### Other
+
+ * samplesim: Changed the behaviour when simulating from VCF records
+   without an AF annotation. Now these variants are ignored (i.e. never
+   selected for use by the sample); previously samplesim would treat all
+   alleles as equally likely. The old behaviour is available via the new
+   flag --allow-missing-af.
+
+ * childsim: The misleadingly named flag --num-crossovers has been
+   renamed to --extra-crossovers.
+
+ * denovosim: Now allows the original and derived sample names to be the
+   same, in which case the sample in the output VCF is updated rather
+   than creating a new sample column.
+
+ * denovosim: No longer sets the DN flag to "N" for samples not receiving
+   the de novo mutation, as in multi-sample simulation scenarios this is
+   not a reliable indicator.
+
+ * denovosim: Fix bug when determining if a putative de novo site would
+   overlap with pre-existing variants.
+
+ * pedsamplesim: New command that allows simulating several samples in
+   one run according to a pedigree. This uses the methods of samplesim,
+   denovosim, and childsim to greatly ease the simulation of multiple
+   samples.
+
+ * pedstats: New flag --delimiter that can be used to output sample
+   identifiers with an alternative delimiter. For example, use comma as a
+   delimiter when directly supplying a sample list to vcfsubset
+   --keep-samples.
+
+ * simulation tools: Most commands now support --Xforce to overwrite
+   existing files.
+
+ * simulation tools: Improvements have been made to parameter validation.
+
+ * misc: Updates for compatibility with Java 11. However, for performance
+   reasons we recommend using Java 8 for computationally intensive
+   analysis such as mapping and variant calling.
+
+ * misc: Update bundled JRE to 1.8.0_181.
+
+ * misc: Improved percentage memory allocation behaviour when total
+   system memory cannot be determined. Will now fall back to Java
+   default memory allocation.
+
+
  RTG Core 3.9.1 (2018-05-29)
  ---------------------------
 
@@ -434,7 +612,7 @@ Major features of this release:
  simulation of population-level variants (popsim), individual sample
  genomes using population variants (samplesim), simulation of samples
  as member of a pedigree obeying inheritance rules (childsim),
- simulation of de-novo variants (denovosom ), generation of a genome
+ simulation of de-novo variants (denovosim ), generation of a genome
  given a VCF of sample variants (samplereplay), and read simulation
  according to a range of sequencer parameters (readsim/cgsim).
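
The 3.10 Variant Calling notes above say that GT fields for unphased calls are now written in numerically increasing order. A minimal Python sketch of that kind of normalization follows. The function name `normalize_gt` and the decision to place missing alleles (`.`) last are assumptions made for this illustration; this is not RTG's implementation.

```python
def normalize_gt(gt: str) -> str:
    """Return an unphased GT with allele indices in increasing order.

    Phased genotypes (using '|') are returned unchanged, because their
    allele order carries meaning.
    """
    if "|" in gt or "/" not in gt:
        return gt  # phased, haploid, or a single missing value
    alleles = gt.split("/")
    # Assumption: numeric alleles sort first, missing ('.') alleles go last.
    known = sorted((a for a in alleles if a != "."), key=int)
    missing = [a for a in alleles if a == "."]
    return "/".join(known + missing)
```

With this sketch, `normalize_gt("1/0")` gives `0/1`, while a phased call such as `1|0` is left as it is.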
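
The VAF1 annotation introduced in 3.10 is described above as the VAF of the most frequent alt allele. One way to picture such a value is the sketch below, which assumes that the VAF of an allele is its depth divided by the total depth across all alleles, taken from an AD-style list ordered as reference first and then each alt. Both that formula and the function name `vaf1` are assumptions for illustration; the release notes do not state how RTG computes the value.

```python
from typing import Optional, Sequence


def vaf1(allelic_depths: Sequence[int]) -> Optional[float]:
    """VAF of the most frequent alt allele, or None if it is undefined.

    allelic_depths is ordered as [ref, alt1, alt2, ...], as in a VCF AD field.
    """
    total = sum(allelic_depths)
    alt_depths = allelic_depths[1:]
    if total == 0 or not alt_depths:
        return None  # no coverage, or no alt alleles to report on
    # Assumption: VAF = depth of the allele / total depth over all alleles.
    return max(alt_depths) / total
```

Because the result is a single number per record rather than one value per alt allele, it is easy to pass to downstream tooling that expects scalar attributes, which matches the motivation given in the note.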