@@ -91,8 +91,7 @@ the following normalization rules apply:
  and `Alternate Allele Sequence`.
 
  #. one is empty, the input Allele is an insertion (empty `reference
- sequence`) or a deletion (empty `alternate sequence`). Store the length
- of the non-empty sequence: this is the `Repeat Subunit Length`. Continue to
+ sequence`) or a deletion (empty `alternate sequence`). Continue to
  step 3.
 
  #. Determine bounds of ambiguity.
@@ -112,12 +111,35 @@ the following normalization rules apply:
 
  #. Construct a new Allele covering the entire region of ambiguity.
 
- a. If the `reference sequence` is empty, this is an unambiguous
- insertion. Return a new `Allele` with the trimmed `alternate
- sequence` as a `Literal Sequence Expression`.
+ a. If the expanded `Reference Allele Sequence` is empty, this is an unambiguous insertion.
+ Return a new `Allele` with the trimmed `Alternate Allele Sequence` as a `Literal
+ Sequence Expression`.
 
- #. Otherwise, return a new `Allele` using a `reference length
- expression`, using a `Location` specified by the coordinates
+ #. Otherwise, find the greatest common divisor of the lengths of the expanded `Reference
+ Allele Sequence` and the expanded `Alternate Allele Sequence`. This is the `repeat subunit length`.
+
+ #. If the Allele is a deletion (the `Alternate Allele Sequence` is shorter than the
+ `Reference Allele Sequence`) return a new Allele using a `Location` specified by the coordinates
+ of the `left_roll_bound` and `right_roll_bound`, a `length` specified by the length of the
+ `Alternate Allele Sequence`, and a `repeat subunit length` as calculated in the prior step.
+
+ #. If the Allele is an insertion (the `Reference Allele Sequence` is shorter than the
+ `Alternate Allele Sequence`), check that the first `repeat subunit length` number of characters
+ of the `Reference Allele Sequence` can be cycled to reconstruct the `Alternate Allele Sequence`.
+
+ 1. If so, return a new Allele using a `Location` specified by the coordinates of the `left_roll_bound`
+ and `right_roll_bound`, and a `Reference Length Expression` with a `length` specified by the length
+ of the `Alternate Allele Sequence`, and a `repeat subunit length` as previously calculated.
+
+ #. If not, return a new Allele using a `Location` specified by the coordinates of the `left_roll_bound`
+ and `right_roll_bound`, and a `Literal Sequence Expression` with the expanded `Alternate Allele Sequence`.
+
+
+ return a new Allele using a `Location` specified by the coordinates
+ of the `left_roll_bound` and `right_roll_bound`, a `length` specified by the length of the
+ `Alternate Allele Sequence`, and a `repeat subunit length` as calculated in the prior step.
+
+ using a `Location` specified by the coordinates
  of the `left_roll_bound` and `right_roll_bound`, a `length`
  specified by the length of the `alternate allele`, and a
  `repeat subunit length` as determined in step 2c.
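
The branching added in this hunk can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the function names `can_cycle` and `describe_state`, the dictionary keys, and the choice to raise an error for equal-length inputs are all assumptions made for the example. The inputs are taken to be the already expanded `Reference Allele Sequence` and `Alternate Allele Sequence`, and the `Location` built from `left_roll_bound` and `right_roll_bound` is omitted.

```python
from math import gcd


def can_cycle(ref_seq: str, alt_seq: str, subunit_len: int) -> bool:
    """Check whether the first subunit_len characters of ref_seq,
    repeated cyclically, reproduce alt_seq exactly."""
    if subunit_len <= 0:
        return False
    subunit = ref_seq[:subunit_len]
    # Repeat the subunit enough times to cover alt_seq, then truncate.
    repeats = -(-len(alt_seq) // subunit_len)  # ceiling division
    return (subunit * repeats)[: len(alt_seq)] == alt_seq


def describe_state(ref_seq: str, alt_seq: str) -> dict:
    """Hypothetical helper: pick the state expression for an expanded
    insertion or deletion, following the steps in the diff above."""
    # Step a: empty expanded reference means an unambiguous insertion.
    if not ref_seq:
        return {"type": "LiteralSequenceExpression", "sequence": alt_seq}

    # Repeat subunit length: greatest common divisor of the two lengths.
    subunit_len = gcd(len(ref_seq), len(alt_seq))
    rle = {
        "type": "ReferenceLengthExpression",
        "length": len(alt_seq),
        "repeatSubunitLength": subunit_len,
    }

    if len(alt_seq) < len(ref_seq):
        # Deletion: length and repeat subunit length describe the state.
        return rle
    if len(ref_seq) < len(alt_seq):
        # Insertion: only use the length-based form if the subunit cycles.
        if can_cycle(ref_seq, alt_seq, subunit_len):
            return rle
        return {"type": "LiteralSequenceExpression", "sequence": alt_seq}

    # Assumption: equal lengths are not an insertion or deletion, so this
    # sketch rejects them rather than guessing at the intended behavior.
    raise ValueError("expected an insertion or a deletion")
```

For example, an expanded reference of `CAGCAG` with an expanded alternate of `CAGCAGCAG` has a repeat subunit length of 3 and the subunit `CAG` cycles cleanly, so the sketch returns the length-based form. A reference of `CACA` with an alternate of `CACATA` has a subunit length of 2, but `CA` does not cycle to `CACATA`, so it falls back to the literal sequence.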