6th CRAN release: new tree intermediate subpopulations option, tree fitting, broad row/col name transfers, bug fixes
bnpsd 1.2.3.9000 (2021-02-16)
- Documentation updates:
  - Fixed links to functions; in many cases these were broken because of incompatible mixed Rd and markdown syntax (now markdown is used more fully).
bnpsd 1.3.0.9000 (2021-03-24)
- Added support for intermediate subpopulations related by a tree
  - New function `draw_p_subpops_tree` is the tree version of `draw_p_subpops`.
  - New function `coanc_tree` calculates the true coancestry matrix corresponding to the subpopulations related by a tree.
  - Function `draw_all_admix` has new argument `tree_subpops` that can be used in place of `inbr_subpops` (to simulate subpopulation allele frequencies using `draw_p_subpops_tree` instead of `draw_p_subpops`).
  - Note: These other functions work for trees (without change) because they accept arbitrary coancestry matrices (param `coanc_subpops`) as input, so they work if they are passed the matrix that `coanc_tree` returns: `coanc_admix`, `fst_admix`, `admix_prop_1d_linear`, `admix_prop_1d_circular`.
- Functions `admix_prop_1d_linear` and `admix_prop_1d_circular`, when `sigma` is missing (and therefore fit to a desired `coanc_subpops`, `fst`, and `bias_coeff`), now additionally return multiplicative `factor` used to rescale `coanc_subpops`.
bnpsd 1.3.1.9000 (2021-04-17)
It's Fangorn Forest around here with all the tree updates!
- Added these functions:
  - `fit_tree` for fitting trees to coancestry matrices!
  - `scale_tree` to easily scale coancestry trees and check for out-of-bounds values.
  - `tree_additive` for calculating "additive" edges for probabilistic edge coancestry trees, and also the reverse function.
    - This already existed as an internal, unexported function used mainly by `coanc_tree`, but now it's renamed, exported, and well documented!
- Added support of `$root.edge` to tree `phylo` objects passed to these functions:
  - `coanc_tree`: edge is a shared covariance value affecting all subpopulations.
  - `draw_all_admix` and `draw_p_subpops_tree`: if root edge is present, functions warn that it will be ignored.
- Functions `admix_prop_1d_linear` and `admix_prop_1d_circular`: debugged an edge case where `sigma` is small but not zero and numerically-calculated densities all come out to zero in a given row of the `admix_proportions` matrix (for `admix_prop_1d_circular` infinite values also arise), which used to lead to NAs upon row normalization; now for those rows, the closest ancestry (by coordinate distance) gets assigned the full admixture fraction (just as for independent subpopulations/`sigma = 0`).
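
The row-normalization fallback described in the last item can be illustrated with a minimal base R sketch. This is not the package's code: the helper `normalize_rows_with_fallback`, the toy coordinates, and the use of `dnorm` for the densities are all assumptions made for illustration.

```r
# Hypothetical helper (not part of bnpsd): normalize each row of a matrix of
# unnormalized densities, falling back to the nearest ancestry when the row
# sum is zero or not finite.
normalize_rows_with_fallback <- function(weights, x_ind, mu_subpops) {
    for (i in seq_len(nrow(weights))) {
        total <- sum(weights[i, ])
        if (!is.finite(total) || total == 0) {
            # degenerate row: assign the full admixture fraction to the
            # subpopulation closest to this individual by coordinate distance
            j <- which.min(abs(x_ind[i] - mu_subpops))
            weights[i, ] <- 0
            weights[i, j] <- 1
        } else {
            weights[i, ] <- weights[i, ] / total
        }
    }
    weights
}

# toy setup (assumed): three subpopulations at coordinates 1, 2, 3,
# three individuals, and a very small sigma so densities underflow to zero
mu_subpops <- 1:3
x_ind <- c(1.0, 1.9, 2.6)
sigma <- 0.001
weights <- outer(x_ind, mu_subpops, function(x, mu) dnorm(x, mean = mu, sd = sigma))
admix_proportions <- normalize_rows_with_fallback(weights, x_ind, mu_subpops)
```

Without the fallback, the second and third rows would be `0 / 0` and come out as `NaN`; with it, every row sums to one and has a single non-zero entry at the nearest subpopulation.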
bnpsd 1.3.2.9000 (2021-04-22)
- Updated various functions to transfer names between inputs and outputs as it makes sense
  - Functions `admix_prop_1d_linear`, `admix_prop_1d_circular` now copy names from the input `coanc_subpops` (vector and matrix versions, only required when fitting `bias_coeff`) to the columns of the output `admix_proportions` matrix.
  - Function `draw_genotypes_admix` now copies row and column names from input matrix `p_ind` (or rownames from `p_ind` and column names from the rownames of `admix_proportions` when the latter is provided) to output genotype matrix.
  - Function `draw_p_subpops` now copies names from `p_anc` to rows, names from `inbr_subpops` to columns, when present and of the right dimensions.
  - Function `draw_p_subpops_tree` now copies names from `p_anc` to rows. Names from `tree_subpops` were already copied to columns before.
  - All other functions already transferred names as desired/appropriate. Tests were added for these functions to ensure that this is so.
- Updated various functions to stop if there are paired names for two objects that are both non-NULL and disagree, as this suggests that the data is misaligned or incompatible.
  - Functions `coanc_admix` and `fst_admix` stop if the column names of `admix_proportions` and the names of `coanc_subpops` disagree.
  - Function `draw_all_admix` stops if the column names of `admix_proportions` and the names of either `inbr_subpops` or `tree_subpops` disagree.
  - Function `draw_genotypes_admix`, when `admix_proportions` is passed, stops if the column names of `admix_proportions` and `p_ind` disagree.
  - Function `make_p_ind_admix` stops if the column names of `admix_proportions` and `p_subpops` disagree.
- Function `tree_additive` now has option `force`, which when `TRUE` simply proceeds without stopping if additive edges were already present (in `tree$edge.length.add`, which is ignored and overwritten).
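
As a rough illustration of what "additive" edges mean for a tree with probabilistic edges, here is a base R sketch under stated assumptions. It does not call `tree_additive`. It assumes that a child's total coancestry from the root equals the parent's total plus the probabilistic edge scaled by one minus the parent's total, and that edges are listed from the root outward. The toy edge matrix and the variable names are made up for the example.

```r
# Toy tree in a phylo-style edge matrix (parent, child), built by hand.
# Tips are nodes 1 to 3, the root is node 4, and node 5 is internal.
edge <- rbind(
    c(4, 1),
    c(4, 5),
    c(5, 2),
    c(5, 3)
)
# probabilistic edge values, each in [0, 1] (made-up numbers)
edge_length <- c(0.1, 0.2, 0.1, 0.3)

n_nodes <- max(edge)
# total coancestry of each node from the root; the root stays at 0
# because this toy tree has no root edge
coanc_from_root <- numeric(n_nodes)
edge_length_add <- numeric(nrow(edge))

# assumes every parent is reached before its children (root outward order)
for (e in seq_len(nrow(edge))) {
    parent <- edge[e, 1]
    child <- edge[e, 2]
    # additive edge: probabilistic edge shrunk by what the parent already has
    edge_length_add[e] <- edge_length[e] * (1 - coanc_from_root[parent])
    coanc_from_root[child] <- coanc_from_root[parent] + edge_length_add[e]
}
```

In this toy case the two edges below node 5 shrink from 0.1 and 0.3 to 0.08 and 0.24, because node 5 already carries a coancestry of 0.2 from the root.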
bnpsd 1.3.3.9000 (2021-04-29)
New functions and bug fixes dealing with reordering tree edges and tips.
- Added function `tree_reindex_tips` for ensuring that tip order agrees in both the internal labels vector and the edge matrix.
  Such lack of agreement is generally possible (technically the tree is the same for arbitrary orders of edges in the edge matrix).
  However, such a disagreement causes visual disagreement in plots (for example, trees are plotted in the order of the edge matrix, whereas coancestry matrices are ordered as in the tip labels vector), which can now be fixed in general.
- Added function `tree_reorder` for reordering tree edges and tips to agree as much as possible with a desired tip order.
  The heuristic finds the exact solution if it exists, otherwise returns a reasonable order close to the desired order.
  Tip order in labels and edge matrix agree (via `tree_reindex_tips`).
- Function `fit_tree` now outputs trees with tip order that better agrees with the input data, and tip order in labels vector and edge matrix now agree (via `tree_reorder`).
- Several functions now work with trees whose edges are arbitrarily ordered, particularly when they do not move out from the root (i.e. reverse postorder):
  - Function `tree_additive`.
    Before this bug fix, some trees could trigger the error message "Error: Node index 6 was not assigned coancestry from root! (unexpected)", where "6" could be other numbers.
  - Function `draw_p_subpops_tree`.
    Before this bug fix, some trees could trigger the error message "Error: The root node index in `tree_subpops$edge` (9) does not match `k_subpops + 1` (6) where `k_subpops` is the number of tips! Is the `tree_subpops` object malformed?", where "9" and "6" could be other numbers. Other possible error messages contain "Parent node index 6 has not yet been processed ..." or "Child" instead of "Parent", where "6" could be other numbers.
  - Internal functions used by `fit_tree` had related fixes, but overall `fit_tree` appears to have had no bugs because users cannot provide trees, and the tree-building algorithm does not produce scrambled edges that would have caused problems.
bnpsd 1.3.4.9000 (2021-05-12)
- Functions `fixed_loci` and `draw_all_admix` have a new parameter `maf_min` that, when greater than zero, allows for treating rare variants as fixed.
  In `draw_all_admix`, this now allows for simulating loci with frequency-based ascertainment bias.
bnpsd 1.3.5.9000 (2021-05-14)
- Fixed a rare bug in `draw_all_admix` that could cause a "stack overflow" error.
  The function used to call itself recursively if `require_polymorphic_loci = TRUE`, and in cases where there are very rare allele frequencies or high `maf_min` the number of recursions could be so large that it triggered this error.
  Now the function has a `while` loop, and does not recurse more than one level at a time; there is no limit to the number of iterations and no errors occur inherently due to large numbers of iterations.
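
The loop-based redraw pattern can be sketched in base R as follows. This is only an illustration, not the code in `draw_all_admix`: the helpers `draw_counts` and `is_fixed`, the sample size of 20 alleles, and the assumption that only the fixed loci are redrawn on each pass are all hypothetical.

```r
# Hypothetical stand-ins for one simulation pass and for the fixed-locus check
n_alleles <- 20
draw_counts <- function(m) rbinom(m, size = n_alleles, prob = 0.02)
is_fixed <- function(counts) counts == 0 | counts == n_alleles

set.seed(1)
m_loci <- 100
counts <- draw_counts(m_loci)

# Redraw only the fixed loci until none remain. The call stack stays flat no
# matter how many passes are needed, unlike a function that calls itself.
n_iter <- 0
fixed <- is_fixed(counts)
while (any(fixed)) {
    counts[fixed] <- draw_counts(sum(fixed))
    fixed <- is_fixed(counts)
    n_iter <- n_iter + 1
}
```

With such rare alleles most loci are fixed on the first pass, so many passes are needed; a recursive version would add one stack frame per pass, whereas the loop only increments `n_iter`.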
bnpsd 1.3.6.9000 (2021-06-02)
- Function `fit_tree` internally simplified to use `stats::hclust`, which also results in a small runtime gain.
  The new code (when `method = "mcquitty"`, which is default) gives the same answers as before (in other words, the original algorithm was a special case of hierarchical clustering).
  - New option `method` is passed to `hclust`.
    Although all `hclust` methods are allowed, for this application the only ones that make sense are "mcquitty" (WPGMA) and "average" (UPGMA).
    In internal evaluations, both algorithms had similar accuracy and runtime, but only "mcquitty" exactly recapitulates the original algorithm.
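
For readers unfamiliar with the two linkage methods, the following base R sketch runs `stats::hclust` with both on a made-up coancestry matrix. It does not reproduce what `fit_tree` does internally; in particular, converting coancestry (a similarity) into a distance by subtracting from the maximum value is an assumption made only for this example.

```r
# Toy coancestry matrix for four subpopulations (made-up values)
coancestry <- matrix(
    c(0.30, 0.20, 0.05, 0.05,
      0.20, 0.35, 0.05, 0.05,
      0.05, 0.05, 0.25, 0.10,
      0.05, 0.05, 0.10, 0.40),
    nrow = 4, byrow = TRUE,
    dimnames = list(LETTERS[1:4], LETTERS[1:4])
)

# Assumption: larger coancestry means closer, so subtract from the maximum
# to get a distance; as.dist() ignores the diagonal
distance <- stats::as.dist(max(coancestry) - coancestry)

# WPGMA ("mcquitty") and UPGMA ("average") on the same distances
hc_wpgma <- stats::hclust(distance, method = "mcquitty")
hc_upgma <- stats::hclust(distance, method = "average")
```

On this small balanced example both methods join A with B, then C with D, then the two pairs. The methods can differ on unbalanced trees, where WPGMA weights the two merged clusters equally and UPGMA weights them by size.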
bnpsd 1.3.7.9000 (2021-06-04)
- Updated citations in `inst/CITATION` (missed last time I updated them in other locations).
bnpsd 1.3.8.9000 (2021-06-21)
- Added function `undiff_af` for creating "undifferentiated" allele frequency distributions based on real data but with a lower variance (more concentrated around 0.5) according to a given FST, useful for simulating data trying to match real data.
- Added `LICENSE.md`.
- Reformatted this `NEWS.md` slightly to improve its automatic parsing.
bnpsd 1.3.9.9000 (2021-06-22)
- Function `undiff_af`:
  - Added several useful informative statistics to return list: `F_max`, `V_in`, `V_out`, `V_mix`, and `alpha`.
  - Debugged `distr = "auto"` cases where mixing variance ended up being smaller than required due to roundoff errors (`alpha` is now larger than given in direct formula by `eps = 10 * .Machine$double.eps`, which is also a new option).
bnpsd 1.3.10.9000 (2021-06-22)
- Function `draw_all_admix` added option `p_anc_distr` for passing custom ancestral allele frequency distributions (as vector or function).
  This differs from the similar preexisting option `p_anc`, which fixed ancestral allele frequencies per locus to those values.
  These two options behave differently when loci have to be re-drawn due to being fixed or having too-low MAFs: passing `p_anc` never changes those values, whereas passing `p_anc_distr` results in drawing new values as necessary.
  The new option is more natural biologically and results in re-drawing fixed loci less often.
bnpsd 1.3.11.9000 (2021-07-01)
- Function `undiff_af` renamed parameter `F` to `kinship_mean`, and updated all documentation to reflect the correction that this parameter is the mean kinship and not FST (the complete derivation will appear in a manuscript).
  - One element in the return list previously called `F_max` is similarly now `kinship_mean_max`.
bnpsd 1.3.12 (2021-08-02)
- 6th CRAN submission.
- Removed "LazyData: true" from DESCRIPTION (to avoid a new "NOTE" on CRAN).
- Fixed spelling in documentation.
bnpsd 1.3.13 (2021-08-09)
- 6th CRAN submission, second attempt.
- Debugged internal code (`bias_coeff_admix_fit`) shared by `admix_prop_1d_linear` and `admix_prop_1d_circular` for edge cases.
  - Error was only observed on M1mac architecture (previous code worked on all other systems!).
  - If a bias coefficient of 1 was desired, expected sigma to be `Inf`, but instead an error was encountered.
  - Previous error message: "f() values at end points not of opposite sign" (in `stats::uniroot`)