6th CRAN release: new tree intermediate subpopulations option, tree fitting, broad row/col name transfers, bug fixes

@alexviiia alexviiia released this 25 Aug 14:56

bnpsd 1.2.3.9000 (2021-02-16)

  • Documentation updates:
    • Fixed links to functions, in many cases these were broken because of incompatible mixed Rd and markdown syntax (now markdown is used more fully).

bnpsd 1.3.0.9000 (2021-03-24)

  • Added support for intermediate subpopulations related by a tree
    • New function draw_p_subpops_tree is the tree version of draw_p_subpops.
    • New function coanc_tree calculates the true coancestry matrix corresponding to the subpopulations related by a tree.
    • Function draw_all_admix has new argument tree_subpops that can be used in place of inbr_subpops (to simulate subpopulation allele frequencies using draw_p_subpops_tree instead of draw_p_subpops).
    • Note: These other functions work for trees (without change) because they accept arbitrary coancestry matrices (param coanc_subpops) as input, so they work if they are passed the matrix that coanc_tree returns: coanc_admix, fst_admix, admix_prop_1d_linear, admix_prop_1d_circular.
  • Functions admix_prop_1d_linear and admix_prop_1d_circular, when sigma is missing (and therefore fit to a desired coanc_subpops, fst, and bias_coeff), now additionally return the multiplicative factor used to rescale coanc_subpops.
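To make the tree-to-coancestry idea concrete, here is a minimal base R sketch on a three-tip toy tree. It does not call bnpsd or ape. It assumes that each edge length is an FST relative to the parent node, so that the "additive" contribution of an edge is its length times one minus the parent's coancestry with the root, and that the coancestry of two tips is the value at their most recent common ancestor. The tree, the variable names, and the helper path_to_root are all hypothetical, and the loop assumes edges are listed moving out from the root.

```r
# Toy tree in the (parent, child) edge-matrix layout used by "phylo" objects:
# tips are nodes 1..3, the root is node 4, node 5 is the parent of tips 2 and 3.
edge <- rbind(c(4, 1), c(4, 5), c(5, 2), c(5, 3))
edge_length <- c(0.1, 0.1, 0.1, 0.1)  # assumed: per-edge FST relative to parent
n_tips <- 3
root <- 4

# Coancestry of every node with the root, accumulated from additive edges.
# Assumes each parent is processed before its children.
f_node <- numeric(5)
parent_of <- integer(5)
for (e in seq_len(nrow(edge))) {
  parent <- edge[e, 1]
  child <- edge[e, 2]
  additive <- edge_length[e] * (1 - f_node[parent])
  f_node[child] <- f_node[parent] + additive
  parent_of[child] <- parent
}

# Hypothetical helper: all nodes on the path from a node up to the root.
path_to_root <- function(node) {
  path <- node
  while (node != root) {
    node <- parent_of[node]
    path <- c(path, node)
  }
  path
}

# Coancestry of two tips = largest f_node among their shared ancestors,
# which is the value at their most recent common ancestor.
tip_names <- paste0("S", seq_len(n_tips))
coanc <- matrix(0, n_tips, n_tips, dimnames = list(tip_names, tip_names))
for (i in seq_len(n_tips)) {
  for (j in seq_len(n_tips)) {
    shared <- intersect(path_to_root(i), path_to_root(j))
    coanc[i, j] <- max(f_node[shared])
  }
}
print(coanc)
```

In this toy case S1 has self-coancestry 0.1, S2 and S3 have 0.1 + 0.1 * 0.9 = 0.19, the S2/S3 pair shares 0.1, and S1 shares nothing with the others.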

bnpsd 1.3.1.9000 (2021-04-17)

It's Fangorn Forest around here with all the tree updates!

  • Added these functions:
    • fit_tree for fitting trees to coancestry matrices!
    • scale_tree to easily scale coancestry trees and check for out-of-bounds values.
    • tree_additive for calculating "additive" edges for probabilistic edge coancestry trees, and also the reverse function.
      • This already existed as an internal, unexported function used mainly by coanc_tree, but now it's renamed, exported, and well documented!
  • Added support of $root.edge to tree phylo objects passed to these functions:
    • coanc_tree: edge is a shared covariance value affecting all subpopulations.
    • draw_all_admix and draw_p_subpops_tree: if root edge is present, functions warn that it will be ignored.
  • Functions admix_prop_1d_linear and admix_prop_1d_circular: debugged an edge case where sigma is small but not zero and numerically-calculated densities all come out to zero in a given row of the admix_proportions matrix (for admix_prop_1d_circular infinite values also arise), which used to lead to NAs upon row normalization; now for those rows, the closest ancestry (by coordinate distance) gets assigned the full admixture fraction (just as for independent subpopulations/sigma = 0).
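The edge case in the last item can be reproduced in a few lines of base R. This is only an illustrative sketch, not the package's code: it assumes a 1D linear setup in which admixture weights are normal densities centered on each subpopulation's coordinate, and the positions x, centers mu, and value of sigma are made up for the example.

```r
x <- c(1, 1.4, 2)   # individual positions (hypothetical)
mu <- c(1, 2)       # subpopulation positions (hypothetical)
sigma <- 0.01       # small but non-zero spread

# Density of each individual (rows) under each subpopulation (columns).
dens <- outer(x, mu, function(xi, mk) dnorm(xi, mean = mk, sd = sigma))

# Naive row normalization: the middle individual is so far from both
# centers (in units of sigma) that both densities underflow to zero,
# and 0 / 0 yields NaN.
naive <- dens / rowSums(dens)

# Fallback: rows with all-zero densities put the full admixture
# fraction on the closest subpopulation by coordinate distance.
fixed <- dens
for (i in which(rowSums(dens) == 0)) {
  fixed[i, ] <- 0
  fixed[i, which.min(abs(x[i] - mu))] <- 1
}
fixed <- fixed / rowSums(fixed)
print(fixed)
```

The middle individual sits at 1.4, closer to the first subpopulation, so its row becomes (1, 0) instead of (NaN, NaN).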

bnpsd 1.3.2.9000 (2021-04-22)

  • Updated various functions to transfer names between inputs and outputs as it makes sense
    • Functions admix_prop_1d_linear, admix_prop_1d_circular now copy names from the input coanc_subpops (vector and matrix versions, only required when fitting bias_coeff) to the columns of the output admix_proportions matrix.
    • Function draw_genotypes_admix now copies row and column names from input matrix p_ind (or row names from p_ind and column names from the row names of admix_proportions when the latter is provided) to the output genotype matrix.
    • Function draw_p_subpops now copies names from p_anc to rows, names from inbr_subpops to columns, when present and of the right dimensions.
    • Function draw_p_subpops_tree now copies names from p_anc to rows. Names from tree_subpops were already copied to columns before.
    • All other functions already transferred names as desired/appropriate. Tests were added for these functions to ensure that this is so.
  • Updated various functions to stop if there are paired names for two objects that are both non-NULL and disagree, as this suggests that the data is misaligned or incompatible.
    • Functions coanc_admix and fst_admix stop if the column names of admix_proportions and the names of coanc_subpops disagree.
    • Function draw_all_admix stops if the column names of admix_proportions and the names of either inbr_subpops or tree_subpops disagree.
    • Function draw_genotypes_admix, when admix_proportions is passed, stops if the column names of admix_proportions and p_ind disagree.
    • Function make_p_ind_admix stops if the column names of admix_proportions and p_subpops disagree.
  • Function tree_additive now has option force, which when TRUE simply proceeds without stopping if additive edges were already present (in tree$edge.length.add, which is ignored and overwritten).
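The paired-names rule described above (stop only when both sets of names are non-NULL and disagree) can be sketched as follows. The helper check_names_agree is hypothetical and is not the package's internal code. In particular, treating a different order as a disagreement is an assumption of this sketch.

```r
# Hypothetical helper: stop only if both name vectors exist and differ.
check_names_agree <- function(names1, names2,
                              what1 = "first input", what2 = "second input") {
  if (!is.null(names1) && !is.null(names2) && !identical(names1, names2)) {
    stop("Names of ", what1, " and ", what2, " disagree!")
  }
  invisible(TRUE)
}

admix_proportions <- matrix(
  c(1, 0.5, 0, 0, 0.5, 1),
  nrow = 3,
  dimnames = list(NULL, c("S1", "S2"))
)
coanc_ok <- c(S1 = 0.1, S2 = 0.2)
coanc_swapped <- c(S2 = 0.2, S1 = 0.1)
coanc_unnamed <- c(0.1, 0.2)

# Matching names pass, and so does a pair where one side has no names.
check_names_agree(colnames(admix_proportions), names(coanc_ok))
check_names_agree(colnames(admix_proportions), names(coanc_unnamed))

# Swapped names suggest misaligned data, so the check stops.
result <- tryCatch(
  check_names_agree(colnames(admix_proportions), names(coanc_swapped),
                    "admix_proportions", "coanc_subpops"),
  error = function(e) conditionMessage(e)
)
print(result)
```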

bnpsd 1.3.3.9000 (2021-04-29)

New functions and bug fixes dealing with reordering tree edges and tips.

  • Added function tree_reindex_tips for ensuring that tip order agrees in both the internal labels vector and the edge matrix.
    Such lack of agreement is generally possible (technically the tree is the same for arbitrary orders of edges in the edge matrix).
    However, such a disagreement causes visual disagreement in plots (for example, trees are plotted in the order of the edge matrix, whereas coancestry matrices are ordered as in the tip labels vector), which can now be fixed in general.
  • Added function tree_reorder for reordering tree edges and tips to agree as much as possible with a desired tip order.
    The heuristic finds the exact solution if it exists, otherwise returns a reasonable order close to the desired order.
    Tip order in labels and edge matrix agree (via tree_reindex_tips).
  • Function fit_tree now outputs trees with tip order that better agrees with the input data, and tip order in labels vector and edge matrix now agree (via tree_reorder).
  • Several functions now work with trees whose edges are arbitrarily ordered, particularly when they do not move out from the root (i.e. reverse postorder):
    • Function tree_additive.
      Before this bug fix, some trees could trigger the error message "Error: Node index 6 was not assigned coancestry from root! (unexpected)", where "6" could be other numbers.
    • Function draw_p_subpops_tree.
      Before this bug fix, some trees could trigger the error message "Error: The root node index in tree_subpops$edge (9) does not match k_subpops + 1 (6) where k_subpops is the number of tips! Is the tree_subpops object malformed?", where "9" and "6" could be other numbers. Other possible error messages contain "Parent node index 6 has not yet been processed ..." or "Child" instead of "Parent", where "6" could be other numbers.
    • Internal functions used by fit_tree had related fixes, but overall fit_tree appears to have had no bugs because users cannot provide trees, and the tree-building algorithm does not produce scrambled edges that would have caused problems.

bnpsd 1.3.4.9000 (2021-05-12)

  • Functions fixed_loci and draw_all_admix have a new parameter maf_min that, when greater than zero, allows for treating rare variants as fixed.
    In draw_all_admix, this now allows for simulating loci with frequency-based ascertainment bias.
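As a rough illustration of what a minimum MAF threshold does, the base R sketch below marks loci (rows) of a small genotype matrix as fixed. The helper is_fixed_sketch is hypothetical, and two details are assumptions rather than statements about fixed_loci: that the comparison is "less than or equal to" maf_min, and that completely missing loci count as fixed.

```r
# Genotypes coded 0/1/2, loci in rows and individuals in columns.
X <- rbind(
  c(0, 0, 0, 0),     # fixed for the reference allele
  c(0, 0, 0, 1),     # rare variant, MAF = 1/8
  c(1, 1, 2, 0),     # common variant, MAF = 1/2
  c(2, 2, 2, 2),     # fixed for the alternative allele
  c(NA, NA, NA, NA)  # completely missing
)

# Hypothetical helper, not the package's implementation.
is_fixed_sketch <- function(X, maf_min = 0) {
  p_hat <- rowMeans(X, na.rm = TRUE) / 2  # NaN for all-missing loci
  maf <- pmin(p_hat, 1 - p_hat)
  is.na(maf) | maf <= maf_min
}

print(is_fixed_sketch(X))                 # only strictly fixed or missing loci
print(is_fixed_sketch(X, maf_min = 0.2))  # the rare variant is now also "fixed"
```

Raising maf_min from 0 to 0.2 flips only the second locus, whose MAF of 0.125 falls under the threshold.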

bnpsd 1.3.5.9000 (2021-05-14)

  • Fixed a rare bug in draw_all_admix that could cause a "stack overflow" error.
    The function used to call itself recursively if require_polymorphic_loci = TRUE, and in cases where there are very rare allele frequencies or high maf_min the number of recursions could be so large that it triggered this error.
    Now the function has a while loop, and does not recurse more than one level at a time; there is no limit to the number of iterations and no errors occur inherently due to large numbers of iterations.
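The change from recursion to iteration can be pictured with a small base R sketch. It is not the code of draw_all_admix: the helpers draw_loci and is_fixed, the uniform distribution of rare allele frequencies, and the sample sizes are all made up to force many redraws.

```r
set.seed(1)
n_ind <- 10
m_loci <- 100
maf_min <- 0

# Hypothetical simulator: rare allele frequencies, so many loci come out fixed.
draw_loci <- function(m) {
  p <- runif(m, min = 0, max = 0.05)
  # p is recycled down the rows, so every individual at locus i uses p[i].
  matrix(rbinom(m * n_ind, size = 2, prob = p), nrow = m, ncol = n_ind)
}

# Hypothetical fixed-locus test based on the sample minor allele frequency.
is_fixed <- function(X) {
  p_hat <- rowMeans(X) / 2
  pmin(p_hat, 1 - p_hat) <= maf_min
}

X <- draw_loci(m_loci)
fixed <- is_fixed(X)
n_rounds <- 0
# Iterative redraw: only the loci that are still fixed are drawn again,
# and the call stack never grows no matter how many rounds are needed.
while (any(fixed)) {
  X[fixed, ] <- draw_loci(sum(fixed))
  fixed[fixed] <- is_fixed(X[fixed, , drop = FALSE])
  n_rounds <- n_rounds + 1
}
```

A recursive version would add one stack frame per round, which is what can overflow when the loop needs many rounds.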

bnpsd 1.3.6.9000 (2021-06-02)

  • Function fit_tree internally simplified to use stats::hclust, which also results in a small runtime gain.
    The new code (when method = "mcquitty", which is the default) gives the same answers as before (in other words, the original algorithm was a special case of hierarchical clustering).
    • New option method is passed to hclust.
      Although all hclust methods are allowed, for this application the only ones that make sense are "mcquitty" (WPGMA) and "average" (UPGMA).
      In internal evaluations, both algorithms had similar accuracy and runtime, but only "mcquitty" exactly recapitulates the original algorithm.
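For readers unfamiliar with these two hclust methods, the base R sketch below clusters a toy coancestry matrix with both. Converting coancestry to a distance as max(coanc) - coanc is an assumption made for this illustration and may differ from what fit_tree does internally. The matrix values are made up.

```r
tip_names <- c("S1", "S2", "S3")
coanc <- matrix(
  c(0.10, 0.00, 0.00,
    0.00, 0.19, 0.10,
    0.00, 0.10, 0.19),
  nrow = 3, byrow = TRUE,
  dimnames = list(tip_names, tip_names)
)

# Assumed conversion: higher coancestry means smaller distance.
# as.dist only uses the lower triangle, so the diagonal is ignored.
dist_mat <- as.dist(max(coanc) - coanc)

hc_wpgma <- stats::hclust(dist_mat, method = "mcquitty")  # WPGMA
hc_upgma <- stats::hclust(dist_mat, method = "average")   # UPGMA

print(hc_wpgma$merge)   # S2 and S3 join first, then S1 joins that pair
print(hc_wpgma$height)
print(hc_upgma$height)
```

With only three tips the two methods agree. They can differ only when a merge involves clusters of unequal size, because WPGMA averages the two cluster distances equally while UPGMA weights them by cluster size.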

bnpsd 1.3.7.9000 (2021-06-04)

  • Updated citations in inst/CITATION (missed last time I updated them in other locations).

bnpsd 1.3.8.9000 (2021-06-21)

  • Added function undiff_af for creating "undifferentiated" allele frequency distributions based on real data but with a lower variance (more concentrated around 0.5) according to a given FST, useful for simulating data trying to match real data.
  • Added LICENSE.md.
  • Reformatted this NEWS.md slightly to improve its automatic parsing.

bnpsd 1.3.9.9000 (2021-06-22)

  • Function undiff_af:
    • Added several useful informative statistics to return list: F_max, V_in, V_out, V_mix, and alpha.
    • Debugged distr = "auto" cases where mixing variance ended up being smaller than required due to roundoff errors (alpha is now larger than given in direct formula by eps = 10 * .Machine$double.eps, which is also a new option).

bnpsd 1.3.10.9000 (2021-06-22)

  • Function draw_all_admix added option p_anc_distr for passing custom ancestral allele frequency distributions (as vector or function).
    This differs from the similar preexisting option p_anc, which fixed ancestral allele frequencies per locus to those values.
    These two options behave differently when loci have to be re-drawn due to being fixed or having too-low MAFs: passing p_anc never changes those values, whereas passing p_anc_distr results in drawing new values as necessary.
    The new option is more natural biologically and results in re-drawing fixed loci less often.

bnpsd 1.3.11.9000 (2021-07-01)

  • Function undiff_af renamed parameter F to kinship_mean, and updated all documentation to reflect the correction that this parameter is the mean kinship and not FST (the complete derivation will appear in a manuscript).
    • One element in the return list previously called F_max is similarly now kinship_mean_max.

bnpsd 1.3.12 (2021-08-02)

  • 6th CRAN submission.
  • Removed "LazyData: true" from DESCRIPTION (to avoid a new "NOTE" on CRAN).
  • Fixed spelling in documentation.

bnpsd 1.3.13 (2021-08-09)

  • 6th CRAN submission, second attempt.
  • Debugged internal code (bias_coeff_admix_fit) shared by admix_prop_1d_linear and admix_prop_1d_circular for edge cases.
    • Error was only observed on M1mac architecture (previous code worked on all other systems!).
    • If a bias coefficient of 1 was desired, sigma was expected to be Inf, but an error was encountered instead.
    • Previous error message: "f() values at end points not of opposite sign" (in stats::uniroot)