diff --git a/.vale.ini b/.vale.ini index 054a57d418..0aab5a6f57 100644 --- a/.vale.ini +++ b/.vale.ini @@ -12,13 +12,7 @@ jl = md [docs/src/*.md] BasedOnStyles = Vale, Google -[docs/src/contributing.md] -BasedOnStyles = Vale, Google -Google.Will = false ; given format and really with intend a _will_ -Google.Headings = false ; some might jeally ahabe [] in their headers -Google.FirstPerson = false ; we pose a few contribution points as first-person questions - -[Changelog.md, CONTRIBUTING.md] +[{docs/src/contributing.md, Changelog.md, CONTRIBUTING.md}] BasedOnStyles = Vale, Google Google.Will = false ; given format and really with intend a _will_ Google.Headings = false ; some might jeally ahabe [] in their headers @@ -49,8 +43,9 @@ BasedOnStyles = Vale, Google TokenIgnores = (\$+[^\n$]+\$+) Google.We = false # For tutorials we want to address the user directly. -[docs/src/tutorials/*.md] ; actually .qmd for the first, second autogenerated +[docs/src/tutorials/*.md] ; Can I somehow just deactivate these? BasedOnStyles = Vale, Google ; ignore (1) math (2) ref and cite keys (3) code in docs (4) math in docs (5,6) indented blocks TokenIgnores = (\$+[^\n$]+\$+) -Google.We = false # For tutorials we want to address the user directly. \ No newline at end of file +Google.We = false # For tutorials we want to address the user directly. +Google.Spacing = false # one reference uses this diff --git a/Changelog.md b/Changelog.md index d0dce06686..ca64ce9326 100644 --- a/Changelog.md +++ b/Changelog.md @@ -1,10 +1,17 @@ # Changelog -All notable Changes to the Julia package `Manopt.jl` will be documented in this file. The file was started with Version `0.4`. +All notable Changes to the Julia package `Manopt.jl` are documented in this file. +The file was started with Version `0.4`. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.5.7] February 17, 2025 + +### Added + +* A mesh adaptive direct search algorithm (MADS), for now with the LTMADS variant using a lower triangular random matrix in the poll step. + ## [0.5.6] February 10, 2025 ### Changed @@ -29,16 +36,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * The geodesic regression example, first because it is not correct, second because it should become part of ManoptExamples.jl once it is correct. -## [0.5.4] - December 11, 2024 +## [0.5.4] December 11, 2024 ### Added * An automated detection whether the tutorials are present if not an also no quarto run is done, an automated `--exclude-tutorials` option is added. * Support for ManifoldDiff 0.4 -* icons upfront external links when they link to another package or wikipedia. +* icons upfront external links when they link to another package or Wikipedia. -## [0.5.3] – October 18, 2024 +## [0.5.3] October 18, 2024 ### Added @@ -48,9 +55,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * stabilize `max_stepsize` to also work when `injectivity_radius` dos not exist. It however would warn new users, that activate tutorial mode. -* Start a `ManoptTestSuite` subpackage to store dummy types and common test helpers in. +* Start a `ManoptTestSuite` sub package to store dummy types and common test helpers in.
-## [0.5.2] – October 5, 2024 +## [0.5.2] October 5, 2024 ### Added @@ -61,7 +68,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * fix a few typos in the documentation * improved the documentation for the initial guess of [`ArmijoLinesearchStepsize`](https://manoptjl.org/stable/plans/stepsize/#Manopt.ArmijoLinesearch). -## [0.5.1] – September 4, 2024 +## [0.5.1] September 4, 2024 ### Changed @@ -71,17 +78,17 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 * the `proximal_point` method. -## [0.5.0] – August 29, 2024 +## [0.5.0] August 29, 2024 This breaking update is mainly concerned with improving a unified experience through all solvers and some usability improvements, such that for example the different gradient update rules are easier to specify. -In general we introduce a few factories, that avoid having to pass the manifold to keyword arguments +In general this introduces a few factories, that avoid having to pass the manifold to keyword arguments ### Added * A `ManifoldDefaultsFactory` that postpones the creation/allocation of manifold-specific fields in for example direction updates, step sizes and stopping criteria. As a rule of thumb, internal structures, like a solver state should store the final type. Any high-level interface, like the functions to start solvers, should accept such a factory in the appropriate places and call the internal `_produce_type(factory, M)`, for example before passing something to the state. -* a `documentation_glossary.jl` file containing a glossary of often used variables in fields, arguments, and keywords, to print them in a unified manner. The same for usual sections, tex, and math notation that is often used within the doc-strings. +* a `documentation_glossary.jl` file containing a glossary of often used variables in fields, arguments, and keywords, to print them in a unified manner. The same for usual sections, text, and math notation that is often used within the doc-strings. ### Changed @@ -106,12 +113,12 @@ In general we introduce a few factories, that avoid having to pass the manifold * `HestenesStiefelCoefficient` is now called `HestenesStiefelCoefficientRule`. For the `HestenesStiefelCoefficient` the manifold as its first parameter is no longer necessary and the vector transport has been unified/moved to the `vector_transport_method=` keyword. * `LiuStoreyCoefficient` is now called `LiuStoreyCoefficientRule`. For the `LiuStoreyCoefficient` the manifold as its first parameter is no longer necessary and the vector transport has been unified/moved to the `vector_transport_method=` keyword. * `PolakRibiereCoefficient` is now called `PolakRibiereCoefficientRule`. For the `PolakRibiereCoefficient` the manifold as its first parameter is no longer necessary and the vector transport has been unified/moved to the `vector_transport_method=` keyword. - * the `SteepestDirectionUpdateRule` is now called `SteepestDescentCoefficientRule`. The `SteepestDescentCoefficient` is equivalent, but creates the new factory interims wise. + * the `SteepestDirectionUpdateRule` is now called `SteepestDescentCoefficientRule`. The `SteepestDescentCoefficient` is equivalent, but creates the new factory temporarily. * `AbstractGradientGroupProcessor` is now called `AbstractGradientGroupDirectionRule` - * the `StochasticGradient` is now called `StochasticGradientRule`. The `StochasticGradient` is equivalent, but creates the new factory interims wise, so that the manifold is not longer necessary. 
+ * the `StochasticGradient` is now called `StochasticGradientRule`. The `StochasticGradient` is equivalent, but creates the new factory temporarily, so that the manifold is not longer necessary. * the `AlternatingGradient` is now called `AlternatingGradientRule`. - The `AlternatingGradient` is equivalent, but creates the new factory interims wise, so that the manifold is not longer necessary. -* `quasi_Newton` had a keyword `scale_initial_operator=` that was inconsistently declared (sometimes bool, sometimes real) and was unused. + The `AlternatingGradient` is equivalent, but creates the new factory temporarily, so that the manifold is not longer necessary. +* `quasi_Newton` had a keyword `scale_initial_operator=` that was inconsistently declared (sometimes boolean, sometimes real) and was unused. It is now called `initial_scale=1.0` and scales the initial (diagonal, unit) matrix within the approximation of the Hessian additionally to the $\frac{1}{\lVert g_k\rVert}$ scaling with the norm of the oldest gradient for the limited memory variant. For the full matrix variant the initial identity matrix is now scaled with this parameter. * Unify doc strings and presentation of keyword arguments * general indexing, for example in a vector, uses `i` @@ -144,14 +151,14 @@ In general we introduce a few factories, that avoid having to pass the manifold * `AdaptiveRegularizationState(M, sub_problem [, sub_state]; kwargs...)` replaces the (anyways unused) variant to only provide the objective; both `X` and `p` moved to keyword arguments. * `AugmentedLagrangianMethodState(M, objective, sub_problem; evaluation=...)` was added - * ``AugmentedLagrangianMethodState(M, objective, sub_problem, sub_state; evaluation=...)` now has `p=rand(M)` as keyword argument instead of being the second positional one + * `AugmentedLagrangianMethodState(M, objective, sub_problem, sub_state; evaluation=...)` now has `p=rand(M)` as keyword argument instead of being the second positional one * `ExactPenaltyMethodState(M, sub_problem; evaluation=...)` was added and `ExactPenaltyMethodState(M, sub_problem, sub_state; evaluation=...)` now has `p=rand(M)` as keyword argument instead of being the second positional one * `DifferenceOfConvexState(M, sub_problem; evaluation=...)` was added and `DifferenceOfConvexState(M, sub_problem, sub_state; evaluation=...)` now has `p=rand(M)` as keyword argument instead of being the second positional one * `DifferenceOfConvexProximalState(M, sub_problem; evaluation=...)` was added and `DifferenceOfConvexProximalState(M, sub_problem, sub_state; evaluation=...)` now has `p=rand(M)` as keyword argument instead of being the second positional one * bumped `Manifolds.jl`to version 0.10; this mainly means that any algorithm working on a product manifold and requiring `ArrayPartition` now has to explicitly do `using RecursiveArrayTools`. ### Fixed -* the `AverageGradientRule` filled its internal vector of gradients wrongly – or mixed it up in parallel transport. This is now fixed. +* the `AverageGradientRule` filled its internal vector of gradients wrongly or mixed it up in parallel transport. This is now fixed. 
### Removed @@ -171,13 +178,13 @@ In general we introduce a few factories, that avoid having to pass the manifold * to update a stopping criterion in a solver state, replace the old `update_stopping_criterion!(state, :Val, v)` tat passed down to the stopping criterion by the explicit pass down with `set_parameter!(state, :StoppingCriterion, :Val, v)` -## [0.4.69] – August 3, 2024 +## [0.4.69] August 3, 2024 ### Changed * Improved performance of Interior Point Newton Method. -## [0.4.68] – August 2, 2024 +## [0.4.68] August 2, 2024 ### Added @@ -185,9 +192,9 @@ In general we introduce a few factories, that avoid having to pass the manifold * a `conjugate_residual` Algorithm to solve a linear system on a tangent space. * `ArmijoLinesearch` now allows for additional `additional_decrease_condition` and `additional_increase_condition` keywords to add further conditions to accept additional conditions when to accept an decreasing or increase of the stepsize. * add a `DebugFeasibility` to have a debug print about feasibility of points in constrained optimisation employing the new `is_feasible` function -* add a `InteriorPointCentralityCondition` check that can be added for step candidates within the line search of `interior_point_newton` +* add a `InteriorPointCentralityCondition` that can be added for step candidates within the line search of `interior_point_newton` * Add Several new functors - * the `LagrangianCost`, `LagrangianGradient`, `LagrangianHessian`, that based on a constrained objective allow to construct the hessian objective of its Lagrangian + * the `LagrangianCost`, `LagrangianGradient`, `LagrangianHessian`, that based on a constrained objective allow to construct the Hessian objective of its Lagrangian * the `CondensedKKTVectorField` and its `CondensedKKTVectorFieldJacobian`, that are being used to solve a linear system within `interior_point_newton` * the `KKTVectorField` as well as its `KKTVectorFieldJacobian` and ``KKTVectorFieldAdjointJacobian` * the `KKTVectorFieldNormSq` and its `KKTVectorFieldNormSqGradient` used within the Armijo line search of `interior_point_newton` @@ -195,7 +202,7 @@ In general we introduce a few factories, that avoid having to pass the manifold * A `StopWhenRelativeResidualLess` for the `conjugate_residual` * A `StopWhenKKTResidualLess` for the `interior_point_newton` -## [0.4.67] – July 25, 2024 +## [0.4.67] July 25, 2024 ### Added @@ -241,7 +248,7 @@ In general we introduce a few factories, that avoid having to pass the manifold * Remodel `ConstrainedManifoldObjective` to store an `AbstractManifoldObjective` internally instead of directly `f` and `grad_f`, allowing also Hessian objectives therein and implementing access to this Hessian -* Fixed a bug that Lanczos produced NaNs when started exactly in a minimizer, since we divide by the gradient norm. +* Fixed a bug that Lanczos produced NaNs when started exactly in a minimizer, since the algorithm initially divides by the gradient norm. 
### Deprecated diff --git a/Project.toml b/Project.toml index 979110531f..cdffee4927 100644 --- a/Project.toml +++ b/Project.toml @@ -1,7 +1,7 @@ name = "Manopt" uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5" authors = ["Ronny Bergmann "] -version = "0.5.6" +version = "0.5.7" [deps] ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4" diff --git a/docs/make.jl b/docs/make.jl index d174383e1d..f34c70e5e2 100755 --- a/docs/make.jl +++ b/docs/make.jl @@ -201,6 +201,7 @@ makedocs(; "Gradient Descent" => "solvers/gradient_descent.md", "Interior Point Newton" => "solvers/interior_point_Newton.md", "Levenberg–Marquardt" => "solvers/LevenbergMarquardt.md", + "MADS" => "solvers/mesh_adaptive_direct_search.md", "Nelder–Mead" => "solvers/NelderMead.md", "Particle Swarm Optimization" => "solvers/particle_swarm.md", "Primal-dual Riemannian semismooth Newton" => "solvers/primal_dual_semismooth_Newton.md", diff --git a/docs/src/about.md b/docs/src/about.md index b0b1085c88..57f8968ee0 100644 --- a/docs/src/about.md +++ b/docs/src/about.md @@ -13,6 +13,7 @@ Thanks to the following contributors to `Manopt.jl`: * [Hajg Jasa](https://www.ntnu.edu/employees/hajg.jasa) implemented the [convex bundle method](solvers/convex_bundle_method.md) and the [proximal bundle method](solvers/proximal_bundle_method.md) and a default subsolver each of them. * Even Stephansen Kjemsås contributed to the implementation of the [Frank Wolfe Method](solvers/FrankWolfe.md) solver. * Mathias Ravn Munkvold contributed most of the implementation of the [Adaptive Regularization with Cubics](solvers/adaptive-regularization-with-cubics.md) solver as well as its [Lanczos](@ref arc-Lanczos) subsolver +* [Sander Engen Oddsen](https://github.com/oddsen) contributed to the implementation of the [LTMADS](solvers/mesh_adaptive_direct_search.md) solver. * [Tom-Christian Riemer](https://www.tu-chemnitz.de/mathematik/wire/mitarbeiter.php) implemented the [trust regions](solvers/trust_regions.md) and [quasi Newton](solvers/quasi_Newton.md) solvers as well as the [truncated conjugate gradient descent](solvers/truncated_conjugate_gradient_descent.md) subsolver. * [Markus A. Stokkenes](https://www.linkedin.com/in/markus-a-stokkenes-b41bba17b/) contributed most of the implementation of the [Interior Point Newton Method](solvers/interior_point_Newton.md) as well as its default [Conjugate Residual](solvers/conjugate_residual.md) subsolver * [Manuel Weiss](https://scoop.iwr.uni-heidelberg.de/author/manuel-weiß/) implemented most of the [conjugate gradient update rules](@ref cg-coeffs) diff --git a/docs/src/references.bib b/docs/src/references.bib index c7294640ea..0f1825ab52 100644 --- a/docs/src/references.bib +++ b/docs/src/references.bib @@ -325,6 +325,13 @@ @article{DiepeveenLellmann:2021 VOLUME = {14}, YEAR = {2021}, } +@techreport{Dreisigmeyer:2007, + AUTHOR = {Dreisigmeyer, David W.}, + INSTITUTION = {Optimization Online}, + TITLE = {Direct Search Algorithms over Riemannian Manifolds}, + URL = {https://optimization-online.org/?p=9134}, + YEAR = {2007} +} @article{DuranMoelleSbertCremers:2016, AUTHOR = {Duran, J. and Moeller, M. and Sbert, C. and Cremers, D.}, TITLE = {Collaborative Total Variation: A General Framework for Vectorial TV Models}, diff --git a/docs/src/solvers/index.md b/docs/src/solvers/index.md index c94649bfad..6509e739e9 100644 --- a/docs/src/solvers/index.md +++ b/docs/src/solvers/index.md @@ -22,6 +22,7 @@ For derivative free only function evaluations of ``f`` are used.
* [Nelder-Mead](NelderMead.md) a simplex based variant, that is using ``d+1`` points, where ``d`` is the dimension of the manifold. * [Particle Swarm](particle_swarm.md) 🫏 use the evolution of a set of points, called swarm, to explore the domain of the cost and find a minimizer. +* [Mesh adaptive direct search](mesh_adaptive_direct_search.md) performs a mesh based exploration (poll) and search. * [CMA-ES](cma_es.md) uses a stochastic evolutionary strategy to perform minimization robust to local minima of the objective. ## First order diff --git a/docs/src/solvers/mesh_adaptive_direct_search.md b/docs/src/solvers/mesh_adaptive_direct_search.md new file mode 100644 index 0000000000..00e870b168 --- /dev/null +++ b/docs/src/solvers/mesh_adaptive_direct_search.md @@ -0,0 +1,69 @@ +# Mesh adaptive direct search (MADS) + + +```@meta +CurrentModule = Manopt +``` + +```@docs + mesh_adaptive_direct_search + mesh_adaptive_direct_search! +``` + +## State + +```@docs + MeshAdaptiveDirectSearchState +``` + +## Poll + +```@docs + AbstractMeshPollFunction + LowerTriangularAdaptivePoll +``` + +as well as the internal functions + +```@docs +Manopt.get_descent_direction(::LowerTriangularAdaptivePoll) +Manopt.is_successful(::LowerTriangularAdaptivePoll) +Manopt.get_candidate(::LowerTriangularAdaptivePoll) +Manopt.get_basepoint(::LowerTriangularAdaptivePoll) +Manopt.update_basepoint!(M, ltap::LowerTriangularAdaptivePoll{P}, p::P) where {P} +``` + +## Search + +```@docs + AbstractMeshSearchFunction + DefaultMeshAdaptiveDirectSearch +``` + +as well as the internal functions + +```@docs +Manopt.is_successful(::DefaultMeshAdaptiveDirectSearch) +Manopt.get_candidate(::DefaultMeshAdaptiveDirectSearch) +``` + +## Additional stopping criteria + +```@docs + StopWhenPollSizeLess +``` + +## Technical details + +The [`mesh_adaptive_direct_search`](@ref) solver requires the following functions of a manifold to be available + +* A [`retract!`](@extref ManifoldsBase :doc:`retractions`)`(M, q, p, X)`; it is recommended to set the [`default_retraction_method`](@extref `ManifoldsBase.default_retraction_method-Tuple{AbstractManifold}`) to a favourite retraction. If this default is set, a `retraction_method=` does not have to be specified. +* Within the default initialization [`rand`](@extref Base.rand-Tuple{AbstractManifold})`(M)` is used to generate the initial point +* A [`vector_transport_to!`](@extref ManifoldsBase :doc:`vector_transports`)`(M, Y, p, X, q)`; it is recommended to set the [`default_vector_transport_method`](@extref `ManifoldsBase.default_vector_transport_method-Tuple{AbstractManifold}`) to a favourite vector transport. If this default is set, a `vector_transport_method=` does not have to be specified.
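As a quick, hypothetical illustration (not part of the PR itself), these defaults can be inspected on a concrete manifold before calling the solver; here assuming the `Sphere` from Manifolds.jl:

```julia
using Manifolds, ManifoldsBase

M = Sphere(2)
default_retraction_method(M)        # retraction the solver falls back to
default_vector_transport_method(M)  # vector transport used when moving to a new base point
rand(M)                             # used to generate the initial point if none is given
```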
+ +## Literature + +```@bibliography +Pages = ["mesh_adaptive_direct_search.md"] +Canonical=false +``` \ No newline at end of file diff --git a/docs/src/tutorials/InplaceGradient.md b/docs/src/tutorials/InplaceGradient.md index a575ba3e82..4ca31c3f7c 100644 --- a/docs/src/tutorials/InplaceGradient.md +++ b/docs/src/tutorials/InplaceGradient.md @@ -63,7 +63,7 @@ We can also benchmark this as Time (median): 55.145 ms ┊ GC (median): 9.99% Time (mean ± σ): 56.391 ms ± 6.102 ms ┊ GC (mean ± σ): 9.92% ± 1.43% - ▅██▅▃▁ + ▅██▅▃▁ ▅███████▁▅▇▅▁▅▁▁▅▅▁▁▁▅▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ ▁ 53 ms Histogram: log(frequency) by time 81.7 ms < @@ -121,7 +121,7 @@ We can again benchmark this Time (median): 37.559 ms ┊ GC (median): 0.00% Time (mean ± σ): 38.658 ms ± 3.904 ms ┊ GC (mean ± σ): 0.73% ± 2.68% - ██▅▅▄▂▁ ▂ + ██▅▅▄▂▁ ▂ ███████▁██▁▅▁▁▁▅▁▁▁▁▅▅▅▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▅ ▅ 36.6 ms Histogram: log(frequency) by time 61 ms < diff --git a/docs/styles/config/vocabularies/Manopt/accept.txt b/docs/styles/config/vocabularies/Manopt/accept.txt index 73e5f1fef9..2ebb074b6b 100644 --- a/docs/styles/config/vocabularies/Manopt/accept.txt +++ b/docs/styles/config/vocabularies/Manopt/accept.txt @@ -34,6 +34,7 @@ eigen eigendecomposition elementwise Ehresmann +Engen Fenchel Ferreira Frank @@ -82,6 +83,7 @@ Newton nonmonotone nonpositive [Nn]onsmooth +Oddsen [Pp]arametrising Parametrising [Pp]ock diff --git a/src/Manopt.jl b/src/Manopt.jl index e8d5e3a5b2..fc7079d94f 100644 --- a/src/Manopt.jl +++ b/src/Manopt.jl @@ -199,6 +199,7 @@ include("solvers/FrankWolfe.jl") include("solvers/gradient_descent.jl") include("solvers/interior_point_Newton.jl") include("solvers/LevenbergMarquardt.jl") +include("solvers/mesh_adaptive_direct_search.jl") include("solvers/particle_swarm.jl") include("solvers/primal_dual_semismooth_Newton.jl") include("solvers/proximal_bundle_method.jl") @@ -341,6 +342,7 @@ export AbstractGradientSolverState, InteriorPointNewtonState, LanczosState, LevenbergMarquardtState, + MeshAdaptiveDirectSearchState, NelderMeadState, ParticleSwarmState, PrimalDualSemismoothNewtonState, @@ -431,6 +433,8 @@ export WolfePowellLinesearch, WolfePowellBinaryLinesearch export AbstractStateAction, StoreStateAction export has_storage, get_storage, update_storage! export objective_cache_factory +export AbstractMeshPollFunction, LowerTriangularAdaptivePoll +export AbstractMeshSearchFunction, DefaultMeshAdaptiveDirectSearch # # Direction Update Rules export DirectionUpdateRule @@ -480,6 +484,8 @@ export adaptive_regularization_with_cubics, interior_point_Newton!, LevenbergMarquardt, LevenbergMarquardt!, + mesh_adaptive_direct_search, + mesh_adaptive_direct_search!, NelderMead, NelderMead!, particle_swarm, @@ -541,6 +547,7 @@ export StopAfter, StopWhenKKTResidualLess, StopWhenLagrangeMultiplierLess, StopWhenModelIncreased, + StopWhenPollSizeLess, StopWhenPopulationCostConcentrated, StopWhenPopulationConcentrated, StopWhenPopulationDiverges, diff --git a/src/plans/cache.jl b/src/plans/cache.jl index 098c0c2e8f..bde956a889 100644 --- a/src/plans/cache.jl +++ b/src/plans/cache.jl @@ -250,8 +250,6 @@ which function evaluations to cache. number of (least recently used) calls to cache * `cache_sizes=Dict{Symbol,Int}()`: a named tuple or dictionary specifying the sizes individually for each cache. 
- - """ struct ManifoldCachedObjective{E,P,O<:AbstractManifoldObjective{<:E},C<:NamedTuple{}} <: AbstractDecoratedManifoldObjective{E,P} diff --git a/src/plans/constrained_plan.jl b/src/plans/constrained_plan.jl index 875d12efbb..2bf002fb57 100644 --- a/src/plans/constrained_plan.jl +++ b/src/plans/constrained_plan.jl @@ -27,9 +27,9 @@ A common supertype for fucntors that model constraint functions with slack. This supertype additionally provides access for the fields * `μ::T` the dual for the inequality constraints -* `s::T` the slack parametyer, and +* `s::T` the slack parameter, and * `β::R` the the barrier parameter -which is also of typee `T`. +which is also of type `T`. """ abstract type AbstractConstrainedSlackFunctor{T,R} end diff --git a/src/plans/mesh_adaptive_plan.jl b/src/plans/mesh_adaptive_plan.jl new file mode 100644 index 0000000000..0d43d7e74a --- /dev/null +++ b/src/plans/mesh_adaptive_plan.jl @@ -0,0 +1,498 @@ +""" + AbstractMeshPollFunction + +An abstract type for common “poll” strategies in the [`mesh_adaptive_direct_search`](@ref) +solver. +A subtype of this The functor has to fulfil + +* be callable as `poll!(problem, mesh_size; kwargs...)` and modify the state + +as well as to provide functions + +* `is_successful(poll!)` that indicates whether the last poll was successful in finding a new candidate +* `get_basepoint(poll!)` that returns the base point at which the mesh is build +* `get_candidate(poll!)` that returns the last found candidate if the poll was successful. + Otherwise the base point is returned +* `get_descent_direction(poll!)` the the vector that points from the base point to the candidate. + If the last poll was not successful, the zero vector is returned +* `update_basepoint!(M, poll!, p)` that updates the base point to `p` and all necessary + internal data to a new point to build a mesh at + +The `kwargs...` could include + +* `scale_mesh=1.0`: to rescale the mesh globally +* `max_stepsize=Inf`: avoid exceeding a step size beyond this value, e.g. injectivity radius. + any vector longer than this should be shortened to the provided maximum step size. +""" +abstract type AbstractMeshPollFunction end + +""" + AbstractMeshSearchFunction + +Should be callable as `search!(problem, mesh_size, p, X; kwargs...)` + +where `X` is the last successful poll direction from the tangent space at `p` +if that exists and the zero vector otherwise. + + +Besides that the following functions should be implemented + +* `is_successful(search!)` that indicates whether the last search was successful in finding a new candidate +* `get_candidate(search!)` that returns the last found candidate +""" +abstract type AbstractMeshSearchFunction end + +""" + StopWhenPollSizeLess <: StoppingCriterion + +stores a threshold when to stop looking at the poll mesh size of an [`MeshAdaptiveDirectSearchState`](@ref). + +# Constructor + + StopWhenPollSizeLess(ε) + +initialize the stopping criterion to a threshold `ε`. +""" +mutable struct StopWhenPollSizeLess{F} <: StoppingCriterion + threshold::F + last_poll_size::F + at_iteration::Int + function StopWhenPollSizeLess(ε::F) where {F<:Real} + return new{F}(ε, zero(ε), -1) + end +end + +# +# Specific Polls +# +""" + LowerTriangularAdaptivePoll <: AbstractMeshPollFunction + +Generate a mesh (poll step) based on Section 6 and 7 of [Dreisigmeyer:2007](@cite), +with two small modifications: +* The mesh can be scaled globally so instead of ``Δ_0^m=1`` a certain different scale is used +* Any poll direction can be rescaled if it is too long. 
This is to not exceed the injectivity radius for example. + +# Functor + + (p::LowerTriangularAdaptivePoll)(problem, mesh_size; scale_mesh=1.0, max_stepsize=inf) + +# Fields + +* `base_point::P`: a point on the manifold, where the mesh is build in the tangent space +* `basis`: a basis of the current tangent space with respect to which the mesh is stored +* `candidate::P`: a memory for a new point/candidate +* `mesh`: a vector of tangent vectors storing the mesh. +* `random_vector`: a ``d``-dimensional random vector ``b_l``` +* `random_index`: a random index ``ι`` +$(_var(:Field, :retraction_method)) +$(_var(:Field, :vector_transport_method)) +* `X::T` the last successful poll direction stored as a tangent vector. + initialised to the zero vector and reset to the zero vector after moving to a new tangent space. + +# Constructor + + LowerTriangularAdaptivePoll(M, p=rand(M); kwargs...) + +## Keyword arguments + +* `basis=`[`DefaultOrthonormalBasis`](@extref `ManifoldsBase.DefaultOrthonormalBasis`) +$(_var(:Keyword, :retraction_method)) +$(_var(:Keyword, :vector_transport_method)) +$(_var(:Keyword, :X)) +""" +mutable struct LowerTriangularAdaptivePoll{ + P, + T, + F<:Real, + V<:AbstractVector{F}, + M<:AbstractMatrix{F}, + I<:Int, + B, + VTM<:AbstractVectorTransportMethod, + RM<:AbstractRetractionMethod, +} <: AbstractMeshPollFunction + base_point::P + candidate::P + poll_counter::I + random_vector::V + random_index::I + mesh::M + basis::B + X::T + last_poll_improved::Bool + retraction_method::RM + vector_transport_method::VTM +end + +function LowerTriangularAdaptivePoll( + M::AbstractManifold, + p=rand(M); + basis::AbstractBasis=DefaultOrthonormalBasis(), + retraction_method::AbstractRetractionMethod=default_retraction_method(M), + vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method( + M + ), + X=zero_vector(M, p), +) + d = manifold_dimension(M) + b_l = zeros(d) + D_k = zeros(d, d + 1) + return LowerTriangularAdaptivePoll( + p, + copy(M, p), + 0, + b_l, + 0, + D_k, + basis, + X, + false, + retraction_method, + vector_transport_method, + ) +end +""" + is_successful(ltap::LowerTriangularAdaptivePoll) + +Return whether the last [`LowerTriangularAdaptivePoll`](@ref) step was successful +""" +function is_successful(ltap::LowerTriangularAdaptivePoll) + return ltap.last_poll_improved +end +""" + get_descent_direction(ltap::LowerTriangularAdaptivePoll) + +Return the direction of the last [`LowerTriangularAdaptivePoll`](@ref) that yields a descent of the cost. +If the poll was not successful, the zero vector is returned +""" +function get_descent_direction(ltap::LowerTriangularAdaptivePoll) + return ltap.X +end +""" + get_basepoint(ltap::LowerTriangularAdaptivePoll) + +Return the base point of the tangent space, where the mash for the [`LowerTriangularAdaptivePoll`](@ref) is build in. +""" +function get_basepoint(ltap::LowerTriangularAdaptivePoll) + return ltap.base_point +end +""" + get_candidate(ltap::LowerTriangularAdaptivePoll) + +Return the candidate of the last successful [`LowerTriangularAdaptivePoll`](@ref). +If the poll was unsuccessful, the base point is returned. +""" +function get_candidate(ltap::LowerTriangularAdaptivePoll) + return ltap.candidate +end +""" + update_basepoint!(M, ltap::LowerTriangularAdaptivePoll, p) + +Update the base point of the [`LowerTriangularAdaptivePoll`](@ref). +This especially also updates the basis, that is used to build a (new) mesh. 
+""" +function update_basepoint!(M, ltap::LowerTriangularAdaptivePoll{P}, p::P) where {P} + vector_transport_to!( + M, ltap.X, ltap.base_point, ltap.X, p, ltap.vector_transport_method + ) + copyto!(M, ltap.base_point, p) + # reset candidate as well + copyto!(M, ltap.candidate, ltap.base_point) + return ltap +end +function show(io::IO, ltap::LowerTriangularAdaptivePoll) + s = """LowerTriangularAdaptivePoll + with + * basis on the tangent space: $(ltap.basis) + * retraction_method: $(ltap.retraction_method) + * vector_transport_method: $(ltap.vector_transport_method) + """ + return print(io, s) +end +function (ltap::LowerTriangularAdaptivePoll)( + amp::AbstractManoptProblem, + mesh_size::Real; + scale_mesh::Real=1.0, + max_stepsize::Real=Inf, +) + M = get_manifold(amp) + n = manifold_dimension(M) + l = -log(4, mesh_size) + S = (-2^l + 1):(2^l - 1) + if ltap.poll_counter <= l # we did not yet generate a b_l on this scale + ltap.poll_counter += 1 + # A random index ι + ltap.random_index = rand(1:n) + # generate a random b_l vector + rand!(ltap.random_vector, S) + ltap.random_vector[ltap.random_index] = rand((-2^l, 2^l)) + end #otherwise we already created ltap.random_vector for this mesh size + # Generate L lower triangular, (n-1)x(n-1) in M + for i in 1:(n - 1) + for j in 1:(n - 1) + if i < j + ltap.mesh[i, j] = rand(S) + elseif i == j + ltap.mesh[i, j] = rand((-2^l, 2^l)) + else + ltap.mesh[i, j] = 0 + end + end + end + # Shift to construct n × n matrix B + # (bottom left) + if n > 1 + ltap.mesh[(ltap.random_index + 1):n, (1:n)] = ltap.mesh[ + (ltap.random_index):(n - 1), (1:n) + ] + # zero row above + ltap.mesh[ltap.random_index, (1:n)] .= 0 + end + # second to last column: random vector + ltap.mesh[:, n] .= ltap.random_vector + # set last column to minus the sum. + ltap.mesh[:, n + 1] .= -1 .* sum(ltap.mesh[:, 1:n]; dims=2) + # Permute B (first n columns) + ltap.mesh[:, 1:n] .= ltap.mesh[:, randperm(n)] + # look for best + ltap.last_poll_improved = false + i = 0 + c = get_cost(amp, ltap.base_point) + while (i < (n + 1)) && !(ltap.last_poll_improved) + i = i + 1 # runs for the last time for i=n+1 and hence the sum. + # get vector – scale mesh + get_vector!( + M, + ltap.X, + ltap.base_point, + mesh_size * scale_mesh .* ltap.mesh[:, i], + ltap.basis, + ) + # shorten if necessary + ltap_X_norm = norm(M, ltap.base_point, ltap.X) + if ltap_X_norm > max_stepsize + ltap.X .*= max_stepsize / ltap_X_norm + end + retract!(M, ltap.candidate, ltap.base_point, ltap.X, ltap.retraction_method) + if get_cost(amp, ltap.candidate) < c + # Keep old point and search direction, since the update will come later only + # copyto!(M, ltap.base_point, ltap.candidate) + ltap.last_poll_improved = true + # this also breaks while + end + end + # clear temp vector if we did not improve – set to zero vector and “clear” candidate. + if !(ltap.last_poll_improved) + zero_vector!(M, ltap.X, ltap.base_point) + copyto!(M, ltap.candidate, ltap.base_point) + end + return ltap +end + +# +# Specific Searches +# + +""" + DefaultMeshAdaptiveDirectSearch <: AbstractMeshSearchFunction + +# Functor + + (s::DefaultMeshAdaptiveDirectSearch)(problem, mesh_size::Real, X; scale_mesh::Real=1.0, max_stepsize::Real=inf) + +# Fields + +* `q`: a temporary memory for a point on the manifold +* `X`: information to perform the search, e.g. the last direction found by poll. +* `last_search_improved::Bool` indicate whether the last search was successful, i.e. improved the cost. 
+$(_var(:Field, :retraction_method)) + +# Constructor + + DefaultMeshAdaptiveDirectSearch(M::AbstractManifold, p=rand(M); kwargs...) + +## Keyword arguments + +* `X::T=zero_vector(M, p) +$(_var(:Keyword, :retraction_method)) +""" +mutable struct DefaultMeshAdaptiveDirectSearch{P,T,RM<:AbstractRetractionMethod} <: + AbstractMeshSearchFunction + p::P + q::P + X::T + last_search_improved::Bool + retraction_method::RM +end +function DefaultMeshAdaptiveDirectSearch( + M::AbstractManifold, + p=rand(M); + X=zero_vector(M, p), + retraction_method::AbstractRetractionMethod=default_retraction_method(M), +) + return DefaultMeshAdaptiveDirectSearch(p, copy(M, p), X, false, retraction_method) +end +""" + is_successful(dmads::DefaultMeshAdaptiveDirectSearch) + +Return whether the last [`DefaultMeshAdaptiveDirectSearch`](@ref) was successful. +""" +function is_successful(dmads::DefaultMeshAdaptiveDirectSearch) + return dmads.last_search_improved +end +""" + get_candidate(dmads::DefaultMeshAdaptiveDirectSearch) + +Return the last candidate a [`DefaultMeshAdaptiveDirectSearch`](@ref) found +""" +function get_candidate(dmads::DefaultMeshAdaptiveDirectSearch) + return dmads.p +end +function show(io::IO, dmads::DefaultMeshAdaptiveDirectSearch) + s = """DefaultMeshAdaptiveDirectSearch + with + * retraction_method: $(dmads.retraction_method) + """ + return print(io, s) +end +function (dmads::DefaultMeshAdaptiveDirectSearch)( + amp::AbstractManoptProblem, + mesh_size::Real, + p, + X; + scale_mesh::Real=1.0, + max_stepsize::Real=Inf, +) + M = get_manifold(amp) + dmads.X .= (4 * mesh_size * scale_mesh) .* X + + dmads_X_norm = norm(M, p, dmads.X) + if dmads_X_norm > max_stepsize + dmads.X .*= max_stepsize / dmads_X_norm + end + retract!(M, dmads.q, p, dmads.X, dmads.retraction_method) + dmads.last_search_improved = get_cost(amp, dmads.q) < get_cost(amp, p) + if dmads.last_search_improved + copyto!(M, dmads.p, dmads.q) + end + return dmads +end + +""" + MeshAdaptiveDirectSearchState <: AbstractManoptSolverState + +# Fields + +$(_var(:Field, :p; add=[:as_Iterate])) +* `mesh_size`: the current (internal) mesh size +* `scale_mesh`: the current scaling of the internal mesh size, yields the actual mesh size used +* `max_stepsize`: an upper bound for the longest step taken in looking for a candidate in either poll or search +* `poll_size` +$(_var(:Field, :stopping_criterion, "stop")) +* `poll::`[`AbstractMeshPollFunction`]: a poll step (functor) to perform +* `search::`[`AbstractMeshSearchFunction`}(@ref) a search step (functor) to perform + +""" +mutable struct MeshAdaptiveDirectSearchState{ + P, + F<:Real, + PT<:AbstractMeshPollFunction, + ST<:AbstractMeshSearchFunction, + SC<:StoppingCriterion, +} <: AbstractManoptSolverState + p::P + mesh_size::F + scale_mesh::F + max_stepsize::F + poll_size::F + stop::SC + poll::PT + search::ST +end +function MeshAdaptiveDirectSearchState( + M::AbstractManifold, + p::P=rand(M); + mesh_basis::B=DefaultOrthonormalBasis(), + scale_mesh::F=injectivity_radius(M) / 2, + max_stepsize::F=injectivity_radius(M), + stopping_criterion::SC=StopAfterIteration(500) | StopWhenPollSizeLess(1e-7), + retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)), + vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method( + M, typeof(p) + ), + poll::PT=LowerTriangularAdaptivePoll( + M, + copy(M, p); + basis=mesh_basis, + retraction_method=retraction_method, + vector_transport_method=vector_transport_method, + ), + 
search::ST=DefaultMeshAdaptiveDirectSearch( + M, copy(M, p); retraction_method=retraction_method + ), +) where { + P, + F, + PT<:AbstractMeshPollFunction, + ST<:AbstractMeshSearchFunction, + SC<:StoppingCriterion, + B<:AbstractBasis, +} + poll_s = manifold_dimension(M) * 1.0 + return MeshAdaptiveDirectSearchState{P,F,PT,ST,SC}( + p, 1.0, scale_mesh, max_stepsize, poll_s, stopping_criterion, poll, search + ) +end +get_iterate(mads::MeshAdaptiveDirectSearchState) = mads.p + +function show(io::IO, mads::MeshAdaptiveDirectSearchState) + i = get_count(mads, :Iterations) + Iter = (i > 0) ? "After $i iterations\n" : "" + s = """ + # Solver state for `Manopt.jl`s mesh adaptive direct search + $Iter + + ## Parameters + * mesh_size: $(mads.mesh_size) + * scale_mesh: $(mads.scale_mesh) + * max_stepsize: $(mads.max_stepsize) + * poll_size: $(mads.poll_size) + * poll:\n $(replace(repr(mads.poll), "\n" => "\n ")[1:end-3]) + * search:\n $(replace(repr(mads.search), "\n" => "\n ")[1:end-3]) + + ## Stopping criterion + $(status_summary(mads.stop)) + """ + return print(io, s) +end + +get_solver_result(ips::MeshAdaptiveDirectSearchState) = ips.p + +function (c::StopWhenPollSizeLess)( + p::AbstractManoptProblem, s::MeshAdaptiveDirectSearchState, k::Int +) + if k == 0 # reset on init + c.at_iteration = -1 + end + c.last_poll_size = s.poll_size + if c.last_poll_size < c.threshold && k > 0 + c.at_iteration = k + return true + end + return false +end +function get_reason(c::StopWhenPollSizeLess) + if (c.last_poll_size < c.threshold) && (c.at_iteration >= 0) + return "The algorithm computed a poll step size ($(c.last_poll_size)) less than $(c.threshold).\n" + end + return "" +end +function status_summary(c::StopWhenPollSizeLess) + has_stopped = (c.at_iteration >= 0) + s = has_stopped ? "reached" : "not reached" + return "Poll step size s < $(c.threshold):\t$s" +end +function show(io::IO, c::StopWhenPollSizeLess) + return print(io, "StopWhenPollSizeLess($(c.threshold))\n $(status_summary(c))") +end diff --git a/src/plans/plan.jl b/src/plans/plan.jl index 9a5465654c..438891c02b 100644 --- a/src/plans/plan.jl +++ b/src/plans/plan.jl @@ -132,6 +132,7 @@ include("exact_penalty_method_plan.jl") include("frank_wolfe_plan.jl") include("interior_point_Newton_plan.jl") include("quasi_newton_plan.jl") +include("mesh_adaptive_plan.jl") include("nonlinear_least_squares_plan.jl") include("difference_of_convex_plan.jl") include("Douglas_Rachford_plan.jl") diff --git a/src/solvers/gradient_descent.jl b/src/solvers/gradient_descent.jl index 9cc434591a..ad9fafcfbc 100644 --- a/src/solvers/gradient_descent.jl +++ b/src/solvers/gradient_descent.jl @@ -179,7 +179,7 @@ function gradient_descent( return _ensure_matching_output(p, rs) end function gradient_descent( - M::AbstractManifold, mgo::O, p; kwargs... + M::AbstractManifold, mgo::O, p=rand(M); kwargs... ) where {O<:Union{AbstractManifoldGradientObjective,AbstractDecoratedManifoldObjective}} q = copy(M, p) return gradient_descent!(M, mgo, q; kwargs...) diff --git a/src/solvers/mesh_adaptive_direct_search.jl b/src/solvers/mesh_adaptive_direct_search.jl new file mode 100644 index 0000000000..859260b387 --- /dev/null +++ b/src/solvers/mesh_adaptive_direct_search.jl @@ -0,0 +1,155 @@ + +_doc_mads = """ + + mesh_adaptive_direct_search(M, f, p=rand(M); kwargs...) + mesh_adaptive_direct_search(M, mco::AbstractManifoldCostObjective, p=rand(M); kwargs..) + mesh_adaptive_direct_search!(M, f, p; kwargs...) + mesh_adaptive_direct_search!(M, mco::AbstractManifoldCostObjective, p; kwargs..) 
+ + +# Input + +$(_var(:Argument, :M; type=true)) +$(_var(:Argument, :f)) +$(_var(:Argument, :p)) + +# Keyword arguments + +* `mesh_basis=`[`DefaultOrthonormalBasis`](@extref `ManifoldsBase.DefaultOrthonormalBasis`): + a basis to generate the mesh in. The mesh is generated in coordinates of this basis in every tangent space +* `max_stepsize=`$(_link(:injectivity_radius))`(M)`: a maximum step size to take. + any vector generated on the mesh is shortened to this length to avoid leaving the injectivity radius, +* `poll::`[`AbstractMeshPollFunction`](@ref)`=`[`LowerTriangularAdaptivePoll`](@ref)`(M, copy(M,p))`: + the poll function to use. The `mesh_basis` (as `basis`), `retraction_method`, and `vector_transport_method` are passed to this default as well. +$(_var(:Keyword, :retraction_method)) +* `scale_mesh=`$(_link(:injectivity_radius))`(M) / 4`: initial scaling of the mesh +* `search::`[`AbstractMeshSearchFunction`](@ref)`=`[`DefaultMeshAdaptiveDirectSearch`](@ref)`(M, copy(M,p))`: + the search function to use. The `retraction_method` is passed to this default as well. +$(_var(:Keyword, :stopping_criterion; default="[`StopAfterIteration`](@ref)`(500)`$(_sc(:Any))[`StopWhenPollSizeLess`](@ref)`(1e-10)`")) +$(_var(:Keyword, :vector_transport_method)) +$(_var(:Keyword, :X)) + +$(_note(:OtherKeywords)) + +$(_note(:OutputSection)) +""" + +@doc "$(_doc_mads)" +mesh_adaptive_direct_search(M::AbstractManifold, args...; kwargs...) + +function mesh_adaptive_direct_search(M::AbstractManifold, f, p=rand(M); kwargs...) + mco = ManifoldCostObjective(f) + return mesh_adaptive_direct_search(M, mco, p; kwargs...) +end +function mesh_adaptive_direct_search( + M::AbstractManifold, mco::AbstractManifoldCostObjective, p=rand(M); kwargs... +) + q = copy(M, p) + return mesh_adaptive_direct_search!(M, mco, q; kwargs...) +end + +@doc "$(_doc_mads)" +mesh_adaptive_direct_search!(M::AbstractManifold, args...; kwargs...) +function mesh_adaptive_direct_search!(M::AbstractManifold, f, p; kwargs...) + mco = ManifoldCostObjective(f) + return mesh_adaptive_direct_search!(M, mco, p; kwargs...) +end +function mesh_adaptive_direct_search!( + M::AbstractManifold, + mco::AbstractManifoldCostObjective, + p; + mesh_basis::B=DefaultOrthonormalBasis(), + scale_mesh::Real=injectivity_radius(M) / 4, + max_stepsize::Real=injectivity_radius(M), + stopping_criterion::StoppingCriterion=StopAfterIteration(500) | + StopWhenPollSizeLess(1e-10), + retraction_method::AbstractRetractionMethod=default_retraction_method(M, eltype(p)), + vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method( + M, eltype(p) + ), + poll::PT=LowerTriangularAdaptivePoll( + M, + copy(M, p); + basis=mesh_basis, + retraction_method=retraction_method, + vector_transport_method=vector_transport_method, + ), + search::ST=DefaultMeshAdaptiveDirectSearch( + M, copy(M, p); retraction_method=retraction_method + ), + kwargs..., #collect rest +) where {B<:AbstractBasis,PT<:AbstractMeshPollFunction,ST<:AbstractMeshSearchFunction} + dmco = decorate_objective!(M, mco; kwargs...) + dmp = DefaultManoptProblem(M, dmco) + madss = MeshAdaptiveDirectSearchState( + M, + p; + mesh_basis=mesh_basis, + scale_mesh=scale_mesh, + max_stepsize=max_stepsize, + stopping_criterion=stopping_criterion, + retraction_method=retraction_method, + vector_transport_method=vector_transport_method, + poll=poll, + search=search, + ) + dmadss = decorate_state!(madss; kwargs...)
+ solve!(dmp, dmadss) + return get_solver_return(get_objective(dmp), dmadss) +end + +# Init already do a poll, since the first search requires a poll +function initialize_solver!( + amp::AbstractManoptProblem, madss::MeshAdaptiveDirectSearchState +) + M = get_manifold(amp) + # do one poll step + madss.poll( + amp, madss.mesh_size; scale_mesh=madss.scale_mesh, max_stepsize=madss.max_stepsize + ) + if is_successful(madss.poll) + copyto!(M, madss.p, get_candidate(madss.poll)) + end + return madss +end +function step_solver!(amp::AbstractManoptProblem, madss::MeshAdaptiveDirectSearchState, k) + M = get_manifold(amp) + n = manifold_dimension(M) + # search if the last poll or last search was successful + if is_successful(madss.search) || is_successful(madss.poll) + madss.search( + amp, + madss.mesh_size, + get_candidate(madss.poll), + get_descent_direction(madss.poll); + scale_mesh=madss.scale_mesh, + max_stepsize=madss.max_stepsize, + ) + end + # For successful search, copy over iterate - skip poll, but update base + if is_successful(madss.search) + copyto!(M, madss.p, get_candidate(madss.search)) + update_basepoint!(M, madss.poll, madss.p) + else #search was not successful: poll + update_basepoint!(M, madss.poll, madss.p) + madss.poll( + amp, + madss.mesh_size; + scale_mesh=madss.scale_mesh, + max_stepsize=madss.max_stepsize, + ) + # For successful poll, copy over iterate + if is_successful(madss.poll) + copyto!(M, madss.p, get_candidate(madss.poll)) + end + end + # If neither found a better candidate -> reduce step size, we might be close already! + if !(is_successful(madss.poll)) && !(is_successful(madss.search)) + madss.mesh_size /= 4 + elseif madss.mesh_size < 0.25 # else + madss.mesh_size *= 4 # Coarsen the mesh but not beyond 1 + end + # Update poll size parameter + madss.poll_size = n * sqrt(madss.mesh_size) + return madss +end diff --git a/test/plans/test_mesh_adaptive_plan.jl b/test/plans/test_mesh_adaptive_plan.jl new file mode 100644 index 0000000000..1ad7e00e7a --- /dev/null +++ b/test/plans/test_mesh_adaptive_plan.jl @@ -0,0 +1,60 @@ +using ManifoldsBase, Manifolds, Manopt, Test, Random + +@testset "Test Mesh Adaptive Plan" begin + M = ManifoldsBase.DefaultManifold(3) + f(M, p) = norm(p) + mesh_size = 1.0 + cmp = DefaultManoptProblem(M, ManifoldCostObjective(f)) + @testset "Test Poll Accessors" begin + ltap = LowerTriangularAdaptivePoll(M, zeros(3)) + # On init there was not yet a check + @test !Manopt.is_successful(ltap) + @test Manopt.get_descent_direction(ltap) == ltap.X + @test Manopt.get_candidate(ltap) == ltap.candidate + p2 = [2.0, 0.0, 0.0] + @test Manopt.update_basepoint!(M, ltap, p2) === ltap + @test Manopt.get_basepoint(ltap) == p2 + @test startswith(repr(ltap), "LowerTriangularAdaptivePoll\n") + # test call + Random.seed!(42) + ltap(cmp, mesh_size) + # check that this was successful + @test Manopt.is_successful(ltap) + # test call2 scale down! + Random.seed!(42) + ltap(cmp, mesh_size; max_stepsize=1.0) + # check that this was successful as well + @test Manopt.is_successful(ltap) + #... 
and short enough + @test norm(M, p2, Manopt.get_descent_direction(ltap)) <= 1.0 + end + + @testset "Test Search Accessors" begin + p = ones(3) + dmads = DefaultMeshAdaptiveDirectSearch(M, p) + @test !Manopt.is_successful(dmads) + @test Manopt.get_candidate(dmads) == dmads.p + @test startswith(repr(dmads), "DefaultMeshAdaptiveDirectSearch\n") + X = -ones(3) + # This step would bring us to zero, but we only allow a max step 1.0 + dmads(cmp, 1.0, p, X; max_stepsize=1.0) + # and that should still improve + @test Manopt.is_successful(dmads) + end + + @testset "State Accessors" begin + p = ones(3) + mads = MeshAdaptiveDirectSearchState(M, p) + @test startswith( + repr(mads), "# Solver state for `Manopt.jl`s mesh adaptive direct search\n" + ) + @test get_iterate(mads) == p + @test get_solver_result(mads) == p + end + + @testset "Stopping Criteria" begin + sps = StopWhenPollSizeLess(1.0) + @test get_reason(sps) === "" + @test startswith(repr(sps), "StopWhenPollSizeLess(1.0)") + end +end diff --git a/test/runtests.jl b/test/runtests.jl index 80947d9fa1..3dbe0443d3 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -15,6 +15,7 @@ include("utils/example_tasks.jl") include("plans/test_cache.jl") include("plans/test_conjugate_residual_plan.jl") include("plans/test_interior_point_newton_plan.jl") + include("plans/test_mesh_adaptive_plan.jl") include("plans/test_nelder_mead_plan.jl") include("plans/test_nonlinear_least_squares_plan.jl") include("plans/test_gradient_plan.jl") @@ -55,6 +56,7 @@ include("utils/example_tasks.jl") include("solvers/test_gradient_descent.jl") include("solvers/test_interior_point_Newton.jl") include("solvers/test_Levenberg_Marquardt.jl") + include("solvers/test_mesh_adaptive_direct_search.jl") include("solvers/test_Nelder_Mead.jl") include("solvers/test_proximal_bundle_method.jl") include("solvers/test_proximal_point.jl") diff --git a/test/solvers/test_mesh_adaptive_direct_search.jl b/test/solvers/test_mesh_adaptive_direct_search.jl new file mode 100644 index 0000000000..29eeb7b9df --- /dev/null +++ b/test/solvers/test_mesh_adaptive_direct_search.jl @@ -0,0 +1,37 @@ +using Manifolds, Manopt, Test, LinearAlgebra, Random + +@testset "Mesh Adaptive Direct Search" begin + # A small spectral procrustes example + A = [1.0 0.0; 1.0 1.0; 0.0 1.0] + W = 1 / sqrt(2) .* [1.0 -1.0; 1.0 1.0] + B = A * W + M = Rotations(2) + p0 = [1.0 0.0; 0.0 1.0] + f(M, p) = opnorm(B - A * p) + Random.seed!(42) + s = mesh_adaptive_direct_search( + M, + f, + p0; + # debug=[:Iteration, :Cost, " ", :poll_size, " ", :mesh_size, " ", :Stop, "\n"], + return_state=true, + ) + @test distance(M, get_solver_result(s), W) < 1e-9 + @test startswith(get_reason(s), "The algorithm computed a poll step size") + # + # + # A bit larger example inplace + # A small spectral Procrustes example + A2 = [1.0 0.0 0.0; 1.0 1.0 1.0; 0.0 1.0 2.0; 1.0 1.0 1.0] + α = π / 8 + W2 = [cos(α) -sin(α) 0.0; sin(α) cos(α) 0.0; 0.0 0.0 1.0] + B2 = A2 * W2 + M2 = Rotations(3) + p1 = [1.0 0.0 0.0; 0.0 1.0 0.0; 0.0 0.0 1.0] + f2(M, p) = opnorm(B2 - A2 * p) + Random.seed!(42) + # start with a very small mesh size - yields a more exact result + p_s2 = mesh_adaptive_direct_search!(M2, f2, p1; scale_mesh=0.1) + @test isapprox(M, p_s2, p1) + @test distance(M2, p_s2, W2) < 1e-8 +end diff --git a/test/test_deprecated.jl b/test/test_deprecated.jl index 3b0b1585b4..b523a88aaf 100644 --- a/test/test_deprecated.jl +++ b/test/test_deprecated.jl @@ -1,3 +1,5 @@ using Manopt, Manifolds, ManifoldsBase, Test -@testset "test deprecated definitions still work" 
begin end +@testset "test deprecated definitions still work" begin + # after breaking releases this is usually empty. +end diff --git a/tutorials/AutomaticDifferentiation.qmd b/tutorials/AutomaticDifferentiation.qmd index 7056c84d7b..e01ef9561a 100644 --- a/tutorials/AutomaticDifferentiation.qmd +++ b/tutorials/AutomaticDifferentiation.qmd @@ -1,5 +1,5 @@ --- -title: "Using Automatic Differentiation in Manopt.jl" +title: "Using automatic differentiation in Manopt.jl" --- Since [Manifolds.jl](https://juliamanifolds.github.io/Manifolds.jl/latest/) 0.7, the support of automatic differentiation support has been extended. @@ -122,7 +122,7 @@ norm(M, p, X1 - X2) We obtain quite a good approximation of the gradient. -``## [2. Conversion of a Euclidean Gradient in the Embedding to a Riemannian Gradient of a (not Necessarily Isometrically) Embedded Manifold](@id EmbeddedGradient)``{=commonmark} +``## [2. Conversion of a Euclidean gradient in the embedding to a Riemannian gradient of a (not necessarily isometrically) embedded manifold](@id EmbeddedGradient)``{=commonmark} Let $\tilde f: ℝ^m → ℝ$ be a function on the embedding of an $n$-dimensional manifold $\mathcal M \subset ℝ^m$and let $f: \mathcal M → ℝ$ denote the restriction of $\tilde f$ to the manifold $\mathcal M$. diff --git a/tutorials/HowToRecord.qmd b/tutorials/HowToRecord.qmd index 8a62a92606..a39be15b2c 100644 --- a/tutorials/HowToRecord.qmd +++ b/tutorials/HowToRecord.qmd @@ -15,7 +15,7 @@ Several predefined recordings exist, for example [`RecordCost`](@ref) or [`Recor For fields of the `State` the recording can also be done [`RecordEntry`](@ref). For other recordings, for example more advanced computations before storing a value, an own `RecordAction` can be defined. -We illustrate these using the gradient descent from the [Get started: optimize](https://manoptjl.org/stable/tutorials/getstarted.html) tutorial. +We illustrate these using the gradient descent from the [Get started: optimize](getstarted.md) tutorial. Here the focus is put on ways to investigate the behaviour during iterations by using Recording techniques. diff --git a/tutorials/ImplementASolver.qmd b/tutorials/ImplementASolver.qmd index ce89a6b6b8..99997238ff 100644 --- a/tutorials/ImplementASolver.qmd +++ b/tutorials/ImplementASolver.qmd @@ -4,7 +4,7 @@ author: "Ronny Bergmann" --- When you have used a few solvers from `Manopt.jl` for example like in the opening -tutorial [Get started: optimize](https://manoptjl.org/stable/tutorials/getstarted.html) +tutorial [Get started: optimize](getstarted.md) you might come to the idea of implementing a solver yourself. After a short introduction of the algorithm we aim to implement, @@ -204,7 +204,7 @@ In practice, however, it is preferable to cache intermediate values like cost of Now we can just run the solver already. We take the same example as for the other tutorials -We first define our task, the Riemannian Center of Mass from the [Get started: optimize](https://manoptjl.org/stable/tutorials/getstarted.html) tutorial. +We first define our task, the Riemannian Center of Mass from the [Get started: optimize](getstarted.md) tutorial.
```{julia} #| output: false diff --git a/tutorials/ImplementOwnManifold.qmd b/tutorials/ImplementOwnManifold.qmd index 3c7d5c69bf..01dba3291e 100644 --- a/tutorials/ImplementOwnManifold.qmd +++ b/tutorials/ImplementOwnManifold.qmd @@ -80,7 +80,7 @@ struct ScaledSphere <: AbstractManifold{ℝ} end ``` -We would like to compute a mean and/or median similar to [🏔️ Get started with Manopt.jl!](https://manoptjl.org/stable/tutorials/getstarted.html). +We would like to compute a mean and/or median similar to [🏔️ Get started with Manopt.jl!](getstarted.md). For given a set of points $q_1,\ldots,q_n$ we want to compute [Karcher:1977](@cite) ```math diff --git a/tutorials/InplaceGradient.html b/tutorials/InplaceGradient.html new file mode 100644 index 0000000000..f668af1857 --- /dev/null +++ b/tutorials/InplaceGradient.html @@ -0,0 +1,703 @@ + + + + + + + + + + +Speedup using in-place evaluation + + + + + + + + + + + + + + + + + + + + + + + +
[… the remaining added lines of this hunk are the Quarto-rendered HTML of the “Speedup using in-place evaluation” tutorial (title, author, code cells, benchmark histograms, and package status output), duplicating tutorials/InplaceGradient.md; omitted here …]
diff --git a/tutorials/Project.toml b/tutorials/Project.toml index 40a9e920d9..47cb993c67 100644 --- a/tutorials/Project.toml +++ b/tutorials/Project.toml @@ -14,6 +14,7 @@ Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80" RecursiveArrayTools = "731186ca-8d62-57ce-b412-fbd966d074cd" [compat] +ADTypes = "1" BenchmarkTools = "1" Colors = "0.12, 0.13" Distributions = "0.25"
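For reference, a minimal usage sketch of the new solver, adapted from the added test `test/solvers/test_mesh_adaptive_direct_search.jl` (the data, cost, and seed are taken from there; the attained accuracy is what the test checks, not a guarantee):

```julia
using Manifolds, Manopt, LinearAlgebra, Random

# A small spectral Procrustes problem: find the rotation p with B ≈ A * p
A = [1.0 0.0; 1.0 1.0; 0.0 1.0]
W = 1 / sqrt(2) .* [1.0 -1.0; 1.0 1.0]  # the rotation used to generate B
B = A * W
M = Rotations(2)
f(M, p) = opnorm(B - A * p)             # a nonsmooth cost, no gradient required

Random.seed!(42)
p0 = [1.0 0.0; 0.0 1.0]
s = mesh_adaptive_direct_search(M, f, p0; return_state=true)
q = get_solver_result(s)
distance(M, q, W)                       # about 1e-9 in the added test
get_reason(s)                           # stops once the poll step size gets small
```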